This document describes the RESTful interface to the Global Data Plane (GDP). Note that this is not part of the GDP itself, since that layer treats all records as opaque. This level adds simple key-value structure to the data for the convenience of users, and in some cases assumes the data is valid JSON text.
This interface is a true RESTful interface; that is, it uses GET/POST/PUT/DELETE instead of overloading everything into GET. This implies that requests cannot be submitted using only an off-the-shelf web browser, because a browser sends POST requests only when submitting an HTML form and offers no direct way to send PUT or DELETE.
The data representation chosen is JSON, which is easier to build and
parse than XML. [[This should be expanded to include XML as
specified by the Accept header.]]
All RESTful calls have the prefix "/gdp/v1" to allow overloading
with other services and to allow multiple versions of the protocol
to run simultaneously on one node.
Objects in the dataplane are called GDP Channel-Logs (GCLs), representing their dual nature as both a communication channel and a storage log. GCLs are named with 256-bit values encoded in Base64url notation or as a user-meaningful text string, possibly including slashes, which is hashed using SHA-256 to produce an internal value.
Everything stored in the GDP is supposed to be encrypted; however, this
interface does not enforce this policy and treats all GDP data as though
it were encoded in JSON; in particular, in some cases it may be
encapsulated inside another JSON object. Data returned in response to a
status request (e.g., for the number of records available) is always
returned unencrypted as a JSON object.
A way around this assumption of JSON data content is to return all
data as base64-encoded strings. However, this puts more work on
the client.
Although the data portion of the GDP transaction may be encrypted, an HTTP request/response exchange is not, so it is highly recommended that clients authenticate and communicate with a RESTful server using HTTPS only. Depending on writer-platform capabilities, basic or htdigest client authentication within an encrypted session may be useful to gate resource consumption and log access within an organization's larger security perimeter.
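As an illustration of the recommendation above, the following is a minimal sketch of a client that refuses to attach basic-authentication credentials to anything other than an HTTPS URL. The host name, user name, and password are hypothetical, and the function name is introduced here only for illustration; whether a given RESTful server accepts basic authentication depends on how it is deployed.

```python
import base64
import urllib.request
from urllib.parse import urlsplit


def build_authenticated_request(url: str, user: str, password: str) -> urllib.request.Request:
    """Build (but do not send) a GET request carrying HTTP basic credentials.

    Assumption: the server sits behind HTTPS with basic authentication
    enabled; this document only recommends that setup, it does not mandate it.
    """
    if urlsplit(url).scheme != "https":
        # Basic credentials are only base64-encoded, not encrypted.
        raise ValueError("refusing to send credentials over a non-HTTPS URL")
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        url,
        headers={
            "Authorization": f"Basic {token}",
            "Accept": "application/json",
        },
    )
```

The request object can then be passed to `urllib.request.urlopen`; nothing is sent until that call is made.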
In general all GET calls must include "Accept: application/json".
[[Is this true?]]
The symbol <gcl_name> can be either a 43-character base64-encoded
value or a string composed of characters legal in the path portion of a
URL (including slashes), either of which is converted to a 256-bit
location-independent GCL name.
The symbol <gcl_id> refers only to the base64-encoded name of a
GCL.
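The relationship between <gcl_name> and <gcl_id> can be sketched as follows. This is an illustration under stated assumptions, not the server's actual code: it assumes that any 43-character string drawn from the base64url alphabet is treated as an already-encoded name, that text names are hashed as UTF-8 bytes, and that the encoded form omits "=" padding. The function name is hypothetical.

```python
import base64
import hashlib
import re

# 256 bits encode to 43 base64url characters once the "=" padding is dropped.
_BASE64URL_43 = re.compile(r"^[A-Za-z0-9_-]{43}$")


def gcl_name_to_id(gcl_name: str) -> str:
    """Map a <gcl_name> to a 43-character base64url <gcl_id>.

    Assumptions (not confirmed by this document): the rule used to tell the
    two name forms apart, and the UTF-8 encoding of text names before hashing.
    """
    if _BASE64URL_43.match(gcl_name):
        # Already looks like an encoded 256-bit name; pass it through.
        return gcl_name
    digest = hashlib.sha256(gcl_name.encode("utf-8")).digest()
    return base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")
```

For example, `gcl_name_to_id("edu.berkeley.eecs.swarmlab.test.log00")` yields a 43-character identifier, and feeding that identifier back in returns it unchanged.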
In all cases, timestamps are in ISO 8601 format (YYYY-MM-DDTHH:MM:SS.SSSSSSSSSZ, including nanoseconds, and in UTC), possibly followed by a slash ("/") and a time accuracy in seconds; for example, "2014-11-22T13:03:09.395023619Z/3.5" (you can read the slash as "±"). Record numbers may be as large as 2^63 − 1 (9,223,372,036,854,775,807).
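Python's datetime type only carries microseconds, so a client that wants the full nanosecond value has to split the timestamp itself. The sketch below shows one way to do that; the type and function names are hypothetical, and the parser tolerates a missing "Z" as a convenience, which is an assumption rather than something this interface promises.

```python
import re
from datetime import datetime, timezone
from typing import NamedTuple, Optional


class GdpTimestamp(NamedTuple):
    when: datetime             # whole seconds, UTC
    nanoseconds: int           # fractional part, 0..999_999_999
    accuracy: Optional[float]  # seconds; None when no "/<accuracy>" suffix


_TIMESTAMP = re.compile(
    r"^(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2})"  # date and time to the second
    r"(?:\.(\d{1,9}))?"                        # optional fraction, up to 9 digits
    r"Z?"                                      # UTC marker (tolerated if absent)
    r"(?:/(\d+(?:\.\d+)?))?$"                  # optional accuracy in seconds
)


def parse_gdp_timestamp(text: str) -> GdpTimestamp:
    """Split a timestamp string into seconds, nanoseconds, and accuracy."""
    match = _TIMESTAMP.match(text)
    if match is None:
        raise ValueError(f"not a GDP timestamp: {text!r}")
    whole, fraction, accuracy = match.groups()
    when = datetime.strptime(whole, "%Y-%m-%dT%H:%M:%S").replace(tzinfo=timezone.utc)
    # Right-pad the fraction so ".5" means 500,000,000 ns.
    nanos = int(fraction.ljust(9, "0")) if fraction else 0
    return GdpTimestamp(when, nanos, float(accuracy) if accuracy else None)
```

Keeping the nanoseconds in a separate integer field avoids the rounding that would occur if the fraction were folded into a float or a datetime.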
200 OK
Content-Type: application/json; type=gdp/gcllist
[ <gcl_id>, ...
]
401 Unauthorized
Returns a list of GCL names encoded as base64 strings.
[[Should include some way of limiting the number of values, e.g., "?start=<#>&max=<#>", since this could be a very large list. It would probably be more useful if we could include some query, e.g., only return GCLs where the metadata matches some pattern, but that would require a change in the GDP protocol itself.]]
Not implemented at this time, and is unlikely to be implemented in this form in the future. This requires display of the global state of the GDP, which is likely to be immense. A possible alternative implementation would take a query that would be passed to a directory service.
Returns 405 Method Not Allowed.
200 OK
Content-Type: application/json; type=gdp/gcldesc
{ "gcl_id": <gcl_id>,
"nrecs": <integer>,
... }
The gcl_id will be the internal name of the GCL.
401 Unauthorized
404 Not Found
Returns information about the created GCL encoded as a JSON object. The information should at a minimum include the canonical name of the GCL and the number of records. When GCL metadata is present it should be returned as well.
Content-Type: application/json;
{ "external-name": null, optional gcl-create parameters }
The optional gcl-create parameters are supplied as zero or more comma-delimited JSON key/value pairs. For example:
Content-Type: application/json;
{ "external-name": null, "-C": "creator_name@berkeley.edu" }
The parameter keys supported by the RESTful server are "-C", "-h", "-k", "-b", "-c", and "-s"; each should be paired with a value that is valid for that key. Consult the gcl-create documentation or man page for detailed parameter help. Metadata may optionally be specified using the "META" key, which is unique to the RESTful interface, paired with a JSON array whose elements are one or more "<metadataname>=<metadata>" strings.
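A request body of the shape described above could be assembled as in the following sketch. The function name and the "owner" metadata name used in the usage note are hypothetical; the set of accepted keys is taken from the paragraph above, and the sketch does not check that a value is valid for its key, since that depends on gcl-create.

```python
import json
from typing import Mapping, Optional

# Parameter keys listed in this document as supported by the RESTful server.
SUPPORTED_KEYS = ("-C", "-h", "-k", "-b", "-c", "-s")


def build_gcl_create_body(
    external_name: Optional[str] = None,
    params: Optional[Mapping[str, str]] = None,
    metadata: Optional[Mapping[str, str]] = None,
) -> str:
    """Return the JSON text for a GCL creation request.

    external_name=None produces "external-name": null, i.e. the server
    assigns the name.
    """
    body = {"external-name": external_name}
    for key, value in (params or {}).items():
        if key not in SUPPORTED_KEYS:
            raise ValueError(f"unsupported gcl-create parameter key: {key!r}")
        body[key] = value
    if metadata:
        # META is a JSON array of "<metadataname>=<metadata>" strings.
        body["META"] = [f"{name}={value}" for name, value in metadata.items()]
    return json.dumps(body)
```

Calling `build_gcl_create_body(None, {"-C": "creator_name@berkeley.edu"}, {"owner": "swarmlab"})` produces an object with a null external name, the "-C" pair, and a one-element "META" array.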
201 Created
Content-Type: application/json;
{ "gcl_name": <gcl_id>,
"gdplogd_name": <gdplogd_name> }
400 Bad Request
Content-Type: application/json;
{ "detail": <detail> }
409 Conflict
Content-Type: application/json;
{ "detail": "generated-name conflict on gdplogd server" }
500 Internal Server Error
Content-Type: application/json;
{ "detail": <detail>
[, optional diagnostic key/value pairs] }
Creates a new GCL under an internally-assigned name and returns the information about that GCL in the same format as that delivered by the status query.
The plan is that GCLs will be created using a log creation service that will deal with resource allocation. This command is likely to be an interface to that service.
Content-Type: application/json;
{ "external-name": <user_selected_name> , optional gcl-create parameters }
The optional gcl-create parameters are supplied as zero or more comma-delimited JSON key/value pairs. For example:
Content-Type: application/json;
{ "external-name": "edu.berkeley.eecs.swarmlab.test.log00", "-C": "creator_name@berkeley.edu" }
The parameter keys supported by the RESTful server are "-C", "-h", "-k", "-b", "-c", and "-s"; each should be paired with a value that is valid for that key. Consult the gcl-create documentation or man page for detailed parameter help. Metadata may optionally be specified using the "META" key, which is unique to the RESTful interface, paired with a JSON array whose elements are one or more "<metadataname>=<metadata>" strings.
201 Created
Content-Type: application/json;
{ "gcl_name": <gcl_id>,
"gdplogd_name": <gdplogd_name> }
400 Bad Request
Content-Type: application/json;
{ "detail": <detail> }
409 Conflict
Content-Type: application/json;
{ "detail": "external-name already exists on gdplogd server" }
500 Internal Server Error
Content-Type: application/json;
{ "detail": <detail>
[, optional diagnostic key/value pairs] }
Creates a new GCL under a user-assigned name and returns the information about the creation of that GCL, or reports a conflict if the name already exists.
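On the client side, the creation responses listed above might be handled along these lines. This is only a sketch: the exception class and function name are hypothetical, and treating every status other than 201 as an error is a design choice of the example rather than a requirement of the interface.

```python
import json


class GclCreateError(Exception):
    """Raised when the server does not answer 201 Created."""

    def __init__(self, status: int, detail: str):
        super().__init__(f"{status}: {detail}")
        self.status = status
        self.detail = detail


def interpret_create_response(status: int, body: str) -> dict:
    """Turn a creation response into the created names, or raise."""
    payload = json.loads(body) if body.strip() else {}
    if status == 201:
        return {
            "gcl_name": payload["gcl_name"],
            "gdplogd_name": payload["gdplogd_name"],
        }
    if status in (400, 409, 500):
        # These responses carry a "detail" field per this document.
        raise GclCreateError(status, payload.get("detail", ""))
    raise GclCreateError(status, "unexpected status")
```

A caller that wants to retry with a different name can catch `GclCreateError` and check whether `status` is 409.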
Content-Type: application/json;
{ "external-name": <user_selected_name> }
204 No Content
401 Unauthorized
404 Not Found
Currently the GDP does not support log deletion. Until it does this will not be implemented.
201 Created
Content-Type: application/json; type=gdp/response
{ "gcl_id": <gcl_id>, "recno": <integer>,
"timestamp": <commit timestamp>
... }
401 Unauthorized
404 Not Found
Adds a record to the named GCL. The information returned shows the
GDP-assigned metadata associated with the new record.
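An append request could be prepared as in the sketch below. The base URL is hypothetical, the function name is introduced for illustration, and the "application/json" default for the request Content-Type is an assumption, since this document does not say which type the server expects on append.

```python
import urllib.request
from urllib.parse import quote


def build_append_request(
    base_url: str,
    gcl_name: str,
    data: bytes,
    content_type: str = "application/json",  # assumption, see lead-in
) -> urllib.request.Request:
    """Build (but do not send) a POST that appends one record to a GCL."""
    # Text names may contain slashes, so "/" is left unescaped in the path.
    url = f"{base_url.rstrip('/')}/gdp/v1/gcl/{quote(gcl_name, safe='/')}"
    return urllib.request.Request(
        url,
        data=data,
        headers={"Content-Type": content_type},
        method="POST",
    )
```

Passing the result to `urllib.request.urlopen` sends the record; a 201 response then carries the record number and commit timestamp as a JSON object.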
200 OK
Content-Type: <as specified as metadata during GCL creation>
GDP-Record-Number: <recno>
GDP-Commit-Timestamp: <timestamp>
<opaque data as written by POST>
404 Not Found
Returns the contents of the record indicated by recno. As a
special case, if recno is the text "last" it returns the last (most
recently written) record.
Note that the metadata is included in the response header, not in the
data itself, in order to maintain the opacity of that data. Question:
should we move the metadata into the header for other commands as well
to maintain symmetry?
This call is not orthogonal to the others because it does not assume that
the data is application/json.
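Because the record metadata travels in the response headers, a client has to collect it separately from the body. The sketch below builds the read URL and gathers the two GDP headers; the function names and the base URL are hypothetical, and the header lookup is made case-insensitive because HTTP header names are.

```python
from typing import Mapping, Union
from urllib.parse import quote, urlencode


def build_read_url(base_url: str, gcl_name: str, recno: Union[int, str] = "last") -> str:
    """Return the URL for reading one record; recno may be the text "last"."""
    query = urlencode({"recno": recno})
    return f"{base_url.rstrip('/')}/gdp/v1/gcl/{quote(gcl_name, safe='/')}?{query}"


def split_record_response(headers: Mapping[str, str], body: bytes) -> dict:
    """Combine header metadata with the opaque record body."""
    lowered = {name.lower(): value for name, value in headers.items()}
    return {
        "recno": int(lowered["gdp-record-number"]),
        "timestamp": lowered["gdp-commit-timestamp"],
        "content_type": lowered.get("content-type"),
        "data": body,  # left untouched: the record is opaque at this level
    }
```

The body is returned as bytes on purpose; decoding it is left to the caller, who knows what Content-Type was set when the GCL was created.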
[
{
"recno": <integer>,
"timestamp": <timestamp>,
"value": <record value>
},
...
]
404 Not Found
Returns a sequence of up to nrecs records starting from recno encoded as an array of JSON objects. If nrecs is zero, all data from recno to the end is returned. The recno parameter is optional and defaults to 1.
The nrecs parameter is not implemented at this time.
200 OK
401 Unauthorized
404 Not Found
408 Request Timeout
If the indicated GCL does not exist, the GET returns immediately with a 404 error code. Otherwise, if the indicated record number exists, the result is exactly the same as the previous case. If the indicated record number does not exist, this call waits for up to the indicated timeout for that record to appear; if it does, the record is returned in the usual way, otherwise it returns with a 408 Request Timeout response. If the starting record number is not specified, it starts from the beginning.
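A client-side loop around this call might look like the following sketch. It assumes the timeout behavior described above is available on the server, and it takes the HTTP call as a caller-supplied function (the hypothetical `fetch`) returning a status code and body, so the retry logic can be shown without committing to a particular HTTP library.

```python
from typing import Callable, Optional, Tuple


def wait_for_record(
    fetch: Callable[[], Tuple[int, bytes]],
    max_attempts: int = 3,
) -> Optional[bytes]:
    """Repeat a waiting GET until a record arrives or attempts run out.

    fetch() performs one GET with a timeout parameter and returns
    (status, body). Returns the record body, or None if every attempt
    ended in 408 Request Timeout.
    """
    for _ in range(max_attempts):
        status, body = fetch()
        if status == 200:
            return body
        if status == 404:
            # The GCL itself is missing; retrying cannot help.
            raise LookupError("GCL does not exist")
        if status != 408:
            raise RuntimeError(f"unexpected status {status}")
        # 408: the record has not appeared yet; ask again.
    return None
```

Bounding the number of attempts keeps a client from polling forever on a log that has stopped receiving data.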
The nrecs and timeout parameters are not implemented at this time.
[[Perhaps this should not take recno and nrecs, and just return the
next record that appears at the GCL.]]
[[There is some debate about whether this is the correct interface
rather than falling back to a non-REST interface (such as WebSockets)
for subscriptions. HTTP (and hence REST) isn’t designed to handle
spontaneous server to client messages.]]
To be determined. Probably will create a JSON object including the
specified arguments and append that to the GCL, unencrypted of course.
The key-value store is implemented as a single GCL that must be formatted as unencrypted JSON data structured as a series of JSON objects. The "keys" are the field names in that top level object. When adding data to the KV store an arbitrary number of values may be sent in any one record. When retrieving data the key name is specified and the most recent value corresponding to that key is returned.
The GCL used to implement it is named "swarm.rest.kvstore.gcl".
It may be overridden with the "swarm.rest.kvstore.gclname"
administrative runtime parameter.
Note that another way of implementing a KV store is to have an arbitrary GCL, the name of which represents the key, and just get the last (most recent) value to get the current value for that key. This trades off a potentially large key space for efficiency and lack of clutter.
There is a single key-value store. All data must be formatted as JSON objects. POST adds all of the names inside the content to the KV store. For example a POST with the contents { "a": 1, "b": 2 } adds two values to the store. A subsequent POST with contents { "a": 3, "c": 4 } updates the value of a, adds a new value c, and leaves b unchanged.
GET /gdp/v1/kv/<key>
Returns the value of <key> in JSON notation.
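The update semantics can be illustrated by replaying the records of the backing GCL in order, with later records overriding earlier ones key by key. This is a sketch of the behavior described above, not the server's implementation; the function name is hypothetical.

```python
import json
from typing import Any, Dict, Iterable


def replay_kv_store(records: Iterable[str]) -> Dict[str, Any]:
    """Compute the current key-value view from records in append order."""
    store: Dict[str, Any] = {}
    for text in records:
        obj = json.loads(text)
        if not isinstance(obj, dict):
            raise ValueError("each key-value record must be a JSON object")
        # Later records replace values for keys they mention and leave
        # all other keys as they were.
        store.update(obj)
    return store
```

Replaying the two POST bodies from the example, `{ "a": 1, "b": 2 }` followed by `{ "a": 3, "c": 4 }`, leaves a at 3, b at 2, and c at 4; a GET for a single key then corresponds to looking that key up in the result.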
This is just a quick recap of the material presented above. All
URLs begin with "/gdp/v1/".
Scope | Method | URI Path | Description |
GDP | GET | gcl | Lists existing GCLs [405 Method Not Allowed] |
GDP | POST | gcl | Create new GCL with random name |
GDP | PUT | gcl | Create new GCL with specified name |
GDP | DELETE | gcl | Delete GCL [405 Method Not Allowed] |
GCL | GET | gcl/<id> | GCL <id> read metadata [405 Method Not Allowed] |
GCL | POST | gcl/<id> | GCL <id> append record |
GCL | PUT | gcl/<id>?recno=<#> | GCL <id> append record at recno or report conflict [405 Method Not Allowed] |
GCL | DELETE | gcl/<id> | GCL <id> delete record. Deprecated: GCL is append-only by design. |
GCL | GET | gcl/<id>?recno=<#> | GCL <id> read record at recno (returns raw data) |
GCL | GET | gcl/<id>?recno=<#>&nrecs=<#> | GCL <id> read nrecs records starting from recno. Deprecated: violates REST standard (nrecs value is not limited). Use websocket API instead. |
GCL | GET | gcl/<id>?timeout=<#> | Wait for new data to appear on GCL. Deprecated: violates REST standard (subscription). Use websocket API instead. |
GCL | GET | put/gcl/<id>?<arguments> | Add JSON record to GCL <id>. Deprecated: violates REST standard (arguments are not limited). |
RESTful Server | POST | kv | Add JSON information to key-value store |
RESTful Server | GET | kv/<key> | Return JSON information associated with <key> in the key-value store |
Do subscriptions wait until all data is ready to return, or do they return immediately as soon as any data appears?
How are signatures on POST methods handled?