RESTful Interface for the Global Data Plane

Eric Allman, U.C. Berkeley Swarm Lab

Revision Notes:
Draft 8 2017-06-02, rpratt@berkeley.edu: GDP & GCL scope separation, API revisions.

This document describes the RESTful interface to the Global Data Plane (GDP). Note that this is not part of the GDP itself, since that layer treats all records as opaque. This level adds simple key-value structure to the data for the convenience of users, and in some cases assumes the data is valid JSON text.

1 Introduction

This interface is a true RESTful interface; that is, it uses GET, POST, PUT, and DELETE instead of overloading everything into GET. This implies that requests cannot all be submitted using only an off-the-shelf web browser, because browsers issue POST requests only in response to an HTML form submission.

The data representation chosen is JSON, which is easier to build and parse than XML.  [[This should be expanded to include XML as specified by the Accept header.]]

All RESTful calls have the prefix "/gdp/v1" to allow overloading with other services and to allow multiple versions of the protocol to run simultaneously on one node.

Objects in the dataplane are called GDP Channel-Logs (GCLs), representing their dual nature as both a communication channel and a storage log. GCLs are named with 256-bit values encoded in Base64url notation or as a user-meaningful text string, possibly including slashes, which is hashed using SHA-256 to produce an internal value.

2 Security

Everything stored in the GDP is supposed to be encrypted. However, this interface does not enforce that policy: it treats all GDP data as though it were encoded in JSON, and in some cases the data may be encapsulated inside another JSON object. Data returned in response to a status request (e.g., for the number of records available) is always returned unencrypted as a JSON object.

A way around this assumption of JSON data content is to return all data as base64-encoded strings.  However, this puts more work on the client.
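The base64 approach can be sketched as follows. This is only an illustration: the wrapper object and its "data" field are hypothetical names, not something this interface defines.

```python
import base64
import json


def wrap_opaque(payload: bytes) -> str:
    """Encode arbitrary (possibly encrypted) bytes as a JSON object.

    The "data" key is a hypothetical name chosen for this sketch.
    """
    return json.dumps({"data": base64.b64encode(payload).decode("ascii")})


def unwrap_opaque(text: str) -> bytes:
    """Recover the original bytes; this is the extra work left to the client."""
    return base64.b64decode(json.loads(text)["data"])
```

The decode step in unwrap_opaque is the additional client-side work mentioned above.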

Although the data portion of the GDP transaction may be encrypted, an HTTP request/response exchange is not, so it is highly recommended that clients authenticate and communicate with a RESTful server using HTTPS only. Depending on writer-platform capabilities, basic or htdigest client authentication within an encrypted session may be useful to gate resource consumption and log access within an organization's larger security perimeter.

3 Individual Operations

In general all GET calls must include "Accept: application/json".  [[Is this true?]]

The symbol <gcl_name> can be either a 43-character base64url-encoded value or a string composed of characters legal in the path portion of a URL (including slashes). Either form is converted to a 256-bit location-independent GCL name.

The symbol <gcl_id> refers only to the base64-encoded name of a GCL.
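The relationship between a text name and its 43-character form can be sketched as below. The sketch assumes the name is hashed as UTF-8 bytes and that the base64url padding character is dropped (a 32-byte digest encodes to 43 characters plus one "="); neither detail is stated in this document, and the function names are illustrative.

```python
import base64
import hashlib


def gcl_id_from_name(gcl_name: str) -> str:
    """Return a 43-character base64url GCL id for a human-readable name.

    Assumptions of this sketch: the name is hashed as UTF-8 bytes and the
    trailing "=" padding is dropped from the encoded digest.
    """
    digest = hashlib.sha256(gcl_name.encode("utf-8")).digest()  # 32 bytes = 256 bits
    return base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")


def gcl_id_to_bytes(gcl_id: str) -> bytes:
    """Decode a 43-character base64url id back into its 256-bit value."""
    return base64.urlsafe_b64decode(gcl_id + "=")  # restore the stripped padding
```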

In all cases, timestamps are in ISO 8601 format (YYYY-MM-DDTHH:MM:SS.SSSSSSSSSZ, including nanoseconds, and in UTC), possibly followed by a slash ("/") and a time accuracy in seconds; for example "2014-11-22T13:03:09.395023619Z/3.5" (you can read the slash as "±").  Record numbers may be as large as 2^63 - 1 (9,223,372,036,854,775,807).
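A client might split such a timestamp into its parts along the following lines. This is a sketch, not a reference parser: the type and function names are illustrative, and it accepts the trailing "Z" as optional as a leniency of its own. Nanoseconds are kept in a separate integer because Python's datetime only resolves microseconds.

```python
import re
from datetime import datetime, timezone
from typing import NamedTuple, Optional

_TS_RE = re.compile(
    r"^(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2})"  # date and time to the second
    r"(?:\.(\d{1,9}))?"                        # optional fractional seconds
    r"Z?"                                      # UTC designator (optional here)
    r"(?:/(\d+(?:\.\d+)?))?$"                  # optional "/accuracy" in seconds
)


class GdpTimestamp(NamedTuple):
    when: datetime              # whole seconds, UTC
    nanoseconds: int            # fractional part, 0..999999999
    accuracy: Optional[float]   # "plus or minus" seconds, if given


def parse_gdp_timestamp(text: str) -> GdpTimestamp:
    m = _TS_RE.match(text)
    if m is None:
        raise ValueError(f"not a GDP timestamp: {text!r}")
    base, frac, acc = m.groups()
    when = datetime.strptime(base, "%Y-%m-%dT%H:%M:%S").replace(tzinfo=timezone.utc)
    nanos = int(frac.ljust(9, "0")) if frac else 0  # right-pad to 9 digits
    return GdpTimestamp(when, nanos, float(acc) if acc is not None else None)
```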

Standard GCLs


Name

GET /gdp/v1/gcl — list all known GCLs [NOT IMPLEMENTED; see Notes]

Request Data

none

Response Data

200 OK

    Content-Type: application/json; type=gdp/gcllist

[ <gcl_id>, ...
]

401 Unauthorized

Description

Returns a list of GCL names encoded as base64 strings.

[[Should include some way of limiting the number of values, e.g., "?start=<#>&max=<#>", since this could be a very large list.  It would probably be more useful if we could include some query, e.g., only return GCLs where the metadata matches some pattern, but that would require a change in the GDP protocol itself.]]

Notes

Not implemented at this time, and unlikely to be implemented in this form in the future.  This requires display of the global state of the GDP, which is likely to be immense.  A possible alternative implementation would take a query that would be passed to a directory service.  The server currently responds with 405 Method Not Allowed.


Name

GET /gdp/v1/gcl/<gcl_name> — list information known about specified GCL [NOT IMPLEMENTED at this time]

Request Data

none

Response Data

200 OK

    Content-Type: application/json; type=gdp/gcldesc

{ "gcl_id": <gcl_id>,
"nrecs": <integer>,
... }

The gcl_id will be the internal name of the GCL.

401 Unauthorized

404 Not Found

Description

Returns information about the specified GCL encoded as a JSON object.  The information should at a minimum include the canonical name of the GCL and the number of records.  When GCL metadata is present it should be returned as well.


Name

POST /gdp/v1/gcl — Create a new GCL with a random name 

Request Data

    Content-Type: application/json;

{ "external-name": null , optional gcl-create parameters }

The optional gcl-create parameters are supplied as zero or more additional comma-delimited key/value pairs in the same JSON object. For example:

    Content-Type: application/json;

{ "external-name": null, "-C": "creator_name@berkeley.edu" }

The parameter keys supported by the RESTful server are "-C", "-h", "-k", "-b", "-c", and "-s"; each should be paired with a value that is valid for that key. Consult the gcl-create documentation or man page for detailed parameter help. Metadata may optionally be specified using the "META" key, which is unique to the RESTful interface, paired with a JSON array of one or more "<metadataname>=<metadata>" strings.

Response Data

201 Created

    Content-Type: application/json;

{ "gcl_name": <gcl_id>,
"gdplogd_name": <gdplogd_name> }

400 Bad Request

    Content-Type: application/json;

{ "detail": <detail> }

409 Conflict

    Content-Type: application/json;

{ "detail": "generated-name conflict on gdplogd server" }

500 Internal Server Error

    Content-Type: application/json;

{ "detail": <detail>
[, optional diagnostic key/value pairs] }

Description

Creates a new GCL under an internally-assigned name and returns the information about that GCL in the same format as that delivered by the status query.

Notes

The plan is that GCLs will be created using a log creation service that will deal with resource allocation.  This command is likely to be an interface to that service.
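The status codes and bodies listed for this call could be handled on the client side roughly as follows. This is a sketch: the exception class and function name are illustrative, and the fallback text used when a "detail" field is missing is an assumption of the sketch.

```python
import json


class GclCreateError(Exception):
    """Raised for the 400, 409, and 500 responses described above."""

    def __init__(self, status: int, detail: str):
        super().__init__(f"{status}: {detail}")
        self.status = status
        self.detail = detail


def interpret_create_response(status: int, body: str) -> dict:
    """Return the parsed 201 body, or raise GclCreateError with the "detail" text."""
    payload = json.loads(body) if body else {}
    if status == 201:
        return {
            "gcl_name": payload["gcl_name"],
            "gdplogd_name": payload["gdplogd_name"],
        }
    # Any other status is treated as a failure; "detail" may be absent.
    raise GclCreateError(status, payload.get("detail", "no detail supplied"))
```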


Name

PUT /gdp/v1/gcl — Create a new GCL with a specified name 

Request Data

    Content-Type: application/json;

{ "external-name": <user_selected_name> , optional gcl-create parameters }

The optional gcl-create parameters are supplied as zero or more additional comma-delimited key/value pairs in the same JSON object. For example:

    Content-Type: application/json;

{ "external-name": "edu.berkeley.eecs.swarmlab.test.log00", "-C": "creator_name@berkeley.edu" }

The parameter keys supported by the RESTful server are "-C", "-h", "-k", "-b", "-c", and "-s"; each should be paired with a value that is valid for that key. Consult the gcl-create documentation or man page for detailed parameter help. Metadata may optionally be specified using the "META" key, which is unique to the RESTful interface, paired with a JSON array of one or more "<metadataname>=<metadata>" strings.

Response Data

201 Created

    Content-Type: application/json;

{ "gcl_name": <gcl_id>,
"gdplogd_name": <gdplogd_name> }

400 Bad Request

    Content-Type: application/json;

{ "detail": <detail> }

409 Conflict

    Content-Type: application/json;

{ "detail": "external-name already exists on gdplogd server" }

500 Internal Server Error

    Content-Type: application/json;

{ "detail": <detail>
[, optional diagnostic key/value pairs] }

Description

Creates a new GCL under a user-assigned name and returns the information about the creation of that GCL, or reports a conflict if the name already exists.
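The two creation calls differ only in the HTTP method and in whether "external-name" is null, so one helper can build either request. The sketch below only constructs the request object and does not send it. The function name is illustrative, and the host used in the usage note is a placeholder.

```python
import json
import urllib.request
from typing import Mapping, Optional, Sequence


def build_create_request(
    base_url: str,
    external_name: Optional[str] = None,
    params: Optional[Mapping[str, str]] = None,
    meta: Optional[Sequence[str]] = None,
) -> urllib.request.Request:
    """Build a POST (random name) or PUT (specified name) request for /gdp/v1/gcl."""
    body = {"external-name": external_name}  # None serializes to JSON null
    body.update(params or {})                # e.g. {"-C": "creator_name@berkeley.edu"}
    if meta:
        body["META"] = list(meta)            # "<metadataname>=<metadata>" strings
    method = "POST" if external_name is None else "PUT"
    return urllib.request.Request(
        base_url.rstrip("/") + "/gdp/v1/gcl",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method=method,
    )
```

For example, build_create_request("https://gdp.example.org", "edu.berkeley.eecs.swarmlab.test.log00", {"-C": "creator_name@berkeley.edu"}) yields a PUT request whose body matches the example above; passing the result to urllib.request.urlopen would send it.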


Name

DELETE /gdp/v1/gcl — delete a GCL [NOT IMPLEMENTED; see Notes]

Request Data

    Content-Type: application/json;

{ "external-name": <user_selected_name> }

Response Data

204 No Content

401 Unauthorized

404 Not Found

Description

Notes

Currently the GDP does not support log deletion.  Until it does, this call will not be implemented.



Name

POST /gdp/v1/gcl/<gcl_name> — add a record to specified GCL

Request Data

(Opaque data to be appended, but recommended to use JSON)

Response Data

201 Created

    Content-Type: application/json; type=gdp/response

{ "gcl_id": <gcl_id>, "recno": <integer>,
"timestamp": <commit timestamp>,
... }

401 Unauthorized

404 Not Found

Description

Adds a record to the named GCL.  The information returned shows the GDP-assigned metadata associated with the new record.


Name

GET /gdp/v1/gcl/<gcl_name>?recno=<#> — return a specified record

Request Data

none

Response Data

200 OK

    Content-Type: <as specified as metadata during GCL creation>
    GDP-Record-Number: <recno>
    GDP-Commit-Timestamp: <timestamp>

 <opaque data as written by POST>

404 Not Found

Description

Returns the contents of the record indicated by recno.  As a special case, if recno is the text "last" it returns the last (most recently written) record.

Note that the metadata is included in the response header, not in the data itself, in order to maintain the opacity of that data. Question: should we move the metadata into the header for other commands as well to maintain symmetry?

This call is not orthogonal to the others because it does not assume that the data is application/json.
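Because the record metadata travels in the response headers while the body stays opaque, a client has to read the two separately. The sketch below builds the request URL and splits a response into metadata and raw bytes. The helper names, the placeholder host in the usage note, and the fallback content type used when the header is absent are assumptions of the sketch.

```python
from typing import Mapping, NamedTuple, Union
from urllib.parse import quote, urlencode


class GdpRecord(NamedTuple):
    recno: int
    timestamp: str
    content_type: str
    data: bytes  # left untouched: the record body is opaque


def record_url(base_url: str, gcl_name: str, recno: Union[int, str] = "last") -> str:
    """Build the URL for one record; slashes in the GCL name are preserved."""
    path = "/gdp/v1/gcl/" + quote(gcl_name, safe="/")
    return base_url.rstrip("/") + path + "?" + urlencode({"recno": recno})


def parse_record_response(headers: Mapping[str, str], body: bytes) -> GdpRecord:
    """Pull the GDP metadata out of the headers without inspecting the body."""
    lowered = {k.lower(): v for k, v in headers.items()}  # header names are case-insensitive
    return GdpRecord(
        recno=int(lowered["gdp-record-number"]),
        timestamp=lowered["gdp-commit-timestamp"],
        content_type=lowered.get("content-type", "application/octet-stream"),
        data=body,
    )
```

Calling record_url("https://gdp.example.org", "edu.berkeley/test") with no record number requests the special value "last".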


Name

GET /gdp/v1/gcl/<gcl_name>?recno=<#>&nrecs=<#> — get a series of records

Request Data

none

Response Data

200 OK

[
{
"recno": <integer>,
"timestamp": <timestamp>,
"value": <record value>
},
...
]

404 Not Found

Description

Returns a sequence of up to nrecs records starting from recno encoded as an array of JSON objects.  If nrecs is zero, all data from recno to the end is returned.  The recno parameter is optional and defaults to 1.

The nrecs parameter is not implemented at this time.
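Once implemented, the array response could be decoded along these lines. This is a sketch with illustrative names. Sorting by record number is a defensive choice of the sketch, since this document does not state the order of the array.

```python
import json
from typing import Any, List, NamedTuple


class SeriesRecord(NamedTuple):
    recno: int
    timestamp: str
    value: Any  # whatever JSON value the record holds


def decode_record_series(body: str) -> List[SeriesRecord]:
    """Turn the JSON array of record objects into a list ordered by recno."""
    items = json.loads(body)
    if not isinstance(items, list):
        raise ValueError("expected a JSON array of record objects")
    records = [
        SeriesRecord(int(item["recno"]), item["timestamp"], item["value"])
        for item in items
    ]
    return sorted(records, key=lambda r: r.recno)
```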


Name

GET /gdp/v1/gcl/<gcl_id>?recno=<#>&nrecs=<#>&timeout=<seconds> — monitor a GCL

Request Data

none

Response Data

200 OK

401 Unauthorized

404 Not Found

408 Request Timeout

Description

If the indicated GCL does not exist, the GET returns immediately with a 404 error code. Otherwise, if the indicated record number exists, the result is exactly the same as the previous case. If the indicated record number does not exist, this call waits for up to the indicated timeout for that record to appear; if it does, the record is returned in the usual way, otherwise it returns with a 408 Request Timeout response. If the starting record number is not specified, it starts from the beginning.

Notes

The nrecs and timeout parameters are not implemented at this time.

[[Perhaps this should not take recno and nrecs, and just return the next record that appears at the GCL.]]

[[There is some debate about whether this is the correct interface rather than falling back to a non-REST interface (such as WebSockets) for subscriptions. HTTP (and hence REST) isn’t designed to handle spontaneous server to client messages.]]
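From the client's point of view, the status codes of the monitor call suggest a simple retry loop: return on 200, give up on 404 or 401, and ask again on 408. The sketch below takes the HTTP call as an injected function so the loop can be exercised without a server. The function name, the fetch signature, and the max_attempts limit are all assumptions of the sketch.

```python
from typing import Callable, Optional, Tuple

# fetch(recno, timeout_s) performs one monitor GET and returns (status, body).
FetchFn = Callable[[int, float], Tuple[int, bytes]]


def wait_for_record(
    fetch: FetchFn, recno: int, timeout_s: float, max_attempts: int = 3
) -> Optional[bytes]:
    """Return the record body once it appears, or None if every attempt times out."""
    for _ in range(max_attempts):
        status, body = fetch(recno, timeout_s)
        if status == 200:
            return body
        if status == 404:
            raise LookupError("GCL does not exist")
        if status == 401:
            raise PermissionError("not authorized to read this GCL")
        if status != 408:
            raise RuntimeError(f"unexpected HTTP status {status}")
        # 408: the record did not appear within timeout_s; ask again.
    return None
```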


Name

GET /gdp/v1/post/gcl/<gcl_name>?<arguments> — add data to a GCL (not REST compliant)

Description

To be determined.  Probably will create a JSON object including the specified arguments and append that to the GCL, unencrypted of course.


Key-Value Store

The key-value store is implemented as a single GCL that must be formatted as unencrypted JSON data structured as a series of JSON objects.  The "keys" are the field names in that top level object.  When adding data to the KV store an arbitrary number of values may be sent in any one record.  When retrieving data the key name is specified and the most recent value corresponding to that key is returned.

The GCL used to implement the key-value store is named "swarm.rest.kvstore.gcl".  The name may be overridden with the "swarm.rest.kvstore.gclname" administrative runtime parameter.

Note that another way of implementing a KV store is to have an arbitrary GCL, the name of which represents the key, and just get the last (most recent) value to get the current value for that key.  This trades off a potentially large key space for efficiency and lack of clutter.


Name

POST /gdp/v1/kv — add data to key-value store

Description

There is a single key-value store.  All data must be formatted as JSON objects.  POST adds all of the names inside the content to the KV store.  For example a POST with the contents { "a": 1, "b": 2 } adds two values to the store.  A subsequent POST with contents { "a": 3, "c": 4 } updates the value of a, adds a new value c, and leaves b unchanged.


Name

GET /gdp/v1/kv/<key>

Description

Returns the value of <key> in JSON notation.
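The lookup semantics of the key-value store can be modeled as a scan of the underlying log from newest to oldest record. The sketch below is a local model, not the server's implementation; the function name is illustrative and the records are passed in as JSON strings, oldest first.

```python
import json
from typing import Any, Iterable


def kv_lookup(records: Iterable[str], key: str) -> Any:
    """Return the most recent value for key from a log of JSON-object records.

    records is ordered oldest first, as in an append-only log.
    """
    for text in reversed(list(records)):
        obj = json.loads(text)
        if key in obj:
            return obj[key]  # the newest record mentioning the key wins
    raise KeyError(key)
```

With the two POST bodies from the example above as the log, looking up "a" gives 3, "b" gives 2, and "c" gives 4.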



4 Summary

This is just a quick recap of the material presented above.  All URLs begin with "/gdp/v1/".

| Scope | Method | URI Path | Description |
|---|---|---|---|
| GDP | GET | gcl | Lists existing GCLs [405 Method Not Allowed] |
| GDP | POST | gcl | Create new GCL with random name |
| GDP | PUT | gcl | Create new GCL with specified name |
| GDP | DELETE | gcl | Delete GCL [405 Method Not Allowed] |
| GCL | GET | gcl/<id> | GCL <id> read metadata [405 Method Not Allowed] |
| GCL | POST | gcl/<id> | GCL <id> append record |
| GCL | PUT | gcl/<id>?recno=<#> | GCL <id> append record at recno or report conflict [405 Method Not Allowed] |
| GCL | DELETE | gcl/<id> | GCL <id> delete record. Deprecated: GCL is append-only by design. |
| GCL | GET | gcl/<id>?recno=<#> | GCL <id> read record at recno (returns raw data) |
| GCL | GET | gcl/<id>?recno=<#>&nrecs=<#> | GCL <id> read nrecs records starting from recno. Deprecated: violates REST standard (nrecs value is not limited). Use websocket API instead. |
| GCL | GET | gcl/<id>?timeout=<#> | Wait for new data to appear on GCL. Deprecated: violates REST standard (subscription). Use websocket API instead. |
| GCL | GET | put/gcl/<id>?<arguments> | Add JSON record to GCL <id>. Deprecated: violates REST standard (arguments are not limited). |
| RESTful Server | POST | kv | Add JSON information to key-value store |
| RESTful Server | GET | kv/<key> | Return JSON information associated with <key> in the key-value store |


5 Unresolved Issues

Do subscriptions wait until all data is ready to return, or do they return immediately as soon as any data appears?

How are signatures on POST methods handled?