News

GDP: Preview of GDP version 2.1

Added by Eric Allman about 2 months ago

Following is a brief summary of significant user-visible features and changes expected to be in version 2.1 of the Global Data Plane. Note: version 2.1 has not yet been released, and not all of the described features have been pushed to the repository.

This summary is still a rough draft. Everything herein is subject to change.

Overview

Log Naming

One of the major intended properties of the GDP is that readers can verify the provenance of the data that they read. This is predicated on the reader having the correct internal name of the log, which is a SHA-256 of the log metadata. The metadata includes the public key matching the secret key needed to sign data written to the log. Thus, given the internal name of the log, one can guarantee that the metadata matches the desired log, and the data signatures match the key in the metadata.

However, humans don't work well with long binary strings, so there has to be some way of mapping a human-oriented name to the GDP name. Before 2.1, GDP names were computed as the SHA-256 of the human-readable name and these security properties did not apply. The new technique is to use a Human-Oriented Name to GDPname Directory, as described in the next section.

The change in the algorithm used to create the 256-bit GDPname of a log means that in general, the old human-oriented names will no longer work to access logs that were created using a different algorithm. See the What This Means section for a deeper discussion of this issue.
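The difference between the two naming schemes can be sketched roughly as follows. This is an illustration only: the function names and the serialized metadata bytes are hypothetical, and the real GDP metadata encoding is not shown here. It only demonstrates why hashing the metadata lets a reader check that the metadata belongs to the name, while hashing the human-readable string does not.

```python
import hashlib


def legacy_gdpname(human_name: str) -> bytes:
    """Pre-2.1 scheme: SHA-256 of the human-oriented name itself."""
    return hashlib.sha256(human_name.encode("utf-8")).digest()


def gdpname_from_metadata(metadata: bytes) -> bytes:
    """2.1 scheme: SHA-256 of the (already serialized) log metadata."""
    return hashlib.sha256(metadata).digest()


def metadata_matches_name(gdpname: bytes, metadata: bytes) -> bool:
    """Reader-side check: does this metadata hash to the name we asked for?"""
    return hashlib.sha256(metadata).digest() == gdpname


# Hypothetical serialized metadata; the real format is an assumption here.
metadata = b"xid=edu.example.demo.log;owner_pubkey=PLACEHOLDER"
name = gdpname_from_metadata(metadata)
```

Because the metadata carries the public key, a reader who trusts the 256-bit name can also trust the key used to check data signatures. The legacy name says nothing about the metadata, which is why old logs fail this check.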

Human-Oriented Name to GDPname Directory (HONGDS)

Since there is no longer an algorithmic way to derive an internal GDPname from a human-oriented name, this now has to be done using a directory service. This service maintains a database mapping the human-oriented name to the internal GDPname.

The current implementation is a stub that directly accesses a MySQL database from applications. This has several implications:

  • The implementation is easily spoofable; in particular, there is no way for a client to confirm the authenticity of a name mapping.
  • The implementation doesn't scale well, since the database is not distributed or replicated.
  • The implementation isn't durable.

When new logs are created the human-oriented name is inserted into the log metadata, so given just the logs themselves it is theoretically possible to rebuild this database, but this is also not scalable. For now, this service should be considered to be a work in progress.
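A minimal sketch of what such a directory amounts to is shown below. It uses an in-memory SQLite database purely as a stand-in for MySQL so that the example is self-contained; the table name, column names, and helper functions are all assumptions, not the actual HONGDS schema or API.

```python
import hashlib
import sqlite3
from typing import Optional


def open_directory() -> sqlite3.Connection:
    """Create an in-memory stand-in for the name directory (hypothetical schema)."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE human_to_gdp ("
        "hname TEXT PRIMARY KEY, gname BLOB NOT NULL)"
    )
    return db


def add_name(db: sqlite3.Connection, hname: str, gname: bytes) -> None:
    """Register a human-oriented name for a 256-bit GDPname."""
    if len(gname) != 32:
        raise ValueError("a GDPname is 256 bits (32 bytes)")
    with db:  # commits on success, rolls back on error
        db.execute(
            "INSERT INTO human_to_gdp (hname, gname) VALUES (?, ?)",
            (hname, gname),
        )


def resolve(db: sqlite3.Connection, hname: str) -> Optional[bytes]:
    """Look up the GDPname for a human-oriented name, or None if unknown."""
    row = db.execute(
        "SELECT gname FROM human_to_gdp WHERE hname = ?", (hname,)
    ).fetchone()
    return bytes(row[0]) if row else None


db = open_directory()
example_gname = hashlib.sha256(b"hypothetical metadata").digest()
add_name(db, "edu.example.demo.log", example_gname)
```

Note that nothing in this lookup is authenticated: whoever can write to the table controls the mapping, which is the spoofing concern listed above.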

Docker Packaging

  • Easier installation.
  • Multiple versions can easily run in parallel.

Miscellaneous Smaller Changes

There is a new system environment variable GDP_NAME_ROOT that you can use to simulate namespaces. For example, if GDP_NAME_ROOT=edu.berkeley.edu.eecs.eric then specifying a name such as test on the command line will actually search for edu.berkeley.edu.eecs.eric.test. Input names that already have dots will be tried as originally specified before prefixing the qualification.
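The lookup order described here can be sketched as a small function that lists the names to try. The function name is hypothetical, and the handling of undotted names (qualified form only) is one reading of the paragraph above rather than confirmed library behavior.

```python
import os
from typing import List, Optional


def candidate_names(name: str, root: Optional[str] = None) -> List[str]:
    """Return the names to try, in order, for a user-supplied log name.

    If root is None, it is taken from the GDP_NAME_ROOT environment variable.
    """
    if root is None:
        root = os.environ.get("GDP_NAME_ROOT", "")
    if not root:
        # No name root configured: use the name exactly as given.
        return [name]
    qualified = f"{root}.{name}"
    if "." in name:
        # Dotted names are tried as specified before being prefixed.
        return [name, qualified]
    # Assumption: undotted names are only tried in qualified form.
    return [qualified]
```

For example, with a root of edu.berkeley.edu.eecs.eric, the input test yields the single candidate edu.berkeley.edu.eecs.eric.test.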

Logs now have two keypairs: owner and writer. Previously they only had a writer keypair. If no writer keypair exists, the owner keypair will be replicated into the writer keypair. This should be largely transparent.

Results read using asynchronous reads of large data sets will be returned more reliably in version 2.1. Previously, if the network layer reordered results (e.g., due to packet loss) some data might not be delivered to the application.

What This Means For You

Naming Changes

Old logs will not be accessible using the names you are familiar with unless you use compatibility workarounds (see below).

Since existing logs were created using a different algorithm for computing the name, the primary security property (verifying that the name matches the metadata) will fail. Old logs must be accessed without checking this condition. See the Back Compatibility section for more information.

Installation and Configuration

  • Use docker packages.
  • Install MySQL for Human-Oriented Name to GDPname Directory Service (HONGDS). [There should be a script to do this.]
  • Initialize HONGDS database.
  • Create pointer to HONGDS using administrative parameters. See the Details section for more information.

Log Creation

There is a new programmatic API for log creation. In particular, gdp_gin_create has changed substantially. Programs that wish to create logs directly will need to be updated.

Scripts that use gdp-create to create logs will probably not be affected.

The existing Log Creation Service will need to be updated to update HONGDS in addition to its other functions.

Back Compatibility

To ease the transition from the old method of naming logs (using the SHA-256 of the human-oriented name) to the new method (using the SHA-256 of the log metadata), the administrative parameter swarm.gdp.compat.lognames can be set to true.

Existing names can also be made usable by manually adding entries to the HONGDS database.

Details

Log Creation

New and Changed APIs

  • gdp_create_info_new — added.
  • gdp_create_info_free — added.
  • gdp_create_info_add_metadata — added.
  • gdp_create_info_new_owner_key — added.
  • gdp_create_info_new_writer_key — added.
  • gdp_create_info_save_keys — added.
  • gdp_create_info_set_creator — added.
  • gdp_create_info_set_expiration — added. This API will probably change in the future.
  • gdp_create_info_set_owner_key — added.
  • gdp_create_info_set_writer_key — added.
  • gdp_gin_create — new parameters.
  • gdp_name_parse — was gdp_parse_name; new parameters. The old API still exists but is deprecated. Works with the Human to GDP Name Directory.
  • gdp_name_root_set — added.
  • gdp_name_root_get — added.

Changed Applications

There are several updates to the gdp-create command:

  • -K sets owner key location. As before, if it points to a file it should be an existing, previously created secret key; otherwise it should be a directory, into which the secret key will be saved. The default is to look for a subdirectory named KEYS, and if not found use the current directory.
  • The new flag -W is equivalent to -K, but for the writer key.
  • A new flag -w specifies that separate owner and writer keys should be created; by default the owner key is used as the writer key. If -W is specified and points to a file, -w is implied.
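The flag rules above can be summarized in a short sketch. This is not the gdp-create implementation; the function names and the returned labels are hypothetical, and the code only models how a key location and the -w default would be interpreted according to the description.

```python
import tempfile
from pathlib import Path
from typing import Optional, Tuple


def default_key_dir(cwd: Path) -> Path:
    """Default location: a KEYS subdirectory if present, else the current directory."""
    keys = cwd / "KEYS"
    return keys if keys.is_dir() else cwd


def classify_key_location(path: Optional[Path], cwd: Path) -> Tuple[str, Path]:
    """Interpret a -K or -W argument (None means the flag was not given)."""
    if path is None:
        return ("save-into", default_key_dir(cwd))
    if path.is_file():
        # An existing file is taken to be a previously created secret key.
        return ("use-existing", path)
    # Otherwise it should be a directory that receives the new secret key.
    return ("save-into", path)


def separate_writer_key(w_flag: bool, writer_path: Optional[Path]) -> bool:
    """-w is implied when -W points to an existing key file."""
    return w_flag or (writer_path is not None and writer_path.is_file())


# Small demonstration in a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    cwd = Path(tmp)
    no_keys_dir = classify_key_location(None, cwd)
    (cwd / "KEYS").mkdir()
    with_keys_dir = classify_key_location(None, cwd)
    key_file = cwd / "writer.pem"
    key_file.write_text("placeholder, not a real key\n")
    existing = classify_key_location(key_file, cwd)
    implied = separate_writer_key(False, key_file)
```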

Human-Oriented Name to GDPname Directory

Configuration

There are several runtime configuration parameters controlling access to the Human-Oriented Name to GDPname Directory. Only the first of these is likely to be needed in most cases.

  • swarm.gdp.namedb.host — the IP name of the MySQL server host; generally the only parameter that must be set assuming the local server was set up using the standard defaults.
  • swarm.gdp.namedb.user
  • swarm.gdp.namedb.passwd
  • swarm.gdp.namedb.database
  • swarm.gdp.namedb.table

New APIs

  • gdp_name_resolve — accesses the Human to GDP Name Directory (added). Most applications should call gdp_name_parse.
  • gdp_name_update — added.

New Applications

  • gdp-name-add

Security

New and Changed APIs

  • gdp_datum_vrfy — added.
  • gdp_open_info_set_vrfy — new API to turn on read-side proof (signature) validation.
  • Searches for secret keys slightly expanded: swarm.gdp.crypto.key.dir is now a search path.
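Treating swarm.gdp.crypto.key.dir as a search path means each listed directory is checked in turn for the key file. The sketch below shows the idea only; the colon separator, the function name, and the file names are assumptions, and it does not reproduce the actual ep_file_search interface.

```python
import tempfile
from pathlib import Path
from typing import Optional


def find_key_file(search_path: str, filename: str) -> Optional[Path]:
    """Return the first existing file named filename along a search path.

    Assumes a colon-separated list of directories; empty entries are skipped.
    """
    for entry in search_path.split(":"):
        if not entry:
            continue
        candidate = Path(entry) / filename
        if candidate.is_file():
            return candidate
    return None


# Demonstration with two throwaway directories; only the second holds the key.
with tempfile.TemporaryDirectory() as first, tempfile.TemporaryDirectory() as second:
    key = Path(second) / "example-key.pem"
    key.write_text("placeholder, not a real key\n")
    found = find_key_file(f"{first}:{second}", "example-key.pem")
    found_in_second = found == key
    missing = find_key_file(f"{first}:{second}", "no-such-key.pem")
```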

Changed Applications

  • gdp-reader has a -V flag to verify read results.

Miscellaneous

New and Changed APIs

  • ep_dbg_backtrace — takes a file pointer parameter (used to default to the debug file).
  • ep_file_search — resolve a filesystem search path (added).
  • ep_funclist_push — changed parameters for called function.
  • ep_time_diff_usec — added.
  • ep_time_from_nsec — new parameter.
  • ep_time_from_sec — added.
  • ep_time_from_usec — added.
  • ep_time_zero — added.

Changed Applications

  • gdp-reader has a -o flag to set data output location.
  • log-view — for the moment, no longer supported.

Changed Semantics

  • Environment can override administrative parameters (on a compile flag; may not survive the cut due to security issues).
  • Asynchronous results handled better (requires new generation of router). Previously, large asynchronous data reads might terminate early in the presence of network glitches.
  • Some parameter renaming for consistency:
    • swarm.gdp.crypto.md.alg is now swarm.gdp.crypto.digest.alg

GDP: Updated Version of GDP Coming

Added by Eric Allman 6 months ago

We will soon roll out an updated version of the Global Data Plane (GDP) with new functionality and better suited for continuing our research agenda. However, it will not be 100% compatible with the previous version. We'll continue to run the old system in parallel with the new system for a short time, but at some point you will need to do an upgrade if you need continuing service.

If you are using the legacy GDP, please let us know as soon as possible exactly what you are using, notably:

  • The C API library.
  • The Python API.
  • Java and/or Javascript API.
  • Command line clients (gdp-reader, gdp-writer, etc.)
  • The RESTful interface.
  • The Websockets interface.
  • The visualization interface at http://swarmnuc1022.eecs.berkeley.edu/static/.

This information will help us prioritize where we put our effort. Please send feedback to gdp-sysad@lists.eecs.berkeley.edu.

If you are interested in the new version, please subscribe to the news feed at https://gdp.cs.berkeley.edu/redmine/projects/gdp/news to get an announcement when it becomes available.

[To subscribe you will need to be a registered user and then click the "Watch" button on the news page. If you're not already registered, click the "Register" button at the top left of the screen. See the home page (https://gdp.cs.berkeley.edu/) for the registration policy.]

GDP: gdp-02 and gdp-04 unavailable

Added by Eric Allman 6 months ago

The servers gdp-02 and gdp-04 have been down for a couple of days due to network work at BWRC. We are told that the network should be restored by sometime next week (that is, the week of 6/18).

GDP: GDP 0.8.2 released

Added by Eric Allman over 1 year ago

The GDP version number has been bumped to 0.8.2. This is a patch release — there should be no major changes. However, a number of bugs were fixed, so people who use the GDP library and/or applications should update.

GDP: GDP maintenance completed

Added by Eric Allman over 1 year ago

The GDP software on our servers has been updated and log purging and repair is complete. Everything should be back up now.

We attempted to get rid of what appeared to be abandoned logs (over 25,000 of them!). If we accidentally purged your favorite log please let us know and we will restore it.

GDP: GDP down for maintenance Friday 9/8 (morning)

Added by Eric Allman over 1 year ago

On Friday 9/8 the GDP servers (and hence all access to logs) will be down for maintenance. We'll make it as early as possible (probably around 7:00am). It's hard to know how long it will be down, but it will be somewhere in the range of 1–2 hours.

GDP: GDP source code now open

Added by Eric Allman almost 2 years ago

The GDP source code is now open for public access. If you have an account on repo.eecs.berkeley.edu and you have uploaded your ssh key, you can use:

git clone git://repo.eecs.berkeley.edu/projects/swarmlab/gdp.git

For anonymous access, use:

git clone https://repo.eecs.berkeley.edu/git-anon/projects/swarmlab/gdp.git

GDP: Upcoming Service Outage on 22 December

Added by Eric Allman about 2 years ago

The EECS department will be doing a full power shutdown on December 22 from 7:00am to 11:00pm (Pacific Time, 15:00 12/22 to 07:00 12/23 UTC) to perform fire and emergency power testing in the server rooms. This will include the code repository (repo.eecs.berkeley.edu) where the GDP code is hosted, and the EECS network, which will mean that the GDP servers will not be accessible. Please plan accordingly.

See https://iris.eecs.berkeley.edu/news/16954-eecs-full-network-and-service for details and status updates.
