- The Model
- The Present and the Future
- GDP Code and Documentation
- Possible Student Projects
This project covers the core GDP functionality. In particular, it is home to the client-side C library, various language bindings around the C library, the log server, various utilities around it, and a few small projects related to the Urban Heartbeat. We hope eventually to clean it up and migrate components other than the core functionality (such as the Urban Heartbeat material) to their own independent projects. For an overview, see the wiki.
As we discuss in a recent paper, "The Cloud is not Enough: Saving IoT from the Cloud", the widespread practice of constructing swarm applications by connecting directly to the cloud comes with a variety of downsides. With the GDP, we seek an infrastructure that enables important new use cases for the cloud while still integrating smoothly with existing cloud infrastructure.
The Global Data Plane (GDP) provides a data-centric glue for swarm applications. The basic primitive is a secure, single-writer, append-only log. Data inputs are timestamped and ordered by timestamp. Data can be securely committed to the log in a variety of ways, including via an externally consistent transactional model. Data within the log can be read (either randomly or by subscription), thereby permitting a variety of data models, including (eventually) a SQL query model. Further, data within a log can be preserved for the long term.
The GDP consists of append-only logs and a routing layer. Each log is named by an opaque 256-bit number (a "GDPname") having no direct connection with the location of the data. Logs are append-only and consist of a series of records; records consist of a record number, a commit timestamp (for coherency), and variable-sized opaque data. It is important that the data be opaque since (in the longer term) all data should be encrypted, and the GDP will not hold the keys. Logs can be replicated and migrated.
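The log structure described above can be sketched in a few lines of Python. This is an illustrative in-memory model, not the GDP client API: the names `Record`, `AppendOnlyLog`, and `new_gdpname` are hypothetical, record numbers are assumed to start at 1, and the GDPname is simply drawn at random to emphasize that it is opaque and unrelated to where the data lives.

```python
import secrets
import time
from dataclasses import dataclass
from typing import Callable, List


def new_gdpname() -> bytes:
    """Return an opaque 256-bit name (random here; an assumption for illustration)."""
    return secrets.token_bytes(32)


@dataclass(frozen=True)
class Record:
    recno: int        # record number, assigned by the log
    timestamp: float  # commit timestamp
    data: bytes       # opaque payload, ideally already encrypted by the writer


class AppendOnlyLog:
    """Hypothetical in-memory model of a single-writer append-only log."""

    def __init__(self) -> None:
        self.name = new_gdpname()
        self._records: List[Record] = []
        self._subscribers: List[Callable[[Record], None]] = []

    def append(self, data: bytes) -> Record:
        """Add a record at the end; existing records are never modified."""
        rec = Record(len(self._records) + 1, time.time(), bytes(data))
        self._records.append(rec)
        for callback in self._subscribers:
            callback(rec)
        return rec

    def read(self, recno: int) -> Record:
        """Random read by record number (1-based in this sketch)."""
        if not 1 <= recno <= len(self._records):
            raise KeyError(recno)
        return self._records[recno - 1]

    def subscribe(self, callback: Callable[[Record], None]) -> None:
        """Register a callback invoked for every record appended from now on."""
        self._subscribers.append(callback)
```

The log never inspects `data`, which mirrors the requirement that payloads stay opaque so that they can be encrypted with keys the GDP does not hold.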
Routing allows any node in the GDP to find at least one copy of any named entity, given its GDPname. Entities can be logs, but ultimately they may include services and users, so that there is one shared namespace for everything.
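As a rough illustration of location-independent naming, the sketch below keeps a table from GDPnames to the nodes that currently hold a copy. It is a toy, single-process stand-in for the routing layer, not its actual design: `RoutingTable`, `advertise`, `withdraw`, and `locate` are hypothetical names, and nodes are represented by plain strings.

```python
from collections import defaultdict
from typing import Dict, Set


class RoutingTable:
    """Hypothetical flat-namespace lookup: GDPname -> nodes holding a copy."""

    def __init__(self) -> None:
        self._locations: Dict[bytes, Set[str]] = defaultdict(set)

    def advertise(self, gdpname: bytes, node: str) -> None:
        """Record that `node` can serve the entity named `gdpname`."""
        if len(gdpname) != 32:
            raise ValueError("a GDPname is 256 bits (32 bytes)")
        self._locations[gdpname].add(node)

    def withdraw(self, gdpname: bytes, node: str) -> None:
        """Record that `node` no longer serves the entity."""
        self._locations[gdpname].discard(node)

    def locate(self, gdpname: bytes) -> str:
        """Return one node holding a copy, or fail if none is known."""
        nodes = self._locations.get(gdpname)
        if not nodes:
            raise LookupError("no reachable copy of this name")
        return min(nodes)  # deterministic choice, only to keep the sketch simple
```

Because the name says nothing about location, a replica can be added or moved by changing the table alone; clients keep using the same GDPname.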
The GDP implements mechanism but not policy; policy is mediated by a separate Control Plane. For example, the GDP handles the mechanics of replication and migration, but the choice of when and where to replicate is made by a higher-level service that resides in the Control Plane, on the basis of on-the-fly performance monitoring or other criteria.
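The split between mechanism and policy can be sketched as two separate pieces of code. Everything here is hypothetical: `DataPlane` stands in for the GDP and only knows how to add a replica, while `min_replicas_policy` stands in for one possible Control Plane rule that picks the lowest-latency nodes until a target replica count is met. The latency figures are made-up monitoring input.

```python
from typing import Callable, Dict, List, Set


class DataPlane:
    """Mechanism only: knows how to place a replica, not when or where."""

    def __init__(self) -> None:
        self.replicas: Dict[str, Set[str]] = {}

    def create(self, log: str, node: str) -> None:
        self.replicas[log] = {node}

    def replicate(self, log: str, node: str) -> None:
        self.replicas[log].add(node)


# A policy maps (log, current replica nodes, measured latencies) to new nodes.
Policy = Callable[[str, Set[str], Dict[str, float]], List[str]]


def min_replicas_policy(target: int) -> Policy:
    """Example policy: reach `target` replicas, preferring low-latency nodes."""
    def decide(log: str, current: Set[str], latency_ms: Dict[str, float]) -> List[str]:
        candidates = sorted(
            (node for node in latency_ms if node not in current),
            key=latency_ms.get,
        )
        return candidates[: max(0, target - len(current))]
    return decide


def control_plane_step(plane: DataPlane, policy: Policy,
                       latency_ms: Dict[str, float]) -> None:
    """Ask the policy what to do, then drive the data plane's mechanism."""
    for log, current in plane.replicas.items():
        for node in policy(log, set(current), latency_ms):
            plane.replicate(log, node)
```

Swapping in a different `Policy` function changes where replicas go without touching `DataPlane` at all.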
Access control is based on cryptography. Write (append) access is controlled by requiring valid writers to sign each message; the GDP itself holds the public keys of authorized writers for verification. There is no read or subscribe access control as such; the privacy of the data depends on the data being encrypted.
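The following sketch illustrates signature-checked appends with unrestricted reads. The source does not say which signature scheme the GDP uses; to stay within the Python standard library, this example swaps in a Lamport one-time signature built on SHA-256. That is an assumption made purely for illustration: a Lamport key must sign only one message, so a real deployment would use a standard reusable scheme instead. `SignedLog` and the helper names are hypothetical.

```python
import hashlib
import secrets


def _h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def _digest_bits(message: bytes):
    """The 256 bits of SHA-256(message), most significant bit first."""
    digest = _h(message)
    return [(digest[i // 8] >> (7 - i % 8)) & 1 for i in range(256)]


def keygen():
    """Lamport one-time key pair: 256 pairs of secrets and their hashes."""
    private = [(secrets.token_bytes(32), secrets.token_bytes(32)) for _ in range(256)]
    public = [(_h(a), _h(b)) for a, b in private]
    return private, public


def sign(private, message: bytes):
    """Reveal one secret per digest bit (use each key for one message only)."""
    return [private[i][bit] for i, bit in enumerate(_digest_bits(message))]


def verify(public, message: bytes, signature) -> bool:
    if len(signature) != 256:
        return False
    return all(_h(signature[i]) == public[i][bit]
               for i, bit in enumerate(_digest_bits(message)))


class SignedLog:
    """Hypothetical log that holds only the authorized writer's public key."""

    def __init__(self, writer_public_key) -> None:
        self._writer_public_key = writer_public_key
        self._records = []

    def append(self, data: bytes, signature) -> int:
        """Accept the record only if the writer's signature verifies."""
        if not verify(self._writer_public_key, data, signature):
            raise PermissionError("signature does not match the authorized writer")
        self._records.append(data)
        return len(self._records)

    def read(self, recno: int) -> bytes:
        """No read-side check: confidentiality relies on the data being encrypted."""
        if not 1 <= recno <= len(self._records):
            raise KeyError(recno)
        return self._records[recno - 1]
```

The log server never sees a private key: it can reject forged appends, yet it cannot produce a valid record itself.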
Higher-level services may be layered on the GDP; for example, a service might combine the results of multiple logs into another log, or copy log data to specialized databases. In these cases, the original logs are the "base truth", with everything else being a form of cache. Note, however, that these services are not part of the GDP, but rather are users of it.
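A minimal sketch of such a layered service is shown below: it combines several logs into one derived sequence ordered by commit timestamp. The representation of a record as a `(timestamp, data)` tuple and the name `merge_logs` are assumptions made for the example, and each input log is assumed to be already ordered by timestamp.

```python
import heapq
from typing import Iterable, List, Tuple

Record = Tuple[float, bytes]  # (commit timestamp, opaque data)


def merge_logs(*logs: Iterable[Record]) -> List[Record]:
    """Combine timestamp-ordered logs into one timestamp-ordered derived log.

    The inputs are only read, never modified, so the result can always be
    discarded and rebuilt from the original logs.
    """
    return list(heapq.merge(*logs, key=lambda rec: rec[0]))
```

Since the merged output is a pure function of the source logs, it behaves like a cache: losing it costs recomputation, not data.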
The GDP needs to be self-healing and resistant to attack. Network partitions might in severe cases result in inaccessible data, but single node failure should not, and in no case should the GDP suffer catastrophic failure, even in the face of some nodes being compromised.
The Present and the Future
The prototype implementation of the GDP has limited functionality: a single-instance daemon with basic read/subscribe/publish primitives. Data is completely opaque (this is a feature, since the intent is that all data will be encrypted), and the only metadata is a record number and a commit timestamp. There is no access control on either read or write.
There are several short- or medium-term projects that are either in progress or will start soon. See https://gdp.cs.berkeley.edu/redmine/projects/gdp/wiki/GDP_Task_List for details.
GDP Code and Documentation
The initial prototype of the Global Data Plane code is available on the U.C. Berkeley EECS repository at the following URI:
* https://repo.eecs.berkeley.edu/git-anon/projects/swarmlab/gdp.git (anonymous)
Possible Student Projects
See GDP Project List for summaries of possible projects that should be "student sized".