adstream.data backend

Creating backend services compatible with adstream.data

Basic requirements

A compatible backend service has to comply with the adstream.data protocol specification and be accessible via a URL from the same security domain as the Web page hosting the application. Compatible services can be implemented using any backend technology capable of the following (a minimal sketch follows the list):

  1. Handling primary REST verbs — GET, POST, PUT and DELETE;
  2. Processing JSON request bodies where applicable;
  3. Generating JSON responses (with content type application/json).
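
The sketch below illustrates these three capabilities using only the Python standard library; it is not part of either framework, and the payload it returns is a placeholder rather than a real adstream.data envelope.

    import json
    from http.server import BaseHTTPRequestHandler, HTTPServer

    class Handler(BaseHTTPRequestHandler):
        """Toy service: handles the primary REST verbs and speaks JSON."""

        def reply(self, payload):
            data = json.dumps(payload).encode('utf-8')
            self.send_response(200)
            self.send_header('Content-Type', 'application/json')
            self.send_header('Content-Length', str(len(data)))
            self.end_headers()
            self.wfile.write(data)

        def do_GET(self):                       # primary REST verbs...
            self.reply({'path': self.path})

        def do_PUT(self):                       # ...with a JSON request body where applicable
            length = int(self.headers.get('Content-Length', 0))
            body = json.loads(self.rfile.read(length) or b'{}')
            self.reply({'path': self.path, 'received': body})

        do_POST, do_DELETE = do_PUT, do_GET     # same toy behaviour for the other verbs

    if __name__ == '__main__':
        HTTPServer(('', 8080), Handler).serve_forever()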

Using a backend framework

To support the features provided by adstream.data, the protocol and its data envelopes have to be relatively complex. While it is quite possible to reduce this complexity by complying with the specification only partially and ensuring that the front end code avoids the unsupported parts, a better alternative is to offload it onto reusable framework components. At present, there are two compliant backend frameworks available as part of adstream.data:

  1. Python library adstream.data compatible with a variety of Python Web backends and application frameworks, including Twisted and Django;
  2. Javascript library adstream-data implemented as a Connect middleware module for Node.js.

The two libraries share a similar design but have somewhat dissimilar APIs. This page is dedicated to the conceptually similar parts; the API details are provided in the Backend API Reference: Python and the Backend API Reference: Node.js, respectively.

Backend framework concepts

URL schema

The backend service provides the data objects that the front end application code operates upon; the objects are organized into a hierarchical REST urlspace according to a schema shared between the two layers. At the moment, neither of the backend libraries shares the actual schema definition with the front end library: the Python framework uses an explicit schema constructed as a class definition; the Node.js framework uses an implicit schema derived from the routing rules the service code sets up; the front end adstream.data expects the application code to build a schema from library-provided object classes. Nevertheless, the logical URL schema is the same on the front end and the backend, and the services associate their request processing handlers with the objects from the schema.
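
As a framework-neutral illustration (it is not either library's API), the logical schema for the item/subitem example used later on this page can be written down as a simple nested mapping; the Python library expresses the same tree as class definitions, the Node.js library derives it from its routing rules, and the front end builds it from library-provided object classes.

    # Framework-neutral sketch of a logical URL schema; not either library's API.
    SCHEMA = {
        'item': {                 # container addressed as item/<id>
            'subitem': {},        # container addressed as item/<id>/subitem/<id>
        },
    }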

Object types and keys

According to the design of adstream.data, objects in the tree described by the URL schema may have either:

  • A fixed number of named children of potentially different structure (applies to Node and Object), or
  • A variable number of identically structured children, identified by dynamically assigned item IDs (applies to Container).

This structure of the urlspace makes it possible to associate an equivalent of an object type with every URL by replacing all item IDs with a placeholder symbol, e.g. *. Thus, objects at URLs item/1/subitem/2 and item/2/subitem/1 would have the identical type item/*/subitem/*. All objects of this type are expected to describe similar logical entities and, more importantly, to be handled by the same code. The sequence of item IDs from the URL is then sufficient to identify an object within its type. This sequence is called a key — in the example above, the keys are [1,2] and [2,1] respectively.
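
The decomposition into a type and a key is easy to illustrate with a few lines of standalone Python (not part of either library). For simplicity, the sketch assumes that every level of the example URL is a container whose children are addressed by item IDs, and it keeps the IDs as strings:

    def type_and_key(url):
        """Replace item IDs with '*' to obtain the type; collect them in order to obtain the key."""
        parts = url.strip('/').split('/')
        type_parts, key = [], []
        for name, item_id in zip(parts[0::2], parts[1::2]):
            type_parts += [name, '*']
            key.append(item_id)
        if len(parts) % 2:                      # trailing container name without an item ID
            type_parts.append(parts[-1])
        return '/'.join(type_parts), key

    assert type_and_key('item/1/subitem/2') == ('item/*/subitem/*', ['1', '2'])
    assert type_and_key('item/2/subitem/1') == ('item/*/subitem/*', ['2', '1'])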

Note that while the Node.js version of the framework only uses these object types implicitly, the Python version actually requires the user to associate each object type with a named Python class.

Indirect routing

The adstream.data protocol is designed to request data — or modify state — of multiple server-side objects in a single HTTP request. The scope of each request is thus controlled by the front end application code, and the same server-side object may be affected by many different request types, directed to the URL of the object itself or to the URLs of its parent objects. Direct association of HTTP URLs with service handlers — usually known as request routing — therefore makes little sense in this context: the code specific to each object would either have to be replicated across multiple handlers or be organized into a separate layer, and the handlers would have to determine on their own the subset of objects affected by each request.
Instead, the framework takes the latter burden upon itself by providing an indirect routing facility:

  • Handlers implemented by the service code are specific to the object type (place of the object in the schema) and to the actual operation to be performed upon the object: get, create, update or delete;
  • Handlers may be set up to operate on individual objects — where traversing through items/1 and items/2 would call the appropriate handler twice — or on object batches — where in the same situation the handler is only called once with both objects passed in as parameters (see the sketch after this list);
  • The framework interprets all requests by traversing the logical tree of objects downwards from the point indicated by the URL in the request, taking into account the REST verb, content of the request body and/or the predefined URL parameters (depth for GET requests, version for DELETE requests);
  • The framework makes the determination of handlers to be called, batching up the calls where appropriate;
  • The handlers are called in parent-first order. The Python library executes all handlers synchronously; the Node.js library executes them asynchronously, assuming that child handlers depend on the results of parent handlers.
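
The first two points above can be illustrated with a self-contained sketch. It is a much simplified stand-in for the real frameworks — the registration decorator, the handler signatures and the run() dispatcher are all invented for the illustration — but it shows handlers keyed by object type and operation rather than by concrete URL, and the difference between individual and batch handlers:

    handlers = {}   # (object type, operation) -> (handler function, batch flag)

    def handler(obj_type, operation, batch=False):
        """Register a handler for an object type and operation (hypothetical API)."""
        def register(fn):
            handlers[(obj_type, operation)] = (fn, batch)
            return fn
        return register

    @handler('items/*', 'get')                      # individual handler: one call per key
    def get_item(key, request, response):
        response['items/%s' % key[0]] = {'id': key[0]}

    @handler('items/*', 'update', batch=True)       # batch handler: one call for all keys
    def update_items(keys, request, response):
        for key in keys:
            url = 'items/%s' % key[0]
            response[url] = request[url]            # echo the updated representations back

    def run(obj_type, operation, keys, request=None):
        """Call the registered handler once per object, or once per batch."""
        fn, batch = handlers[(obj_type, operation)]
        response = {}
        if batch:
            fn(keys, request, response)
        else:
            for key in keys:
                fn(key, request, response)
        return response

    print(run('items/*', 'get', [['1'], ['2']]))
    # -> {'items/1': {'id': '1'}, 'items/2': {'id': '2'}}

With these registrations, traversing items/1 and items/2 with a get calls get_item twice, while the same traversal with an update calls update_items once with both keys.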

Handler execution context

In addition to call-specific parameters such as the keys of objects to operate upon, each handler also has access to two shared data structures — request and response. The first one logically corresponds to a combination of the request body and URL parameters mapped to the object schema; the second one corresponds to the body of the response. Both structures provide similar APIs allowing the handler code to access individual object representations by their URLs or by their types and keys. The response is where the results of handler execution are accumulated; each handler is expected to modify the response accordingly. The request is sometimes modified as a side effect of handler execution but is best thought of as a logically immutable input parameter.
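
A much simplified stand-in for these two structures is sketched below. The class and its by_url/by_type/set methods are invented for the illustration (the real APIs are documented in the Backend API References), but they show the two ways of addressing an object representation:

    class ObjectSet:
        """Toy request/response structure: URL -> JSON-serializable representation."""
        def __init__(self, objects=None):
            self._objects = dict(objects or {})

        def by_url(self, url):
            return self._objects.get(url)

        def by_type(self, obj_type, key):
            # Substitute the key's item IDs for the '*' placeholders in the type
            key = list(key)
            url = '/'.join(str(key.pop(0)) if part == '*' else part
                           for part in obj_type.split('/'))
            return self.by_url(url)

        def set(self, url, value):
            self._objects[url] = value          # handlers accumulate their results here

    request = ObjectSet({'item/1/subitem/2': {'name': 'updated name'}})
    response = ObjectSet()

    def update_subitem(key, request, response):
        incoming = request.by_type('item/*/subitem/*', key)    # read the input representation
        response.set('item/%s/subitem/%s' % tuple(key), incoming)

    update_subitem(['1', '2'], request, response)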

Error handling

All versions of the library provide facilities for the handler code to signal processing errors, specifying the HTTP status code to be returned and the associated information (i.e. the error message). The libraries always send back a valid JSON packet in case of an error. HTTP status code 409 (Conflict) is expected to be used when a modification conflict is detected from the version metadata: the library will automatically invoke the appropriate get handlers as part of handling the error and package their results along with the error-related information.
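
As a rough sketch (the actual error-signalling calls differ between the two libraries and are documented in the Backend API References), a handler might report a version conflict as shown below; the ServiceError class, the STORE dictionary and the _version field are all assumptions made for the illustration:

    STORE = {'1': {'name': 'first item', '_version': 3}}    # toy in-memory storage

    class ServiceError(Exception):
        """Signals a processing error together with the HTTP status to return (hypothetical)."""
        def __init__(self, status, message):
            super().__init__(message)
            self.status = status

    def update_item(key, incoming):
        stored = STORE[key[0]]
        if incoming.get('_version') != stored['_version']:
            # Version metadata does not match: report 409 (Conflict); the framework
            # then packages the current object state along with the error information.
            raise ServiceError(409, 'items/%s was modified by another client' % key[0])
        stored.update(incoming)
        stored['_version'] += 1
        return stored

    try:
        update_item(['1'], {'name': 'new name', '_version': 2})
    except ServiceError as err:
        print(err.status, err)          # -> 409 items/1 was modified by another client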

Backend service structure

An adstream.data-compliant service created with one of the backend frameworks is a combination of:

  • Schema definition that associates service handlers with objects in the urlspace and, in the case of the Python library, defines how the objects are serialized and deserialized;
  • Handlers specific to types of objects and operations upon them;
  • Glue code that directs control from the external Web application infrastructure to the framework (illustrated by the sketch below).
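
Putting the sketches from this page together, the glue code for a plain WSGI deployment might look roughly as follows. It is still a simplified stand-in for the real frameworks: it reuses the type_and_key() and run() helpers sketched in the sections above, and the mapping of REST verbs to operations is an assumption made for the illustration.

    import json
    from wsgiref.simple_server import make_server

    def app(environ, start_response):
        """WSGI glue: translate the HTTP request and hand it to the dispatcher."""
        url = environ['PATH_INFO'].strip('/')
        operation = {'GET': 'get', 'POST': 'create',
                     'PUT': 'update', 'DELETE': 'delete'}[environ['REQUEST_METHOD']]
        obj_type, key = type_and_key(url)           # sketched under "Object types and keys"
        result = run(obj_type, operation, [key])    # sketched under "Indirect routing"
        data = json.dumps(result).encode('utf-8')
        start_response('200 OK', [('Content-Type', 'application/json')])
        return [data]

    if __name__ == '__main__':
        make_server('', 8080, app).serve_forever()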