Kafka Protocol Binding for CloudEvents - Version 1.0

Abstract

The Kafka Protocol Binding for CloudEvents defines how events are mapped to Kafka messages.

Table of Contents

  1. Introduction
  2. Use of CloudEvents Attributes
  3. Kafka Message Mapping
  4. References

1. Introduction

CloudEvents is a standardized and protocol-agnostic definition of the structure and metadata description of events. This specification defines how the elements defined in the CloudEvents specification are to be used in the Kafka protocol as Kafka messages (aka Kafka records).

1.1. Conformance

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC2119.

1.2. Relation to Kafka

This specification does not prescribe rules constraining transfer or settlement of event messages with Kafka; it solely defines how CloudEvents are expressed in the Kafka protocol as Kafka messages.

1.3. Content Modes

The specification defines two content modes for transferring events: structured and binary.

In the structured content mode, event metadata attributes and event data are placed into the Kafka message value section using an event format.

In the binary content mode, the value of the event data MUST be placed into the Kafka message's value section as-is, with the content-type header value declaring its media type; all other event attributes MUST be mapped to the Kafka message's header section.

Implementations that use Kafka 0.11.0.0 and above MAY use either binary or structured mode. Implementations that use Kafka 0.10.x.x and below MUST use only structured mode and encode the event in JSON, because those older versions of Kafka lack support for message-level headers.
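The version rule above can be sketched as a small helper. This is an illustration only: the function name `allowed_content_modes` and the dotted version-string input are assumptions made for this example, not part of the specification or of any Kafka client API.

```python
from typing import Set


def allowed_content_modes(kafka_version: str) -> Set[str]:
    """Return the content modes usable with the given Kafka version.

    Hypothetical helper: the version is assumed to be a dotted string
    such as "0.10.2.1" or "0.11.0.0".
    """
    parts = tuple(int(p) for p in kafka_version.split("."))
    # Message-level headers exist from 0.11.0.0 onward, so binary mode
    # (which relies on headers) is only possible from that version.
    if parts >= (0, 11):
        return {"binary", "structured"}
    return {"structured"}
```

For a version below 0.11, the only returned mode is structured, which per this section is paired with the JSON event format.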

1.4. Event Formats

Event formats, used with the structured content mode, define how an event is expressed in a particular data format. All implementations of this specification MUST support the JSON event format.

1.5. Security

This specification does not introduce any new security features for Kafka, or mandate specific existing features to be used.

2. Use of CloudEvents Attributes

This specification does not further define any of the CloudEvents event attributes.

2.1. data

data is assumed to contain opaque application data that is encoded as declared by the datacontenttype attribute.

An application is free to hold the information in any in-memory representation of its choosing, but once the value is transposed into Kafka as defined in this specification, core Kafka makes the data available as a sequence of bytes.

For instance, if the declared datacontenttype is application/json;charset=utf-8, the expectation is that the data value is made available as UTF-8 encoded JSON text.
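As a minimal sketch of that expectation, the following turns an in-memory value into the byte sequence that would be handed to Kafka. The helper name `encode_json_data` is hypothetical, and the choice of Python's standard `json` module is an assumption for illustration.

```python
import json
from typing import Any


def encode_json_data(payload: Any) -> bytes:
    """Serialize an in-memory value as UTF-8 encoded JSON text.

    Hypothetical helper for a datacontenttype of
    application/json;charset=utf-8.
    """
    # ensure_ascii=False keeps non-ASCII characters as-is so that the
    # UTF-8 encoding step is what actually produces the bytes.
    return json.dumps(payload, ensure_ascii=False).encode("utf-8")
```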

3. Kafka Message Mapping

With Kafka 0.11.0.0 and above, the content mode is chosen by the sender of the event. Protocol usage patterns that might allow solicitation of events using a particular content mode might be defined by an application, but are not defined here.

The receiver of the event can distinguish between the two content modes by inspecting the content-type header of the Kafka message. If the header is present and its value is prefixed with the CloudEvents media type application/cloudevents, indicating the use of a known event format, the receiver uses structured mode; otherwise, it defaults to binary mode.

If a receiver finds a CloudEvents media type as per the above rule, but with an event format that it cannot handle, for instance application/cloudevents+avro, it MAY still treat the event as binary and forward it to another party as-is.

If the content-type header is not present then the receiver uses structured mode with the JSON event format.
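The detection rules above could be sketched as follows. This is not a normative algorithm: the function name `detect_content_mode` is hypothetical, headers are assumed to arrive as a mapping from header name to raw bytes, and the case-insensitive comparison of the media type is an assumption of this sketch.

```python
from typing import Mapping

CLOUDEVENTS_PREFIX = "application/cloudevents"


def detect_content_mode(headers: Mapping[str, bytes]) -> str:
    """Return "structured" or "binary" for a received Kafka message.

    Hypothetical helper; `headers` maps header names to raw byte values.
    """
    raw = headers.get("content-type")
    if raw is None:
        # No content-type header: structured mode with the JSON event format.
        return "structured"
    # Assumption: media types are compared case-insensitively.
    media_type = raw.decode("utf-8").strip().lower()
    if media_type.startswith(CLOUDEVENTS_PREFIX):
        # A CloudEvents media type signals an event format in the value.
        # A receiver that cannot handle the format MAY still forward the
        # message as-is; that decision is left to the caller.
        return "structured"
    return "binary"
```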

3.1. Key Attribute

The 'key' attribute is populated by a partitionKeyExtractor function. The partitionKeyExtractor is a protocol specific function that contains bespoke logic to extract and populate the value. A default implementation of the extractor will use the Partitioning extension value.
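A default extractor along these lines might look like the sketch below. It assumes the event is available as a mapping of attribute names to values and that the Partitioning extension value is stored under the attribute name `partitionkey`; the function name, the UTF-8 encoding of the key, and the choice to return `None` when the extension is absent are all assumptions of this example.

```python
from typing import Any, Mapping, Optional


def default_partition_key_extractor(event: Mapping[str, Any]) -> Optional[bytes]:
    """Derive the Kafka message key from the Partitioning extension.

    Hypothetical default; applications can substitute their own
    extractor with bespoke logic.
    """
    value = event.get("partitionkey")
    if value is None:
        # Assumption: without the extension the message is sent with no key.
        return None
    return str(value).encode("utf-8")
```

An application that partitions on some other attribute, for example `source`, would supply its own function with the same shape.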

3.2. Binary Content Mode

The binary content mode accommodates any shape of event data and allows for efficient transfer without transcoding effort.

3.2.1. Content Type

For the binary mode, the header content-type property MUST be mapped directly to the CloudEvents datacontenttype attribute.

3.2.2. Event Data Encoding

The data byte-sequence MUST be used as the value of the Kafka message.

3.2.3. Metadata Headers

All CloudEvents attributes and CloudEvents attribute extensions, with the exception of data, MUST be individually mapped to and from the header fields in the Kafka message.

3.2.3.1 Property Names

CloudEvent attributes are prefixed with ce_ for use in the message-headers section.

Examples:

* `time` maps to `ce_time`
* `id` maps to `ce_id`
* `specversion` maps to `ce_specversion`

3.2.3.2 Property Values

The value for each Kafka header is constructed from the respective header's Kafka representation, compliant with the Kafka message format specification.
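The naming rule for binary mode can be sketched as below. The function name `to_binary_message` is hypothetical, attribute values are assumed to already be strings, and encoding header values as UTF-8 bytes is an assumption of this sketch rather than something this section mandates.

```python
from typing import Dict, Mapping, Tuple


def to_binary_message(
    attributes: Mapping[str, str], data: bytes
) -> Tuple[Dict[str, bytes], bytes]:
    """Build Kafka headers and value for a binary-mode message.

    Hypothetical helper; `attributes` holds the event's context
    attributes as strings and `data` is the event data as bytes.
    """
    headers: Dict[str, bytes] = {}
    for name, value in attributes.items():
        if name == "data":
            # The data never goes into a header; it is the message value.
            continue
        if name == "datacontenttype":
            # datacontenttype is carried by the content-type header.
            headers["content-type"] = value.encode("utf-8")
        else:
            # Every other attribute gets the ce_ prefix.
            headers["ce_" + name] = value.encode("utf-8")
    return headers, data
```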

3.2.4 Example

This example shows the binary mode mapping of an event into a Kafka message. The event data is carried in the message value, datacontenttype is carried in the content-type header, and all other CloudEvents attributes are mapped to Kafka header fields with the prefix ce_.

Note that content-type here refers to the event data content carried in the payload.

------------------ Message -------------------

Topic Name: mytopic

------------------- key ----------------------

Key: mykey

------------------ headers -------------------

ce_specversion: "1.0"
ce_type: "com.example.someevent"
ce_source: "/mycontext/subcontext"
ce_id: "1234-1234-1234"
ce_time: "2018-04-05T03:56:24Z"
content-type: application/avro
       .... further attributes ...

------------------- value --------------------

            ... application data encoded in Avro ...

-----------------------------------------------
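Reading such a message back could be sketched as follows, using the header names from the example above. The function name `from_binary_message` is hypothetical, header values are assumed to be UTF-8 encoded bytes, and ignoring headers without the ce_ prefix is an assumption of this sketch.

```python
from typing import Dict, Mapping, Tuple


def from_binary_message(
    headers: Mapping[str, bytes], value: bytes
) -> Tuple[Dict[str, str], bytes]:
    """Recover event attributes and data from a binary-mode message.

    Hypothetical helper; returns (attributes, data).
    """
    prefix = "ce_"
    attributes: Dict[str, str] = {}
    for name, raw in headers.items():
        if name == "content-type":
            # content-type describes the data in the message value.
            attributes["datacontenttype"] = raw.decode("utf-8")
        elif name.startswith(prefix):
            attributes[name[len(prefix):]] = raw.decode("utf-8")
        # Assumption: any other header is not part of the event.
    return attributes, value
```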

3.3. Structured Content Mode

The structured content mode keeps event metadata and data together in the payload, allowing simple forwarding of the same event across multiple routing hops, and across multiple protocols.

3.3.1. Kafka Content-Type

If present, the Kafka message header property content-type MUST be set to the media type of an event format.

Example for the JSON format:

content-type: application/cloudevents+json; charset=UTF-8

3.3.2. Event Data Encoding

The chosen event format defines how all attributes, and data, are represented.

The event metadata and data are then rendered in accordance with the event format specification, and the resulting data becomes the value of the Kafka message.
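For the JSON event format, the rendering step might be sketched like this. The function name `to_structured_message` is hypothetical, the event is assumed to be a JSON-serializable mapping that already contains its attributes and data, and the use of Python's standard `json` module is an illustration only.

```python
import json
from typing import Any, Dict, Mapping, Tuple


def to_structured_message(event: Mapping[str, Any]) -> Tuple[Dict[str, bytes], bytes]:
    """Build Kafka headers and value for a structured-mode JSON message.

    Hypothetical helper; returns (headers, value).
    """
    # The content-type header names the event format, not the data's type.
    headers = {"content-type": b"application/cloudevents+json; charset=UTF-8"}
    # Attributes and data are rendered together into the message value.
    value = json.dumps(dict(event)).encode("utf-8")
    return headers, value
```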

3.3.3. Metadata Headers

Implementations MAY include the same Kafka headers as defined for the binary mode.

3.3.4 Example

This example shows a JSON event format encoded event:

------------------ Message -------------------

Topic Name: mytopic

------------------- key ----------------------

Key: mykey

------------------ headers -------------------

content-type: application/cloudevents+json; charset=UTF-8

------------------- value --------------------

{
    "specversion" : "1.0",
    "type" : "com.example.someevent",
    "source" : "/mycontext/subcontext",
    "id" : "1234-1234-1234",
    "time" : "2018-04-05T03:56:24Z",
    "datacontenttype" : "application/xml",

    ... further attributes omitted ...

    "data" : {
        ... application data encoded in XML ...
    }
}

-----------------------------------------------

4. References

  • Kafka The distributed stream platform
  • Kafka-Message-Format The Kafka format message
  • RFC2046 Multipurpose Internet Mail Extensions (MIME) Part Two: Media Types
  • RFC2119 Key words for use in RFCs to Indicate Requirement Levels
  • RFC3629 UTF-8, a transformation format of ISO 10646
  • RFC7159 The JavaScript Object Notation (JSON) Data Interchange Format