---
title: Choosing a stream processing technology
description:
author: zoinerTejada
ms.date: 02/12/2018
---

# Choosing a stream processing technology in Azure

This article compares technology choices for real-time stream processing in Azure.

Real-time stream processing consumes messages from queue-based or file-based storage, processes the messages, and forwards the result to another message queue, file store, or database. Processing may include querying, filtering, and aggregating messages. Stream processing engines must be able to consume endless streams of data and produce results with minimal latency. For more information, see Real time processing.
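
To make this concrete, the following is a minimal sketch of such a pipeline using Spark Structured Streaming (one of the options compared below). The broker address, topic name, message schema, and threshold are assumptions for illustration, and the Kafka source requires the spark-sql-kafka connector package.

```python
# Minimal Structured Streaming sketch: consume messages, process them,
# and forward the result to another sink. Broker, topic, and schema are
# placeholders; the Kafka source needs the spark-sql-kafka-0-10 package.
from pyspark.sql import SparkSession
from pyspark.sql.functions import col, from_json
from pyspark.sql.types import DoubleType, StringType, StructType, TimestampType

spark = SparkSession.builder.appName("stream-processing-sketch").getOrCreate()

# Assumed shape of the incoming JSON messages.
schema = (StructType()
          .add("deviceId", StringType())
          .add("temperature", DoubleType())
          .add("eventTime", TimestampType()))

# Consume an endless stream of messages (for example, Event Hubs exposed
# through its Kafka-compatible endpoint).
raw = (spark.readStream
       .format("kafka")
       .option("kafka.bootstrap.servers", "<broker>:9093")  # placeholder
       .option("subscribe", "telemetry")                     # placeholder topic
       .load())

# Process: parse the JSON body and filter the events of interest.
events = (raw
          .select(from_json(col("value").cast("string"), schema).alias("e"))
          .select("e.*"))
alerts = events.where(col("temperature") > 30)               # illustrative filter

# Forward the result to another store; the console sink stands in for a
# queue, file store, or database sink, which follow the same pattern.
(alerts.writeStream
 .outputMode("append")
 .format("console")
 .start()
 .awaitTermination())
```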

## What are your options when choosing a technology for real-time processing?

In Azure, all of the following services meet the core requirements for real-time processing:

  • Azure Stream Analytics

  • HDInsight with Spark Streaming

  • Apache Spark in Azure Databricks

  • HDInsight with Storm

  • Azure Functions

  • Azure App Service WebJobs

## Key selection criteria

For real-time processing scenarios, begin choosing the appropriate service for your needs by answering these questions:

  • Do you prefer a declarative or imperative approach to authoring stream processing logic? (A sketch contrasting the two styles follows this list.)

  • Do you need built-in support for temporal processing or windowing?

  • Does your data arrive in formats besides Avro, JSON, or CSV? If yes, consider options that support any format by using custom code.

  • Do you need to scale your processing beyond 1 GB/s? If yes, consider the options that scale with the cluster size.
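
Regarding the first question: a declarative pipeline states what to compute (as in the Structured Streaming sketch above, or a Stream Analytics query), while imperative code spells out how each event is handled. Below is a minimal sketch of the imperative style as an Azure Functions Python Event Hubs trigger; the cardinality setting, payload fields, and threshold are assumptions for illustration.

```python
# Imperative style: each event is handled explicitly in code. Windowing,
# state, and late-arrival handling would have to be written by hand here.
import json
import logging
from typing import List

import azure.functions as func


def main(events: List[func.EventHubEvent]):
    # Assumes an Event Hubs trigger binding configured with cardinality "many".
    for event in events:
        body = json.loads(event.get_body().decode("utf-8"))
        if body.get("temperature", 0) > 30:                   # illustrative threshold
            logging.info("High temperature on device %s: %s",
                         body.get("deviceId"), body["temperature"])
```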

## Capability matrix

The following tables summarize the key differences in capabilities.

### General capabilities

| Capability | Azure Stream Analytics | HDInsight with Spark Streaming | Apache Spark in Azure Databricks | HDInsight with Storm | Azure Functions | Azure App Service WebJobs |
| --- | --- | --- | --- | --- | --- | --- |
| Programmability | Stream Analytics query language, JavaScript | Scala, Python, Java | Scala, Python, Java, R | Java, C# | C#, F#, Node.js | C#, Node.js, PHP, Java, Python |
| Programming paradigm | Declarative | Mixture of declarative and imperative | Mixture of declarative and imperative | Imperative | Imperative | Imperative |
| Pricing model | Streaming units | Per cluster hour | Databricks units | Per cluster hour | Per function execution and resource consumption | Per app service plan hour |

### Integration capabilities

| Capability | Azure Stream Analytics | HDInsight with Spark Streaming | Apache Spark in Azure Databricks | HDInsight with Storm | Azure Functions | Azure App Service WebJobs |
| --- | --- | --- | --- | --- | --- | --- |
| Inputs | Stream Analytics inputs | Event Hubs, IoT Hub, Kafka, HDFS, Storage Blobs, Azure Data Lake Store | Event Hubs, IoT Hub, Kafka, HDFS, Storage Blobs, Azure Data Lake Store | Event Hubs, IoT Hub, Storage Blobs, Azure Data Lake Store | Supported bindings | Service Bus, Storage Queues, Storage Blobs, Event Hubs, WebHooks, Cosmos DB, Files |
| Sinks | Stream Analytics outputs | HDFS, Kafka, Storage Blobs, Azure Data Lake Store, Cosmos DB | HDFS, Kafka, Storage Blobs, Azure Data Lake Store, Cosmos DB | Event Hubs, Service Bus, Kafka | Supported bindings | Service Bus, Storage Queues, Storage Blobs, Event Hubs, WebHooks, Cosmos DB, Files |

### Processing capabilities

| Capability | Azure Stream Analytics | HDInsight with Spark Streaming | Apache Spark in Azure Databricks | HDInsight with Storm | Azure Functions | Azure App Service WebJobs |
| --- | --- | --- | --- | --- | --- | --- |
| Built-in temporal/windowing support | Yes | Yes | Yes | Yes | No | No |
| Input data formats | Avro, JSON or CSV, UTF-8 encoded | Any format using custom code | Any format using custom code | Any format using custom code | Any format using custom code | Any format using custom code |
| Scalability | Query partitions | Bounded by cluster size | Bounded by Databricks cluster scale configuration | Bounded by cluster size | Up to 200 function app instances processing in parallel | Bounded by app service plan capacity |
| Late arrival and out of order event handling support | Yes | Yes | Yes | Yes | No | No |
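
As a rough illustration of the built-in temporal/windowing and late-arrival handling rows above, the following Structured Streaming fragment groups the parsed `events` stream from the earlier sketch into one-minute windows and uses a watermark so that events arriving up to five minutes late, or out of order, are still assigned to the correct window; the column names and durations are illustrative.

```python
# Windowed aggregation with late-arrival handling, continuing from the
# `events` DataFrame in the first sketch (deviceId, temperature, eventTime).
from pyspark.sql.functions import avg, col, window

windowed_averages = (events
                     .withWatermark("eventTime", "5 minutes")        # tolerate 5 minutes of lateness
                     .groupBy(window(col("eventTime"), "1 minute"),  # 1-minute tumbling windows
                              col("deviceId"))
                     .agg(avg("temperature").alias("avgTemperature")))
```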

See also: