samza-dev mailing list archives

From Jagadish Venkatraman <>
Subject Re: Comparison between Samza and Kafka Streams
Date Fri, 24 Nov 2017 16:47:30 GMT
Thanks for the feedback Giridhar!

We'll add a comparison with KStreams there as well.

Roughly, the two are similar. The design of Samza certainly influenced what
went into Kafka Streams. However, here are some key differences:

- Native support for non-Kafka sources and sinks: Samza has native support
for various systems like ElasticSearch, AWS Kinesis, Azure EventHubs, and
HDFS in open source. This has cost benefits if you don't want to maintain
dual copies of the data just to import it into Kafka.

- Async mode: At LinkedIn, we have observed that jobs are bottlenecked by
remote I/O. For this reason, we built native async processing into Samza.
As far as I can remember, Samza is the only stream processor that supports
this feature (as of early 2017).

- Stability at LinkedIn: We run Samza in production at LinkedIn, and it's
battle-tested at scale, powering all of our near-realtime processing use
cases. On YARN, Samza supports durable local state and host-affinity for
instant state recovery. We have made improvements to this by adding
incremental checkpointing.

- Single API and SQL for streaming and batch processing: Samza can run the
same code on both batch and streaming sources. We just added SQL support in
the
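
To illustrate the async-mode point above, here is a minimal sketch of why non-blocking remote calls help when a job is bottlenecked by remote I/O. It does not use Samza's API; it relies only on the JDK, and every name in it (AsyncSketch, remoteLookup, processBlocking, processAsync, the "-enriched" suffix, the delay values) is a hypothetical stand-in chosen for illustration. The remote call is simulated with a timer.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class AsyncSketch {

    // Hypothetical stand-in for a remote lookup: the returned future
    // completes after delayMs without blocking the calling thread.
    static CompletableFuture<String> remoteLookup(
            ScheduledExecutorService timer, String key, long delayMs) {
        CompletableFuture<String> result = new CompletableFuture<>();
        timer.schedule(() -> {
            result.complete(key + "-enriched");
        }, delayMs, TimeUnit.MILLISECONDS);
        return result;
    }

    // Blocking style: wait for each reply before handling the next message,
    // so total latency grows with (number of messages) x (call latency).
    static List<String> processBlocking(
            ScheduledExecutorService timer, List<String> messages, long delayMs) {
        List<String> out = new ArrayList<>();
        for (String message : messages) {
            out.add(remoteLookup(timer, message, delayMs).join());
        }
        return out;
    }

    // Async style: issue every call first, then collect the replies in
    // input order, so the simulated calls overlap instead of queueing.
    static List<String> processAsync(
            ScheduledExecutorService timer, List<String> messages, long delayMs) {
        List<CompletableFuture<String>> inFlight = new ArrayList<>();
        for (String message : messages) {
            inFlight.add(remoteLookup(timer, message, delayMs));
        }
        List<String> out = new ArrayList<>();
        for (CompletableFuture<String> future : inFlight) {
            out.add(future.join());
        }
        return out;
    }

    public static void main(String[] args) {
        ScheduledExecutorService timer = Executors.newSingleThreadScheduledExecutor();
        try {
            List<String> messages = Arrays.asList("m1", "m2", "m3", "m4", "m5");

            long start = System.nanoTime();
            processBlocking(timer, messages, 100);
            long blockingMs = (System.nanoTime() - start) / 1_000_000;

            start = System.nanoTime();
            processAsync(timer, messages, 100);
            long asyncMs = (System.nanoTime() - start) / 1_000_000;

            System.out.println("blocking: " + blockingMs + " ms, async: " + asyncMs + " ms");
        } finally {
            timer.shutdown();
        }
    }
}
```

Both methods return the same results in the same order; the only difference is how many simulated remote calls are in flight at once.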

PS: Some of this discussion is based on Kartik's and Yi's earlier responses in
2016.

Yi's earlier response:

Kartik's earlier response:

On Thu, Nov 23, 2017 at 10:15 PM, Giridhar Addepalli <
> wrote:

> Hi,
> Thank you for providing comparison between Samza and Spark Streaming,
> Mupd8, Storm.
> Looks like there is new player in the field : Kafka Streams (
> It will be good to have a comparison between Samza and Kafka Streams as well.
> From a high level, it looks like "Samza when used as a library" is similar to
> "Kafka Streams".
> Thanks,
> Giridhar.

Jagadish V,
Graduate Student,
Department of Computer Science,
Stanford University
