nifi-users mailing list archives

From Andrew Psaltis
Subject Re: Spark Streaming with Nifi
Date Sun, 04 Jun 2017 14:10:27 GMT
Hi Shashi,
I'm sure there is a way to make this work. However, my first question is
why you would want to. By design, a Spark Streaming application should
always be running and consuming data from some source, hence the notion of
streaming. Tying Spark Streaming to NiFi would ultimately result in a more
tightly coupled and fragile architecture. Perhaps a different way to think
about it would be to set things up like this:

NiFi --> Kafka <-- Spark Streaming

With this you can do what you are doing today -- using NiFi to ingest,
transform, make routing decisions, and feed data into Kafka. In essence you
would be using NiFi to do all the preparation of the data for Spark
Streaming. Kafka would serve as a buffer between NiFi and Spark
Streaming. Finally, Spark Streaming would ingest data from Kafka and do
what it is designed for -- stream processing. Having a decoupled
architecture like this also allows you to manage each tier separately, so
you can tune, scale, develop, and deploy each one independently.
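
To make the decoupling concrete, here is a toy, in-process analogy of the
NiFi --> Kafka <-- Spark Streaming layout, using only the Python standard
library. It is a sketch under loud assumptions, not real NiFi, Kafka, or
Spark code: a queue.Queue stands in for the Kafka topic, one thread stands in
for NiFi, another for the streaming job, and every name in it
(nifi_like_producer, streaming_like_consumer, run_demo, END_OF_STREAM) is
hypothetical. The point it illustrates is that the consumer is started once
and keeps pulling micro-batches, rather than being created per flow file.

```python
import queue
import threading

# Hypothetical end-of-stream marker, present only so the demo terminates.
# A real streaming job would simply keep running.
END_OF_STREAM = object()


def nifi_like_producer(records, topic):
    """Stand-in for NiFi: prepare each record, then publish it to the buffer."""
    for record in records:
        topic.put(record.strip().lower())  # toy "transform" step
    topic.put(END_OF_STREAM)


def streaming_like_consumer(topic, batch_size, results):
    """Stand-in for the streaming job: started once, processes micro-batches."""
    batch = []
    while True:
        item = topic.get()  # blocks until the producer has published something
        if item is END_OF_STREAM:
            break
        batch.append(item)
        if len(batch) == batch_size:
            results.append(batch)
            batch = []
    if batch:  # flush the final, possibly smaller, batch
        results.append(batch)


def run_demo(records, batch_size=2):
    """Wire producer and consumer together through the in-memory buffer."""
    topic = queue.Queue()  # in memory only; Kafka would persist to disk
    results = []
    consumer = threading.Thread(
        target=streaming_like_consumer, args=(topic, batch_size, results)
    )
    producer = threading.Thread(target=nifi_like_producer, args=(records, topic))
    consumer.start()  # the consumer is long-lived and started exactly once
    producer.start()
    producer.join()
    consumer.join()
    return results


if __name__ == "__main__":
    print(run_demo([" A ", "b", "C "], batch_size=2))
```

Neither side calls the other directly; each only knows about the buffer,
which is why the two tiers can be tuned and redeployed separately.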

I know I did not directly answer your question on how to make it work. But
hopefully this helps provide an approach that will be a better long-term
solution. There may be something I am missing in your initial questions.


On Sat, Jun 3, 2017 at 10:43 PM, Shashi Vishwakarma wrote:

> Hi
> I am looking for a way to make use of Spark Streaming in NiFi. I
> have seen a couple of posts where a Site-to-Site TCP connection is used for
> a Spark Streaming application, but I think it would be good if I could
> launch Spark Streaming from a custom NiFi processor.
> PublishKafka will publish messages into Kafka, and then a NiFi Spark
> Streaming processor will read from the Kafka topic.
> I can launch the Spark Streaming application from a custom NiFi processor
> using the Spark Streaming launcher API, but the biggest challenge is that it
> will create a Spark streaming context for each flow file, which can be a
> costly operation.
> Does anyone suggest storing the Spark streaming context in a controller
> service? Or is there a better approach for running a Spark Streaming
> application with NiFi?
> Thanks and Regards,
> Shashi


Subscribe to my book: Streaming Data
twitter: @itmdata
