spark-user mailing list archives

From Yogesh Mahajan <ymaha...@snappydata.io>
Subject Re: Max number of streams supported ?
Date Thu, 01 Feb 2018 00:46:00 GMT
Thanks Michael and TD for the quick replies; they were helpful. I will let you
know the numbers (limits) I find in my experiments.

On Wed, Jan 31, 2018 at 3:10 PM, Tathagata Das <tathagata.das1565@gmail.com>
wrote:

> Just to clarify a subtle difference between DStreams and Structured
> Streaming. Multiple input streams in a DStreamGraph are likely to mean they
> are all being processed/computed in the same way, as there can be only one
> streaming query / context active in the StreamingContext. However, in the
> case of Structured Streaming, there can be any number of independent
> streaming queries (i.e. different computations), and each streaming query
> can have any number of separate input sources. So Michael's comment of "each
> stream will have a thread on the driver" is correct when there are many
> independent queries with different computations simultaneously running.
> However, if all your streams need to be processed in the same way, then it's
> one streaming query with many inputs, and it will require one thread.
>
> Hope this helps.
>
> TD
>
> On Wed, Jan 31, 2018 at 12:39 PM, Michael Armbrust
> <michael@databricks.com> wrote:
>
>> -dev +user
>>
>>
>>> Similarly for Structured Streaming, would there be any limit on the
>>> number of streaming sources I can have?
>>>
>>
>> There is no fundamental limit, but each stream will have a thread on the
>> driver that is doing coordination of execution.  We comfortably run 20+
>> streams on a single cluster in production, but I have not pushed the
>> limits.  You'd want to test with your specific application.
>>
>
>
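
The thread accounting described in the quoted replies can be illustrated
with a toy model. This is only a sketch in plain Python using the standard
threading module; it does not use Spark and is not how Spark is implemented.
The names run_queries, coordinate, "topic-N" and "query-N" are hypothetical
and exist only for this illustration. The one assumption it encodes comes
from the thread itself: each independent streaming query gets one
coordinating thread on the driver, regardless of how many input sources
that query reads.

```python
import threading


def run_queries(queries):
    """Toy model: start one coordinating thread per query.

    `queries` maps a (hypothetical) query name to its list of input sources.
    Returns one record per thread: (thread name, query name, source count).
    """
    used = []
    lock = threading.Lock()

    def coordinate(name, sources):
        # A single thread coordinates every source belonging to this query.
        with lock:
            used.append((threading.current_thread().name, name, len(sources)))

    threads = [
        threading.Thread(target=coordinate, args=(name, srcs), name=f"driver-{name}")
        for name, srcs in queries.items()
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return used


sources = [f"topic-{i}" for i in range(5)]

# Case 1: five independent queries (different computations), one source each.
independent = {f"query-{i}": [src] for i, src in enumerate(sources)}

# Case 2: one query that processes all five sources in the same way.
combined = {"query-all": sources}

independent_threads = run_queries(independent)
combined_threads = run_queries(combined)

print(len(independent_threads), "threads for independent queries")
print(len(combined_threads), "thread for the combined query")
```

In this model the first case uses five threads and the second uses one,
which matches the distinction TD draws. The actual ceiling on concurrent
queries still depends on the application and has to be measured.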
