spark-dev mailing list archives

From Yogesh Mahajan <mahajan.yog...@gmail.com>
Subject Re: CQs on WindowedStream created on running StreamingContext
Date Tue, 06 Oct 2015 16:59:40 GMT
Does anyone know about this? TD?

-yogesh

> On 30-Sep-2015, at 1:25 pm, Yogs <mahajan.yogesh@gmail.com> wrote:
> 
> Hi, 
> 
> We intend to run ad hoc windowed continuous queries on Spark Streaming data. The queries
> could be registered/deregistered dynamically or submitted through the command line. Currently
> Spark Streaming doesn’t allow adding any new inputs, transformations, or output operations
> after starting a StreamingContext. But making the following code changes in DStream.scala allows
> me to create a window on a DStream even after the StreamingContext has started (in StreamingContextState.ACTIVE).
> 
> 1) In DStream.validateAtInit()
> Allowed adding new inputs, transformations, and output operations after starting a streaming
> context
> 2) In DStream.persist()
> Allowed changing the storage level of a DStream after the streaming context has started
> 
> Ultimately the window API just does a slice on the parentRDD and returns allRDDsInWindow.
> We create DataFrames out of these RDDs from this particular WindowedDStream, and evaluate
> queries on those DataFrames.
> 
> 1) Do you see any challenges or consequences with this approach?
> 2) Will these WindowedDStreams created on the fly be accounted for properly in runtime and
> memory management?
> 3) What is the reason for not allowing new windows to be created in the StreamingContextState.ACTIVE
> state?
> 4) Does it make sense to add our own implementation of WindowedDStream in this case?
> 
> - Yogesh 
> 
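
To make the slicing idea in the quoted message concrete, here is a toy model in plain Python. It does not use Spark at all: ToyStream, window_at and drop_before are hypothetical names invented for illustration. The window bounds (valid time minus window length plus one slide interval, up to the valid time) are an assumption about how a windowed stream picks its batches. Treat it as a sketch of the concept, not as the WindowedDStream implementation.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ToyStream:
    """Toy stand-in for a DStream: one list of records per batch time (ms)."""
    slide_ms: int
    batches: Dict[int, List[str]] = field(default_factory=dict)

    def add_batch(self, time_ms: int, records: List[str]) -> None:
        self.batches[time_ms] = list(records)

    def slice(self, from_ms: int, to_ms: int) -> List[List[str]]:
        """Return retained batches with from_ms <= time <= to_ms, oldest first."""
        return [self.batches[t] for t in sorted(self.batches)
                if from_ms <= t <= to_ms]

    def drop_before(self, time_ms: int) -> None:
        """Forget batches older than time_ms (models retention cleanup)."""
        self.batches = {t: b for t, b in self.batches.items() if t >= time_ms}


def window_at(stream: ToyStream, valid_time_ms: int, window_ms: int) -> List[List[str]]:
    """All batches in the window ending at valid_time_ms (assumed bounds)."""
    start = valid_time_ms - window_ms + stream.slide_ms
    return stream.slice(start, valid_time_ms)


# Five one-second batches, then a three-second window created "late".
stream = ToyStream(slide_ms=1000)
for t in range(1000, 6000, 1000):
    stream.add_batch(t, [f"event@{t}"])

window = window_at(stream, valid_time_ms=5000, window_ms=3000)
records = [r for batch in window for r in batch]  # flatten, e.g. before building a table

# If older batches were already cleaned up, the same window sees less data.
stream.drop_before(4000)
late_window = window_at(stream, valid_time_ms=5000, window_ms=3000)
```

In this toy, a window can only return batches the parent still holds. Whether Spark keeps the parent's RDDs long enough for a window added after the context has started is the concern behind question 2 above; the sketch only shows why it matters and makes no claim about Spark's actual retention behaviour.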
