spark-user mailing list archives

From Jörn Franke <jornfra...@gmail.com>
Subject Re: Multiple queries on same stream
Date Wed, 09 Aug 2017 06:12:05 GMT
This is hard to say without testing. It depends on the type of computation, among other
things, and also on the Spark version. In general, option 2 could be much faster if Spark /
the JVM applies vectorization / SIMD to it.
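
For reference, the two options from the question could look roughly like the sketch below.
It is only an illustration under assumptions: the rule names, the thresholds, and the use of
the built-in "rate" test source (columns `timestamp` and `value`) are made up here, and the
Python `foreach` sink needs Spark 2.4 or later (on older versions option 1 would be a
Scala/Java `ForeachWriter`). It is not a benchmark and says nothing about which option wins.

```python
# Hypothetical rules over the "rate" source; names and thresholds are
# illustrative assumptions, not taken from the original question.
RULES = {
    "even_value": lambda row: row["value"] % 2 == 0,
    "large_value": lambda row: row["value"] > 100,
}

# The same hypothetical rules written as Spark SQL predicates.
RULES_SQL = {
    "even_value": "value % 2 = 0",
    "large_value": "value > 100",
}


def matched_rules(row):
    """Core of option 1: names of all rules that a single row satisfies."""
    return [name for name, rule in RULES.items() if rule(row)]


def start_option_1(stream_df):
    """Option 1: a single query; every rule is evaluated per row in foreach."""
    def handle(row):
        hits = matched_rules(row)
        if hits:
            print(row["value"], hits)  # placeholder for a real sink

    return [stream_df.writeStream.foreach(handle).start()]


def start_option_2(stream_df):
    """Option 2: one streaming query per rule, each a SQL filter."""
    return [
        stream_df.filter(predicate)
        .writeStream.queryName(name)
        .format("console")
        .start()
        for name, predicate in RULES_SQL.items()
    ]


def main():
    # Imported here so the rule logic above can be tried without Spark.
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("rules-sketch").getOrCreate()
    stream = spark.readStream.format("rate").option("rowsPerSecond", 5).load()
    start_option_2(stream)  # or start_option_1(stream)
    spark.streams.awaitAnyTermination()
```

Calling `main()` starts the queries. In option 1 the rule logic is opaque Python code to
Spark, whereas in option 2 each predicate is visible to Catalyst, but every query is planned
and run separately. Measuring both on the real workload is the only reliable way to compare.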

> On 9. Aug 2017, at 07:05, Raghavendra Pandey <raghavendra.pandey@gmail.com> wrote:
> 
> I am using structured streaming to evaluate multiple rules on the same running stream.
> I have two options to do that. One is to use foreach and evaluate all the rules on each row.
> The other option is to express the rules in the Spark SQL DSL and run multiple queries.
> I was wondering whether option 1 will result in better performance, even though I get
> Catalyst optimization in option 2.
> 
> Thanks 
> Raghav 
