spark-dev mailing list archives

From Liang-Chi Hsieh <vii...@gmail.com>
Subject Re: Speeding up Catalyst engine
Date Mon, 24 Jul 2017 23:39:09 GMT

Hi Maciej,

For backporting https://issues.apache.org/jira/browse/SPARK-20392, please see
the committers' suggestions on the PR. I don't think we should expect it to
be merged into 2.2.



Maciej Bryński wrote
> Hi Everyone,
> I'm trying to speed up my Spark streaming application and I have the
> following problem.
> I'm using a lot of joins in my app, and full Catalyst analysis is triggered
> during every join.
> 
> I found 2 options to speed it up.
> 
> 1) spark.sql.selfJoinAutoResolveAmbiguity option
> But looking at the code:
> https://github.com/apache/spark/blob/8cd9cdf17a7a4ad6f2eecd7c4b388ca363c20982/sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala#L918
> 
> Shouldn't lines 925-927 be before 920-922?
> 
> 2) https://issues.apache.org/jira/browse/SPARK-20392
> 
> Is it safe to use it on top of 2.2.0?
> 
> Regards,
> -- 
> Maciek Bryński
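
On option 1 in the quoted message: a minimal sketch of how the flag could be
turned off for one session, in case it helps when measuring. The object name,
app name and local master are placeholders of mine, not from the thread. Only
the configuration key comes from Maciej's message. Whether this actually
avoids the analysis cost depends on the ordering he asks about in
Dataset.scala, so please treat it as something to benchmark rather than a
confirmed fix.

```scala
import org.apache.spark.sql.SparkSession

// Hypothetical driver object, for illustration only.
object SelfJoinFlagSketch {
  def main(args: Array[String]): Unit = {
    // Placeholder app name and local master; a real job would take these
    // from spark-submit.
    val spark = SparkSession.builder()
      .appName("self-join-flag-sketch")
      .master("local[*]")
      .getOrCreate()

    // Option 1 from the quoted message: disable automatic resolution of
    // ambiguous join conditions in self-joins for this session.
    spark.conf.set("spark.sql.selfJoinAutoResolveAmbiguity", "false")

    // Read the value back to confirm the session picked it up.
    println(spark.conf.get("spark.sql.selfJoinAutoResolveAmbiguity"))

    spark.stop()
  }
}
```

With the flag off, self-joins that relied on the automatic disambiguation may
need explicit aliases on both sides of the join.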





-----
Liang-Chi Hsieh | @viirya 
Spark Technology Center 
http://www.spark.tc/ 