spark-user mailing list archives

From "Cheng, Hao" <hao.ch...@intel.com>
Subject RE: Performance tuning in Spark SQL.
Date Mon, 02 Mar 2015 13:21:35 GMT
This is actually quite an open question. From my understanding, there are several ways to tune performance, for example:



*        SQL Configurations like:

         Configuration Key                       Default Value
         spark.sql.autoBroadcastJoinThreshold    10 * 1024 * 1024
         spark.sql.defaultSizeInBytes            10 * 1024 * 1024 + 1
         spark.sql.planner.externalSort          false
         spark.sql.shuffle.partitions            200
         spark.sql.codegen                       false


*        Spark cluster / application configuration (executor memory, GC settings, number of cores, etc.)

*        Try using cached tables or Parquet files as the storage format.

*        "EXPLAIN [EXTENDED] query" is your best friend for tuning the SQL itself.

*        ...
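As a starting point, the configurations in the table above can be set per session. Below is a minimal sketch against the SQLContext API of that era; the application name and the chosen values are illustrative assumptions, not recommendations:

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SQLContext

val sc = new SparkContext(new SparkConf().setAppName("sql-tuning-demo"))
val sqlContext = new SQLContext(sc)

// Raise the broadcast-join threshold to 50 MB so that larger
// dimension tables still qualify for map-side (broadcast) joins.
sqlContext.setConf("spark.sql.autoBroadcastJoinThreshold",
  (50 * 1024 * 1024).toString)

// Lower the shuffle partition count for small data sets; the
// default of 200 can produce many tiny tasks.
sqlContext.setConf("spark.sql.shuffle.partitions", "32")
```

The right values depend on data volume and cluster size, which is why a concrete use case matters.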
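The caching and EXPLAIN suggestions can be sketched as follows, assuming an existing `sqlContext` and a registered temporary table named `events` (both hypothetical names for illustration):

```scala
// Pin the table in memory in columnar format so that repeated
// queries skip re-scanning the source data.
sqlContext.cacheTable("events")

// Inspect the logical and physical plans the optimizer produced.
sqlContext.sql("EXPLAIN EXTENDED SELECT status, COUNT(*) FROM events GROUP BY status")
  .collect()
  .foreach(println)

// Release the memory when done.
sqlContext.uncacheTable("events")
```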



Also, a real use-case scenario would probably be more helpful in answering your question.



-----Original Message-----
From: dubey_a [mailto:Abhishek.Dubey@xoriant.com]
Sent: Monday, March 2, 2015 6:02 PM
To: user@spark.apache.org
Subject: Performance tuning in Spark SQL.



What are the ways to tune query performance in Spark SQL?







--

View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Performance-tuning-in-Spark-SQL-tp21871.html

Sent from the Apache Spark User List mailing list archive at Nabble.com.



---------------------------------------------------------------------

To unsubscribe, e-mail: user-unsubscribe@spark.apache.org
For additional commands, e-mail: user-help@spark.apache.org


