A basic change needed by Spark is setting the executor memory, which defaults to 512MB.
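For example, this can be raised either in conf/spark-defaults.conf or when launching the Thrift Server (a rough sketch; the 4g figure below is only an illustration, size it for your data and cluster):

    # conf/spark-defaults.conf
    spark.executor.memory   4g

or equivalently:

    ./sbin/start-thriftserver.sh --executor-memory 4g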

On Mon, Mar 23, 2015 at 10:16 AM, Denny Lee <denny.g.lee@gmail.com> wrote:
Out of curiosity, how are you running your Spark instance? Via YARN or in standalone mode? When connecting the Spark Thrift Server to the Spark service, have you allocated enough memory and CPU when executing with Spark?
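For instance, on YARN the allocation can be made explicit when launching the Thrift Server; the executor count, cores, and memory below are placeholders to adapt to your cluster:

    ./sbin/start-thriftserver.sh \
      --master yarn \
      --num-executors 4 \
      --executor-cores 2 \
      --executor-memory 4g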

On Sun, Mar 22, 2015 at 3:39 AM fanooos <dev.fanooos@gmail.com> wrote:
We have Cloudera CDH 5.3 installed on one machine.

We are trying to use the Spark SQL Thrift Server to execute some analysis
queries against a Hive table.

Without any changes to the configuration, we ran the following query on
both Hive and the Spark SQL Thrift Server:

*select * from tableName;*
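(A query like this can be submitted to either server through beeline, e.g.:

    beeline -u jdbc:hive2://<host>:10000 -e "select * from tableName;"

where the host and port are placeholders and depend on which server is targeted.)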

The time taken by Spark is larger than the time taken by Hive, which is not
supposed to be the case.

The Hive table is mapped to JSON files stored in an HDFS directory, and we are
using *org.openx.data.jsonserde.JsonSerDe* for
serialization/deserialization.
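(The table definition looks roughly like the following sketch; the table name,
columns, and HDFS path are illustrative placeholders, and only the SerDe class
is the setting we actually use:

    CREATE EXTERNAL TABLE tableName (
      id STRING,
      payload STRING
    )
    ROW FORMAT SERDE 'org.openx.data.jsonserde.JsonSerDe'
    LOCATION '/hdfs/path/to/json/files';
)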

Why does Spark take much more time to execute the query than Hive?








--

Sigmoid Analytics

Arush Kharbanda || Technical Teamlead

arush@sigmoidanalytics.com || www.sigmoidanalytics.com