spark-user mailing list archives

From Denny Lee <denny.g....@gmail.com>
Subject Re: Spark sql thrift server slower than hive
Date Mon, 23 Mar 2015 04:46:18 GMT
Out of curiosity, how are you running your Spark instance: via YARN or in
standalone mode?  When connecting the Spark Thrift server to the Spark
service, have you allocated enough memory and CPU to Spark?
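
To make the resource question concrete, here is a minimal sketch in Python
that assembles a launch command for the Thrift server with explicit memory
and core settings. The options shown are ordinary spark-submit options that
start-thriftserver.sh passes through. The sizes, the master URLs, and the
helper name thriftserver_command are assumptions for illustration only, not
recommendations for your machine.

```python
def thriftserver_command(master, driver_memory, executor_memory, cores,
                         num_executors=None):
    """Build the argument list for launching the Spark SQL Thrift server.

    Hypothetical helper: all sizes passed in are placeholders to be tuned
    for the actual machine.
    """
    cmd = [
        "sbin/start-thriftserver.sh",
        "--master", master,
        "--driver-memory", driver_memory,
        "--executor-memory", executor_memory,
    ]
    if master.startswith("yarn"):
        # On YARN, cores are set per executor and the executor count is explicit.
        cmd += ["--executor-cores", str(cores)]
        if num_executors is not None:
            cmd += ["--num-executors", str(num_executors)]
    else:
        # In standalone mode, cap the total cores the application may take.
        cmd += ["--total-executor-cores", str(cores)]
    return cmd


# Example values only (assumed, not measured on any real cluster).
yarn_cmd = thriftserver_command("yarn-client", "2g", "4g", cores=2,
                                num_executors=2)
standalone_cmd = thriftserver_command("spark://localhost:7077", "2g", "4g",
                                      cores=4)
print(" ".join(yarn_cmd))
print(" ".join(standalone_cmd))
```

If the server was started without any of these options, it is likely running
with the defaults, which are small, so that is worth checking first.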

On Sun, Mar 22, 2015 at 3:39 AM fanooos <dev.fanooos@gmail.com> wrote:

> We have Cloudera CDH 5.3 installed on one machine.
>
> We are trying to use the Spark SQL Thrift server to run some analysis
> queries against a Hive table.
>
> Without any changes to the configuration, we ran the following query on
> both Hive and the Spark SQL Thrift server:
>
> *select * from tableName;*
>
> The query takes longer on Spark than on Hive, which is not what we
> expected.
>
> The Hive table is mapped to JSON files stored in an HDFS directory, and we
> are using *org.openx.data.jsonserde.JsonSerDe* for
> serialization/deserialization.
>
> Why does Spark take much more time to execute the query than Hive?
