spark-user mailing list archives

From: Sachin Singh <sachin.sha...@gmail.com>
Subject: Re: Spark performance in cluster mode using yarn
Date: Fri, 15 May 2015 06:14:34 GMT

Hi Ayan,
I am asking the experts about the general scenario for the given
information and configuration, not about a specific case.
The Java code does nothing more than get a HiveContext and run a select query.
There is no custom serialization or anything else complex; I kept it
straightforward, about 10 lines of code.
Group, please suggest any ideas you have.

Regards
Sachin

On Fri, May 15, 2015 at 6:57 AM, ayan guha <guha.ayan@gmail.com> wrote:

> With this information it is hard to predict. What performance are you
> getting? What is your desired performance? Maybe you can post your code
> so that experts can suggest improvements?
> On 14 May 2015 15:02, "sachin Singh" <sachin.shashi@gmail.com> wrote:
>
>> Hi Friends,
>> Can someone give me an idea of what the total execution time for a
>> Spark job like this should ideally be?
>>
>> I have data in a Hive table: about 1 GB, 2 lakh (200,000) rows for the
>> whole month.
>> I want to do a monthly aggregation using SQL queries with GROUP BY.
>>
>> I have a single cluster with only one node, and I run the job with the
>> configuration below:
>> --num-executors 2 --driver-memory 3g --driver-java-options
>> "-XX:MaxPermSize=1G" --executor-memory 2g --executor-cores 2
>>
>> Approximately how much time should the job need to finish?
>>
>> Or can someone suggest the best way to get results quickly?
>>
>> Thanks in advance,
>>
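For reference, the resources requested by the flags in the original post can be added up with a few lines of Python. This is only a rough sketch: the flag values, the 1 GB data size and the 200,000 row count come from the post, while the names (`summarize`, the constants) are made up for illustration. It counts JVM heap only and leaves out any extra per-container memory overhead that YARN adds, since that depends on the Spark version and settings. It also assumes 1 GB means 1024^3 bytes.

```python
# Values taken from the spark-submit flags quoted in the original post.
NUM_EXECUTORS = 2        # --num-executors 2
EXECUTOR_CORES = 2       # --executor-cores 2
EXECUTOR_MEMORY_GB = 2   # --executor-memory 2g
DRIVER_MEMORY_GB = 3     # --driver-memory 3g

# Data volume as stated in the post (2 lakh = 200,000 rows).
# Assumption: "1 GB" is treated as 1024**3 bytes.
DATA_BYTES = 1 * 1024**3
ROW_COUNT = 200_000


def summarize():
    """Return the heap and core totals implied by the flags above.

    Heap only: YARN's additional per-container overhead is not included.
    """
    total_cores = NUM_EXECUTORS * EXECUTOR_CORES
    executor_heap_gb = NUM_EXECUTORS * EXECUTOR_MEMORY_GB
    total_heap_gb = executor_heap_gb + DRIVER_MEMORY_GB
    avg_row_bytes = DATA_BYTES / ROW_COUNT
    return {
        "total_cores": total_cores,
        "executor_heap_gb": executor_heap_gb,
        "total_heap_gb": total_heap_gb,
        "avg_row_bytes": avg_row_bytes,
    }


if __name__ == "__main__":
    for name, value in summarize().items():
        print(f"{name}: {value}")
```

With these numbers the job asks for 4 task slots and 7 GB of heap in total. In yarn-cluster mode the driver also runs in a YARN container, so on a single-node cluster all of that heap has to come from the same machine.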
