spark-user mailing list archives

From Akhil Das <ak...@sigmoidanalytics.com>
Subject Re: spark sql performance
Date Fri, 13 Mar 2015 06:56:53 GMT
So you can cache up to 8GB of data in memory (hopefully the data size of one
table is < 2GB); then it should be pretty fast with Spark SQL. Also, I'm
assuming you have around 12-16 cores total.
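To illustrate the advice above, here is a minimal sketch of caching both tables before running the join, using the Spark 1.3-era SQLContext API. The table names, the `Customer` case class, and the stubbed RDD data are hypothetical stand-ins for the actual customer tables; in practice the data would be loaded from its real source (Parquet, HBase, etc.).

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SQLContext

// Hypothetical 5-column customer record matching the schema described in the thread.
case class Customer(id: Int, name: String, city: String, age: Int, balance: Double)

object CachedJoinSketch {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("CachedJoinSketch"))
    val sqlContext = new SQLContext(sc)
    import sqlContext.implicits._ // Spark 1.3+; enables rdd.toDF()

    // Stub data for illustration; replace with the real 0.5M-row tables.
    val t1 = sc.parallelize(Seq(Customer(1, "a", "x", 30, 10.0))).toDF()
    val t2 = sc.parallelize(Seq(Customer(1, "b", "y", 40, 20.0))).toDF()
    t1.registerTempTable("customers1")
    t2.registerTempTable("customers2")

    // Cache both tables in memory (columnar, compressed) before joining,
    // so repeated join queries avoid rescanning the source.
    sqlContext.cacheTable("customers1")
    sqlContext.cacheTable("customers2")

    val joined = sqlContext.sql(
      "SELECT a.id, a.name, b.city FROM customers1 a JOIN customers2 b ON a.id = b.id")
    joined.collect().foreach(println)

    sc.stop()
  }
}
```

The first query after `cacheTable` still pays the cost of materializing the cache; subsequent joins read from memory.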

Thanks
Best Regards

On Fri, Mar 13, 2015 at 12:22 PM, Udbhav Agarwal <udbhav.agarwal@syncoms.com
> wrote:

>  Let's say I am using 4 machines with 3 GB RAM each. My data is customer
> records with 5 columns each, in two tables with 0.5 million records per
> table. I want to perform a join query on these two tables.
>
> *Thanks,*
>
> *Udbhav Agarwal*
>
>
>
> *From:* Akhil Das [mailto:akhil@sigmoidanalytics.com]
> *Sent:* 13 March, 2015 12:16 PM
> *To:* Udbhav Agarwal
> *Cc:* user@spark.apache.org
> *Subject:* Re: spark sql performance
>
>
>
> The size/type of your data and your cluster configuration would be fine, I
> think.
>
>
>   Thanks
>
> Best Regards
>
>
>
> On Fri, Mar 13, 2015 at 12:07 PM, Udbhav Agarwal <
> udbhav.agarwal@syncoms.com> wrote:
>
>  Thanks Akhil,
>
> What more info should I give so we can estimate query time in my scenario?
>
>
>
> *Thanks,*
>
> *Udbhav Agarwal*
>
>
>
> *From:* Akhil Das [mailto:akhil@sigmoidanalytics.com]
> *Sent:* 13 March, 2015 12:01 PM
> *To:* Udbhav Agarwal
> *Cc:* user@spark.apache.org
> *Subject:* Re: spark sql performance
>
>
>
> That totally depends on your data size and your cluster setup.
>
>
>   Thanks
>
> Best Regards
>
>
>
> On Thu, Mar 12, 2015 at 7:32 PM, Udbhav Agarwal <
> udbhav.agarwal@syncoms.com> wrote:
>
>  Hi,
>
> What is the query time for a join query on HBase with Spark SQL? Say the
> tables in HBase have 0.5 million records each. I am expecting a query time
> (latency) in milliseconds with Spark SQL. Is this possible?
>
> *Thanks,*
>
> *Udbhav Agarwal*
>
