spark-user mailing list archives

From Matei Zaharia <matei.zaha...@gmail.com>
Subject Re: Benchmark numbers for terabytes of data
Date Wed, 04 Dec 2013 18:53:12 GMT
Yes, check out the Shark paper for example: https://amplab.cs.berkeley.edu/publication/shark-sql-and-rich-analytics-at-scale/

The numbers on that benchmark are for Shark.

Matei

On Dec 3, 2013, at 3:50 PM, Matt Cheah <mcheah@palantir.com> wrote:

> Hi everyone,
> 
> I notice the benchmark page for AMPLab provides some numbers on GBs of data: https://amplab.cs.berkeley.edu/benchmark/
> I was wondering whether similar benchmark numbers exist for even larger data sets, in the terabytes if possible.
> 
> Also, are there any for just raw Spark, i.e. no Shark?
> 
> Thanks,
> 
> -Matt Cheah

