spark-dev mailing list archives

From ozgun <>
Subject Re: Surprising Spark SQL benchmark
Date Mon, 03 Nov 2014 23:01:28 GMT
Hey Patrick,

It's Ozgun from Citus Data. We'd like to make these benchmark results fair,
and have tried different config settings for SparkSQL over the past month.
We picked the best config settings we could find, and also contacted the
Spark users list about running TPC-H numbers.

We also received advice at the Spark Summit '14 to wait until v1.1, and
therefore re-ran our tests on SparkSQL 1.1. On the specific optimizations,
Marco and Samay from our team have much more context, and I'll let them
answer your questions on the different settings we tried.

Our intent is to be fair and not misrepresent SparkSQL's performance. On
that front, we used publicly available documentation and user lists, and
spent about a month trying to get the best Spark performance results. If
there are specific optimizations we should have applied and missed, we'd
love to be involved with the community in re-running the numbers.

Is this email thread the best place to continue the conversation?

