Hi Stephan

Yes, you are correct. It looks like TPCx-HS is an industry-standard benchmark for big data. But how do we get a Flink number on it?
I think it is also difficult to get a Spark performance number based on TPCx-HS.
If you know someone who can provide servers for performance testing, I would like to put in my best efforts.

That link is just for your reference. At least it tells you the exact time they spent running those queries.
BigDataBench is a good guide for big data benchmarks. But how do we run these use cases on both Flink and Spark to get the performance numbers?

Thanks for sharing. If we can do some basic comparisons with Apache Spark, the red line below will go up fast.


Inline image 1

On Mon, Jul 6, 2015 at 11:41 AM, Slim Baltagi <sbaltagi@gmail.com> wrote:

Vasia, thanks for sharing.
1. I would like to add a couple of resources about *BigBench*, the Big Data
benchmark suite that you are referring to:
and also

2. *BigDataBench* is also an open source Big Data Benchmarking suite from
both industry and academia.  As a subset of BigDataBench, BigDataBench-DCA
is China’s first industry-standard big data benchmark suite:
It comes with *real-world data sets* and *many workloads*: TeraSort,
WordCount, PageRank, K-means, NaiveBayes, Aggregation and Read/Write/Scan
and also a *tool* that uses Hadoop, HBase and Mahout.
This might serve as inspiration for building a Big Data benchmarking suite for Flink!


Slim Baltagi
Apache Flink Knowledge Base (now with over 300 categorized web resources!)

View this message in context: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Benchmark-results-between-Flink-and-Spark-tp1940p1963.html
Sent from the Apache Flink User Mailing List archive at Nabble.com.