calcite-dev mailing list archives

From Michael Mior <mm...@uwaterloo.ca>
Subject Re: Benchmarking Calcite - starting the conversation on the targets and design of the benchmark
Date Mon, 05 Feb 2018 14:26:31 GMT
One interesting exercise would also be to pick a popular benchmark (e.g.
TPC-H) and just look at the plan produced by Calcite vs existing RDBMS
optimizers (e.g. Postgres, MySQL). Along with performance analysis of the
various options, it seems there's a paper in there.

--
Michael Mior
mmior@apache.org

2018-02-03 23:21 GMT-05:00 Edmon Begoli <ebegoli@gmail.com>:

> I am planning on opening an issue, and coordinating an initiative to
> develop a Calcite-focused benchmark.
>
> This would lead to an executable, reportable benchmark, and to a
> follow-up publication aimed at another significant computer science
> conference or journal.
>
> Before I submit a JIRA issue, I would like to get your feedback on what
> this benchmark might be, both in terms of what it should benchmark and how
> it should be implemented.
>
> A couple of preliminary thoughts that came out of the conversation with the
> co-authors of our SIGMOD paper are:
>
> * Optimizer runtime for complex queries (we could also compare with the
> runtime of executing the optimized query directly)
>   * Calcite optimized query
>   * Unoptimized query with the optimizer of the backend disabled
>   * Unoptimized query with the optimizer of the backend enabled
> * Overhead of going through Calcite adapters vs. natively accessing the
> target DB
> * Comparison with other federated query processing engines such as Spark
> SQL and PrestoDB
>   * Use TPC-H or TPC-DS for this purpose
>   * Use the Star Schema Benchmark (SSB)
> * Planning and execution time for queries that span multiple
> systems (e.g. Postgres and Cassandra, Postgres and Pig, Pig and Cassandra).
>
>
>
> Follow approaches similar to:
> * https://www.slideshare.net/julianhyde/w-435phyde-3
> * https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.6.2/bk_hive-performance-tuning/content/ch_cost-based-optimizer.html
> * (How much of this is still relevant (Hive 0.14)? Can we use
> queries/benchmarks?)
> https://hortonworks.com/blog/hive-0-14-cost-based-optimizer-cbo-technical-overview/
>
>
> Please share your suggestions.
>
