calcite-dev mailing list archives

From Jacky Li <>
Subject Re: how about integrate spark dataset/dataframe api
Date Mon, 05 Sep 2016 13:39:54 GMT
Hi Julian,

The benefit I see in translating Calcite SQL into the Spark DataFrame/Dataset API is that
it provides a unified SQL layer for computation frameworks such as Spark and Flink: users
write their SQL statements once, and those statements can be executed by any computation
framework whose API can be translated from Calcite’s logical plan.

Another potential benefit is that it can enable optimizations that require manipulation
across tables/views. In the CarbonData community we are currently evaluating this approach for our roadmap.

I would like to understand better what needs to be done to translate a Calcite logical plan to the
DataFrame/Dataset API. If I understand the Flink Table API correctly, what we need is something similar
to the package org.apache.flink.api.table.plan.nodes.dataset (extensions of RelNode and the corresponding
translation). Am I correct?
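
To make the question concrete, here is a minimal sketch of the kind of translation I have in mind. It is not Calcite, Flink, or Spark code: the Scan/Filter/Project classes are hypothetical stand-ins for RelNode extensions, Frame is a hypothetical stand-in for a DataFrame, and translate() is an assumed name for the step that maps each plan node onto one call of the fluent API.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Row = Dict[str, object]

# Toy logical plan nodes (hypothetical stand-ins for RelNode extensions).
@dataclass
class Scan:
    table: str

@dataclass
class Filter:
    input: object
    predicate: Callable[[Row], bool]

@dataclass
class Project:
    input: object
    columns: List[str]

# Toy fluent API (hypothetical stand-in for a DataFrame).
class Frame:
    def __init__(self, rows):
        self.rows = list(rows)

    def filter(self, predicate):
        return Frame(r for r in self.rows if predicate(r))

    def select(self, *columns):
        return Frame({c: r[c] for c in columns} for r in self.rows)

    def collect(self):
        return self.rows

def translate(node, catalog):
    """Recursively map each plan node onto one call of the fluent API."""
    if isinstance(node, Scan):
        return Frame(catalog[node.table])
    if isinstance(node, Filter):
        return translate(node.input, catalog).filter(node.predicate)
    if isinstance(node, Project):
        return translate(node.input, catalog).select(*node.columns)
    raise TypeError(f"unsupported plan node: {type(node).__name__}")

# Roughly: SELECT name FROM emp WHERE sal > 20
catalog = {"emp": [{"name": "a", "sal": 10}, {"name": "b", "sal": 30}]}
plan = Project(Filter(Scan("emp"), lambda r: r["sal"] > 20), ["name"])
result = translate(plan, catalog).collect()
```

In a real adapter the plan would come out of Calcite's planner and each node would presumably emit the matching DataFrame/Dataset call, but the recursive shape of the translation should be similar.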


> On Sep 5, 2016, at 10:38 AM, Wangfei (X) <> wrote:
> IMO the main benefit is to inherit the optimizations of Spark SQL (such as whole-stage
> codegen, memory management, and maybe vectorized execution in the future ...).
> I am not familiar with Calcite's codegen mechanism; is there any reference for it? I think I will
> first understand how the Spark adapter works now and then see what I can do.
> Fei
>>    *From:* Julian Hyde <>
>>    *Date:* 2016-09-05 05:34
>>    *To:* <>
>>    *Subject:* Re: how about integrate spark dataset/dataframe api
>>    It’s an interesting idea. I know that the data frame API is easier
>>    to work with for application developers, but since Calcite would
>>    be generating the code, can you describe the benefits to the
>>    Calcite user of changing the integration point?
>>    It’s definitely true that Calcite’s Spark adapter needs some love.
>>    If someone would like to rework the adapter in terms of the data
>>    frame API and get it working on more cases, and more reliably, I
>>    would definitely welcome it.
>>    Julian
>>    > On Sep 1, 2016, at 8:35 PM, Wangfei (X) <> wrote:
>>    >
>>    > Hi, community
>>    >      I noticed that the Spark adapter in Calcite is currently
>>    > integrated with the Spark core API. Since the Dataset/DataFrame
>>    > API has now become the top-level API, how about integrating with
>>    > the Dataset/DataFrame API? Is it possible to do that?
>>    >
>>    > Fei.
>>    >
