calcite-dev mailing list archives

From Masayuki Takahashi <masayuki...@gmail.com>
Subject Re: Gandiva
Date Thu, 02 Aug 2018 15:54:55 GMT
Sorry for the very late reply.

I am watching the Gandiva repository on GitHub. "Filter" was implemented
just a few days ago, but "Aggregation" does not seem to be implemented yet.

Once "Aggregation" has been implemented, I want to start implementing
the Arrow adapter using Gandiva.

Currently, I am thinking of implementing the Filter of the Arrow adapter as follows:

1. Extract the field type and comparison operator from the condition
of the Calcite Filter.
2. Using them, the Arrow Filter generates Java code that calls the Gandiva
API to generate LLVM code.
3. From the results of executing that code, pass the Selection Vector
generated on the Gandiva side back to the Java side.
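
As a rough illustration of steps 1 to 3, here is a minimal plain-Java sketch.
It does not call Gandiva or Calcite: SelectionVectorSketch, CompareOp, compile,
and select are hypothetical names, the column is assumed to be a non-null int
column, and a lambda stands in for the LLVM code that Gandiva would generate.

```java
import java.util.Arrays;
import java.util.function.IntPredicate;

/** Hypothetical sketch; none of these names come from Calcite or Gandiva. */
final class SelectionVectorSketch {

    /** Step 1: the comparison operator extracted from the filter condition. */
    enum CompareOp { EQ, NE, LT, LE, GT, GE }

    /**
     * Step 2 (stand-in): build an evaluator for "column <op> literal".
     * In the real adapter this is where generated code would call Gandiva.
     */
    static IntPredicate compile(CompareOp op, int literal) {
        switch (op) {
            case EQ: return v -> v == literal;
            case NE: return v -> v != literal;
            case LT: return v -> v < literal;
            case LE: return v -> v <= literal;
            case GT: return v -> v > literal;
            case GE: return v -> v >= literal;
            default: throw new IllegalArgumentException("unsupported operator: " + op);
        }
    }

    /**
     * Step 3: evaluate the predicate over one column and return a selection
     * vector, i.e. the indices of the rows that satisfy the condition.
     */
    static int[] select(int[] column, IntPredicate predicate) {
        int[] selected = new int[column.length];
        int count = 0;
        for (int row = 0; row < column.length; row++) {
            if (predicate.test(column[row])) {
                selected[count++] = row;
            }
        }
        // Trim the buffer to the number of matching rows.
        return Arrays.copyOf(selected, count);
    }
}
```

For example, select(new int[] {5, 12, 7, 20, 12}, compile(CompareOp.GT, 10))
yields the row indices of the values greater than 10. Downstream operators can
then read only the selected rows instead of copying the filtered data.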

Finally, I want to remove ArrowRexToLixTranslator from the Arrow adapter.

thanks.
On Sun, Jul 1, 2018 at 6:35, Walaa Eldin Moustafa <wa.moustafa@gmail.com> wrote:
>
> Hi Julian and Masayuki,
>
> This indeed sounds quite important. Masayuki, thanks for taking the
> initiative. I would like to do what I can to help. I can help with
> writing some of the operators, UDFs/UDF APIs, and integration with Calcite.
>
> Thanks,
> Walaa.
>
>
> On Fri, Jun 29, 2018 at 11:40 AM Julian Hyde <jhyde@apache.org> wrote:
>
> > We already have two JIRA cases for Arrow integration:
> > https://issues.apache.org/jira/browse/CALCITE-2040 and
> > https://issues.apache.org/jira/browse/CALCITE-2173.
> >
> > I think this is an extremely important area of work for the Calcite
> > project, because it helps us realize the vision of a deconstructed
> > database[1]. There is a lot of work to do, much of it very interesting
> > (e.g. writing a thread scheduler, IPC mechanisms, and algorithms for
> > sort, join and aggregation that work effectively on Arrow data
> > structures).
> >
> > If you want to help Masayuki, please step up!
> >
> > Julian
> >
> > [1]
> > https://www.slideshare.net/julienledem/from-flat-files-to-deconstructed-database
> >
> > On Thu, Jun 28, 2018 at 2:24 PM, Michael Mior <mmior@apache.org> wrote:
> > > That's great! If you could create a JIRA case to track your progress,
> > > that would be helpful for others who might want to follow along or
> > > contribute. Thanks!
> > >
> > > --
> > > Michael Mior
> > > mmior@apache.org
> > >
> > >
> > >
> > > > On Tue, Jun 26, 2018 at 10:36, Masayuki Takahashi <masayuki038@gmail.com> wrote:
> > >
> > >> Hi Julian,
> > >>
> > > >> > Masayuki Takahashi has started to develop an Arrow adapter for
> > > >> > Calcite[2], but a lot of work remains to implement all SQL built-in
> > > >> > functions and basic relational operators. Building on top of Gandiva we
> > > >> > could save a lot of this effort.
> > >>
> > > >> I will start by building a Gandiva development environment and then
> > > >> consider how to incorporate it.
> > >>
> > >> thanks.
> > >>
> > >>
> > >>
> > > >> On Sat, Jun 23, 2018 at 3:54, Julian Hyde <jhyde@apache.org> wrote:
> > >> >
> > > >> > Suppose a company wishes to build a graph database using their own
> > > >> > innovative graph index data structure. They nevertheless need to
> > > >> > implement core relational algebra, core data types, and core built-in
> > > >> > functions (+, CASE, SUM, SUBSTRING). And they want to implement these on a
> > > >> > memory-efficient data structure (tens of thousands of rows, stored
> > > >> > column-oriented, per memory block). This is a massive effort.
> > >> >
> > > >> > With Calcite+Gandiva+Arrow they just need to create a sequence of
> > > >> > relational operators (using RelBuilder, say) and efficient machine code
> > > >> > is generated. They can then start adding their own data types, built-in
> > > >> > functions, and relational operators, using the same architecture.
> > >> >
> > >> > Julian
> > >> >
> > >> >
> > > >> > > On Jun 22, 2018, at 11:33 AM, Xiening Dai <xndai.git@live.com> wrote:
> > >> > >
> > >> > > I was in a talk regarding Gandiva yesterday. Impressive work!
> > >> > >
> > > >> > > But I am not sure why Calcite would like to integrate with it. To me
> > > >> > > Gandiva is on the execution side. In which scenarios would a query
> > > >> > > planner need an Arrow engine? I read the original Jira about
> > > >> > > implementing a file enumerator, but the intent is still not clear to
> > > >> > > me. I would appreciate it if you could elaborate. Thanks.
> > >> > >
> > >> > >
> > > >> > >> On Jun 22, 2018, at 11:20 AM, Julian Hyde <jhyde@apache.org> wrote:
> > >> > >>
> > > >> > >> There is a discussion on dev@arrow about Gandiva, a kernel for
> > > >> > >> Arrow[1].
> > >> > >>
> > > >> > >> I think it would be an interesting library on which to build our
> > > >> > >> Arrow engine. (Without a kernel, Arrow is just a data format, but with
> > > >> > >> Gandiva it becomes an engine upon which we can implement all relational
> > > >> > >> operations, albeit on a multi-threaded single node. Potentially this
> > > >> > >> approach can process each row in a few machine cycles, i.e. billions of
> > > >> > >> records per second. Therefore single-node would be sufficient for many
> > > >> > >> queries.)
> > >> > >>
> > > >> > >> Masayuki Takahashi has started to develop an Arrow adapter for
> > > >> > >> Calcite[2], but a lot of work remains to implement all SQL built-in
> > > >> > >> functions and basic relational operators. Building on top of Gandiva we
> > > >> > >> could save a lot of this effort.
> > >> > >>
> > >> > >> Julian
> > >> > >>
> > >> > >> [1]
> > >>
> > https://lists.apache.org/thread.html/f099b3d1e2aaf9803c5c756f872a594baf17e9f25974e3496c9706d9@%3Cdev.arrow.apache.org%3E
> > >> <
> > >>
> > https://lists.apache.org/thread.html/f099b3d1e2aaf9803c5c756f872a594baf17e9f25974e3496c9706d9@%3Cdev.arrow.apache.org%3E
> > >> >
> > >> > >>
> > >> > >> [2] https://issues.apache.org/jira/browse/CALCITE-2173 <
> > >> https://issues.apache.org/jira/browse/CALCITE-2173>
> > >> > >
> > >> >
> > >>
> > >>
> > >> --
> > >> 高橋 真之
> > >>
> >



-- 
高橋 真之
