drill-user mailing list archives

From Andy Grove <andygrov...@gmail.com>
Subject Re: Looking for advice on integrating with a custom data source
Date Wed, 15 Jan 2020 02:52:42 GMT
I'm now working on predicate push down ... I have a filter rule that is
correctly extracting the predicates that the backend database supports and
I am creating a new GroupScan containing these predicates, using the Kafka
plugin as a reference. I see the GroupScan constructor being called after
this, with the predicates populated. So far so good ... but then I see calls
to getDigest, getScanStats, and getNewWithChildren, and then I see calls to
the GroupScan constructor with the predicates missing.

Any pointers on what I might be missing? Is there more magic I need to know?
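
For readers following the thread, here is a minimal, self-contained sketch of one possible explanation. It is an assumption, not a confirmed diagnosis. If the planner copies the scan (for example through a method like getNewWithChildren), any copy path that does not carry the predicates over will produce instances with the predicates missing. The names PushDownSketch, SketchGroupScan, and copyDroppingPredicates are hypothetical stand-ins and are not Drill's API; only the method names getNewWithChildren and getDigest are borrowed from the message above. Including the predicates in the digest is also an assumption of this sketch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in for a group scan. This is NOT Drill's GroupScan API.
public class PushDownSketch {

    static class SketchGroupScan {
        private final String table;
        private final List<String> predicates;

        SketchGroupScan(String table, List<String> predicates) {
            this.table = table;
            // Defensive copy so later changes to the caller's list do not leak in.
            this.predicates = Collections.unmodifiableList(new ArrayList<>(predicates));
        }

        // Copy constructor: carries every field, including the predicates.
        SketchGroupScan(SketchGroupScan that) {
            this(that.table, that.predicates);
        }

        // Deliberately buggy copy, for comparison: the predicates are lost.
        static SketchGroupScan copyDroppingPredicates(SketchGroupScan that) {
            return new SketchGroupScan(that.table, Collections.<String>emptyList());
        }

        // Copy used when the plan is rebuilt; delegates to the copy constructor.
        SketchGroupScan getNewWithChildren() {
            return new SketchGroupScan(this);
        }

        List<String> getPredicates() {
            return predicates;
        }

        // Assumption: the digest mentions the predicates, so two scans that
        // differ only in their predicates do not look identical.
        String getDigest() {
            return "SketchGroupScan[table=" + table + ", predicates=" + predicates + "]";
        }
    }

    public static void main(String[] args) {
        SketchGroupScan pushed =
            new SketchGroupScan("my_table", Arrays.asList("a > 5", "b = 'x'"));

        // The copy made through the copy constructor keeps the predicates.
        System.out.println(pushed.getNewWithChildren().getDigest());

        // The buggy copy silently loses them.
        System.out.println(SketchGroupScan.copyDroppingPredicates(pushed).getDigest());
    }
}
```

In a real plugin the things to compare against this sketch would be every constructor, copy method, and serialization path of the GroupScan subclass, checking that each one passes the predicates along.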

Thanks!

On Sun, Jan 12, 2020 at 5:34 PM Paul Rogers <par0328@yahoo.com.invalid>
wrote:

> Hi Andy,
>
> Congrats! You are making good progress. Yes, the BatchCreator is a bit of
> magic: Drill looks for a subclass that has your SubScan subclass as the
> second parameter. Looks like you figured that out.
>
> Thanks,
> - Paul
>
>
>
>     On Sunday, January 12, 2020, 1:45:16 PM PST, Andy Grove <andygrove73@gmail.com> wrote:
>
>  Actually I managed to get past that error with an educated guess that if I
> created a BatchCreator class, it would automagically be picked up somehow.
> I'm now at the point where my RecordReader is being invoked!
>
> On Sun, Jan 12, 2020 at 2:03 PM Andy Grove <andygrove73@gmail.com> wrote:
>
> > Between reading the tutorial and copying and pasting code from the Kudu
> > storage plugin, I've been making reasonable progress with this but am
> > confused by one error I'm now hitting.
> > ExecutionSetupException: Failure finding OperatorCreator constructor for
> > config com.mydb.MyDbSubScan
> > Prior to this, Drill had called getSpecificScan and then called a few of
> > the methods on my subscan object. I wasn't sure what to return for
> > getOperatorType so just returned the kudu subscan operator type and I'm
> > wondering if the issue is related to that somehow?
> >
> > Thanks.
> >
> >
> > On Sat, Jan 11, 2020 at 10:13 PM Andy Grove <andygrove73@gmail.com> wrote:
> >
> >> Thank you both for those responses. This is very helpful. I have
> >> ordered a copy of the book too. I'm using Drill 1.17.0.
> >>
> >> I'll take a look at the Jdbc Storage Plugin code and see if it would be
> >> feasible to add the logic I need there. In parallel, I've started
> >> implementing a new storage plugin. I'll be working on this more tomorrow
> >> and I'm sure I'll be back with more questions soon.
> >>
> >> Thanks again for your help!
> >>
> >> Andy.
> >>
> >>
> >> On Sat, Jan 11, 2020 at 6:03 PM Charles Givre <cgivre@gmail.com> wrote:
> >>
> >>> Hi Andy,
> >>> Thanks for your interest in Drill.  I'm glad to see that Paul wrote you
> >>> back as well.  I was going to say I thought the JDBC storage plugin did in
> >>> fact push down columns and filters to the source system.
> >>>
> >>> Also, what version of Drill are you using?
> >>>
> >>> Writing a storage plugin for Drill is not trivial and I'd definitely
> >>> recommend using the code from Paul's PR as that greatly simplifies things.
> >>> Here is a tutorial as well:
> >>> https://github.com/paul-rogers/drill/wiki/Create-a-Storage-Plugin
> >>>
> >>> If you need additional help, please let us know.
> >>> -- C
> >>>
> >>>
> >>> On Jan 11, 2020, at 5:57 PM, Andy Grove <andygrove73@gmail.com> wrote:
> >>>
> >>> Hi,
> >>>
> >>> I'd like to use Apache Drill with a custom data source that supports a
> >>> subset of SQL.
> >>>
> >>> My goal is to have Drill push selection and predicates down to my data
> >>> source but the rest of the query processing should take place in Drill.
> >>>
> >>> I started out by writing a JDBC driver for the data source and registering
> >>> that with Drill using the Jdbc Storage Plugin but it seems to just pass the
> >>> whole query through to my data source, so that approach isn't going to work
> >>> unless I'm missing something?
> >>>
> >>> Is there any way to configure the JDBC storage plugin to only push certain
> >>> parts of the query to the data source?
> >>>
> >>> If this isn't a good approach, do I need to write a custom storage plugin?
> >>> Can these be added on the classpath or would that require me maintaining a
> >>> fork of the project?
> >>>
> >>>
> >>>
> >>> I appreciate any pointers anyone can give me.
> >>>
> >>> Thanks,
> >>>
> >>> Andy.
> >>>
> >>>
> >>>
>
