drill-dev mailing list archives

From Philip Haynes <philip.hay...@virtualnation.com.au>
Subject Re: Drill development
Date Thu, 18 Oct 2012 03:15:13 GMT
Hey Christopher,

I'm not sure what interests you, and how relevant this is to the Drill
project is for others to decide, but here is what I am working on. If you
could help out, or at least keep me honest (my baby twins are a
distraction), I would appreciate it.

I am thinking that creating an LLVM interpreter that the Drill parser could
plug into could be quite a straightforward task (using Clang to help
figure out the assembly). I find Google's new Supersonic very interesting
and am keen to test its performance and, if it proves adequate, to kick
queries off via interpretive requests as above.

Now the tables in Supersonic seem to have Protocol Buffer (PB)-like
structures. Having worked with PB a bit, I think one could inspect a PB
data set to automagically populate Supersonic tables that could then be
queried. Now I need a relevant data set to test this out.
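
To illustrate the idea of populating a columnar table by inspecting a record
schema, here is a minimal sketch. It does not use Protocol Buffers or
Supersonic: a Python dataclass stands in for a PB message type, and a dict of
lists stands in for a table. The record type `PageRevision`, its field names,
and the helpers `schema_of` and `columns_from_records` are all hypothetical.

```python
from dataclasses import dataclass, fields


@dataclass
class PageRevision:
    # Hypothetical record type standing in for a PB message.
    # The field names are invented for illustration only.
    title: str
    revision_id: int
    num_characters: int


def schema_of(record_type):
    """Return (field name, type name) pairs by inspecting the record type.

    This plays the role that reading a message descriptor would play for PB.
    """
    result = []
    for f in fields(record_type):
        type_name = f.type if isinstance(f.type, str) else f.type.__name__
        result.append((f.name, type_name))
    return result


def columns_from_records(record_type, records):
    """Transpose row-oriented records into one list (column) per field."""
    names = [f.name for f in fields(record_type)]
    table = {name: [] for name in names}
    for rec in records:
        for name in names:
            table[name].append(getattr(rec, name))
    return table
```

Given a few `PageRevision` rows, `columns_from_records(PageRevision, rows)`
returns a dict keyed by field name, which could then be scanned column by
column. A real version would walk the PB descriptor instead of a dataclass.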

Google's BigQuery team gave a presentation here last week. They used a
data set from Wikipedia
(https://developers.google.com/bigquery/docs/dataset-wikipedia). I thought
this might be a good test case for Drill, and comparative performance data
would be useful whichever way the project evolves technically.

So to execute the above, my steps are:
A) Create a PB structure and then an injector that reads the Wikipedia data
set, converts it into PB format, and reports other ingestion information
to aid subsequent reads.
B) Write a C++ program that loads the above data set into Supersonic and
tests various Supersonic queries and their performance.
C) Do the same with a Java infrastructure to enable comparative
performance measurements.
D) Write an LLVM interpreter with interpretive script fragments to execute
the above.
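
Step A can be sketched roughly as follows. This is an assumption about one
way to do it, not the approach actually taken: each record is written with a
varint length prefix (the base-128 encoding PB uses for integers), and the
"ingestion information" is taken to be a record count, total size, and a list
of byte offsets so later reads can seek directly to a record. The payloads are
opaque bytes standing in for serialized PB messages; the function names are
hypothetical.

```python
import io


def encode_varint(n):
    """Encode a non-negative int as a base-128 varint."""
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def decode_varint(stream):
    """Read one varint; return None if the stream is already at its end."""
    result = 0
    shift = 0
    while True:
        b = stream.read(1)
        if not b:
            if shift == 0:
                return None
            raise EOFError("truncated varint")
        result |= (b[0] & 0x7F) << shift
        if not b[0] & 0x80:
            return result
        shift += 7


def ingest(payloads, out):
    """Write length-prefixed records and return ingestion statistics."""
    offsets = []
    pos = 0
    for payload in payloads:
        offsets.append(pos)  # byte offset where this record starts
        header = encode_varint(len(payload))
        out.write(header)
        out.write(payload)
        pos += len(header) + len(payload)
    return {"records": len(offsets), "bytes": pos, "offsets": offsets}


def read_all(stream):
    """Yield each payload from the current position to the end of stream."""
    while True:
        size = decode_varint(stream)
        if size is None:
            return
        data = stream.read(size)
        if len(data) != size:
            raise EOFError("truncated record")
        yield data
```

With the offsets returned by `ingest`, a reader can `seek` to any record and
call `read_all` from there, which is the kind of aid to subsequent reads the
step describes.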

If this is of interest, then please let me know.


On 18/10/12 9:33 AM, "Christopher Bartos" <bartosenator@gmail.com> wrote:

>I've been interested in Drill for a while. I was looking at Big Data / Big
>Query for some time for my job and that's when I stumbled upon Drill. Now
>that some development is underway I would love to contribute.
>I work as a programmer and web developer with a CS degree and am in dire
>need of a side project to work on that interests me.
>How does one get started? Since development seems to have just begun,
>there is probably not a whole lot I'd be able to do. But maybe there is.
>Please let me know!
>Christopher Bartos
>Columbus, OH
>(330) 324-0018
