drill-user mailing list archives

From John Omernik <j...@omernik.com>
Subject Re: Batch load of unstructured data in Drill
Date Thu, 08 Dec 2016 17:31:36 GMT
Sure... I believe you could CTAS from your JSON directory into a temporary
Parquet directory and then move the resulting files into the final Parquet
directory.

i.e.

Drill Query: Create table `.mytempparq` as select * from `.mytempjson`
Filesystem command: mv ./mytempparq/* ./myfinalparq

It would be great if Drill could do this for us :)
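
A rough sketch of the move step above, in Python rather than a bare mv. It
assumes (not stated in this thread) that each CTAS run writes files with the
same names, such as 0_0_0.parquet, so a plain mv could overwrite files from an
earlier batch. The function name move_batch and the batch_id prefix are
hypothetical, chosen only for illustration.

```python
import shutil
import uuid
from pathlib import Path


def move_batch(tmp_dir, final_dir, batch_id=None):
    """Move Parquet files from a CTAS output directory into the final directory.

    Each file gets a per-batch prefix so that files from different CTAS runs
    with identical names do not overwrite one another.
    """
    tmp_dir, final_dir = Path(tmp_dir), Path(final_dir)
    final_dir.mkdir(parents=True, exist_ok=True)
    # Hypothetical naming scheme: a random hex id unless the caller supplies one.
    batch_id = batch_id or uuid.uuid4().hex

    moved = []
    for src in sorted(tmp_dir.glob("*.parquet")):
        dst = final_dir / f"{batch_id}_{src.name}"
        if dst.exists():
            # Refuse to clobber data that is already part of the final table.
            raise FileExistsError(dst)
        shutil.move(str(src), str(dst))
        moved.append(dst)
    return moved
```

You would call something like move_batch("./mytempparq", "./myfinalparq")
after each CTAS finishes, then drop or empty the temporary table before the
next batch.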



On Thu, Dec 8, 2016 at 11:13 AM, Alexander Reshetov <
alexander.v.reshetov@gmail.com> wrote:

> By the way, is it possible to append data to a Parquet data source?
> I'm looking for a way to append new rows to existing data, so that
> every query execution sees the new rows.
>
> Surely it's possible with plain JSON, but I want a more efficient binary
> format that gives quicker reads (and faster query execution).
>
> On Wed, Dec 7, 2016 at 4:08 PM, Alexander Reshetov
> <alexander.v.reshetov@gmail.com> wrote:
> > Hello,
> >
> > I want to load batches of unstructured data in Drill. Mostly JSON data.
> >
> > Is there any batch API or other options to do so?
> >
> >
> > Thanks.
>
