drill-user mailing list archives

From Divya Gehlot <divya.htco...@gmail.com>
Subject Re: Which perform better JSON or convert JSON to parquet format ?
Date Tue, 12 Jun 2018 12:25:09 GMT
Hi David,
How do I create the schema first using the Parquet library?
Could you please give an example?

Thanks,
Divya
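
For readers of the archive, here is a minimal sketch of what "define a consistent schema first" could look like. It assumes Python with the pyarrow library (not something named in this thread), and the column names id, name and score are hypothetical. The idea is to force every JSON record into the same set of typed columns, turning missing keys and JSON nulls into explicit nulls, so that no reader has to guess a type from sampled records.

```python
import json

# Hypothetical schema for illustration: column name -> Python type.
SCHEMA = {"id": int, "name": str, "score": float}


def normalize(record):
    """Return a row with exactly the SCHEMA keys.

    Missing optional keys and JSON nulls both become None, and every
    present value is coerced to the declared type.
    """
    row = {}
    for key, typ in SCHEMA.items():
        value = record.get(key)
        row[key] = None if value is None else typ(value)
    return row


def write_parquet(rows, path):
    """Write normalized rows with an explicit Arrow schema.

    Assumes pyarrow is installed; it is imported here so the
    normalization step above can be used without it.
    """
    import pyarrow as pa
    import pyarrow.parquet as pq

    arrow_types = {int: pa.int64(), str: pa.string(), float: pa.float64()}
    schema = pa.schema([pa.field(k, arrow_types[t]) for k, t in SCHEMA.items()])
    table = pa.Table.from_pylist(rows, schema=schema)
    pq.write_table(table, path)


# Two sample JSON lines: one with a null value, one with missing keys.
lines = ['{"id": 1, "name": "a", "score": null}', '{"id": 2}']
rows = [normalize(json.loads(line)) for line in lines]
```

Calling write_parquet(rows, "out.parquet") would then produce a file whose columns all carry declared types, including the ones that are entirely null in a given batch.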

On Tue, 12 Jun 2018 at 00:03, Lee, David <David.Lee@blackrock.com> wrote:

> Parquet is faster, especially if you are only looking for a subset of JSON
> objects. Every JSON key / array is treated as a column.
>
> With that said, creating Parquet from JSON is not bulletproof if you have
> really complex JSON, which may have NULL values or many optional keys (Drill
> can't figure out what data type a NULL JSON value is and has trouble
> merging optional keys after sampling the first 20,000? records).
>
> If you are creating Parquet, you should be using the Parquet libraries to
> define a consistent schema first. I've pretty much given up trying to
> create Parquet from JSON, which always ends in index-out-of-bounds (server
> crashing) errors when trying to query the Parquet.
>
> -----Original Message-----
> From: Ted Dunning [mailto:ted.dunning@gmail.com]
> Sent: Monday, June 11, 2018 4:47 AM
> To: user <user@drill.apache.org>
> Subject: Re: Which perform better JSON or convert JSON to parquet format ?
>
> Yes. Drill is good at JSON.
>
> But Parquet will be faster during a scan.
>
> Faster may be better. Or other things may be more important.
>
> You have to decide what is important to you. The great virtue of Drill is
> that you have the choice.
>
>
>
> On Mon, Jun 11, 2018 at 11:06 AM Divya Gehlot <divya.htconex@gmail.com>
> wrote:
>
> > Thanks to all for your opinions!
> > Since Drill has been popularised as a complex JSON reader compared to
> > other tools in this space, I was wondering whether Drill works better
> > with JSON than with Parquet.
> >
>
>
