drill-user mailing list archives

From Kunal Khatua <kkha...@mapr.com>
Subject Re: Increasing store.parquet.block-size
Date Fri, 09 Jun 2017 17:33:49 GMT

There are some interesting problems when using Parquet files > 2GB on HDFS.

If I'm not mistaken, the HDFS APIs that allow you to read offsets (oddly enough) return an
int value. A large Parquet block size also means the file will end up spanning multiple
HDFS blocks, which makes reading row groups inefficient.

Is there a reason you want to create such a large parquet file?

~ Kunal
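
To make the point about row groups spanning HDFS blocks concrete, here is a minimal sketch of the arithmetic. The class and method names are hypothetical, and the 128 MiB HDFS block size is an assumption (the actual value depends on the cluster's dfs.blocksize setting); this is not code from Drill or Hadoop.

```java
public class RowGroupSpan {
    // Number of HDFS blocks touched by a contiguous byte range
    // [startOffset, startOffset + length). Hypothetical helper for illustration.
    static long blocksSpanned(long startOffset, long length, long hdfsBlockSize) {
        if (length <= 0) {
            return 0;
        }
        long firstBlock = startOffset / hdfsBlockSize;
        long lastBlock = (startOffset + length - 1) / hdfsBlockSize;
        return lastBlock - firstBlock + 1;
    }

    public static void main(String[] args) {
        long hdfsBlockSize = 128L << 20; // assumed 128 MiB HDFS block size
        long rowGroupSize = 4L << 30;    // the 4 GiB row group asked about in this thread

        System.out.println("HDFS blocks spanned by one row group: "
                + blocksSpanned(0, rowGroupSize, hdfsBlockSize));
    }
}
```

Under these assumptions a single row group is spread over many HDFS blocks, so a reader working on that row group may have to fetch most of it from other nodes instead of reading one local block.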

From: Vitalii Diravka <vitalii.diravka@gmail.com>
Sent: Friday, June 9, 2017 4:49:02 AM
To: user@drill.apache.org
Subject: Re: Increasing store.parquet.block-size


DRILL-2478 is a good placeholder for the LongValidator issue; the validator
really does work incorrectly.

But there is another issue: it is not possible to use long values for the
Parquet block size. That can be an independent task or a sub-task of updating
the Drill project to the latest Parquet library.

Kind regards

On Fri, Jun 9, 2017 at 10:25 AM, Khurram Faraaz <kfaraaz@mapr.com> wrote:

>   1.  DRILL-2478<https://issues.apache.org/jira/browse/DRILL-2478> is
> Open for this issue.
>   2.  I have added more details into the comments.
> Thanks,
> Khurram
> ________________________________
> From: Shuporno Choudhury <shuporno.choudhury@manthan.com>
> Sent: Friday, June 9, 2017 12:48:41 PM
> To: user@drill.apache.org
> Subject: Increasing store.parquet.block-size
> The max value that can be assigned to *store.parquet.block-size* is
> *2147483647*, as the value kind of this configuration parameter is LONG.
> This basically translates to 2GB of block size.
> How do I increase it to 3/4/5 GB?
> Trying to set this parameter to a higher value using the following command
> actually succeeds:
>     ALTER SYSTEM SET `store.parquet.block-size` = 4294967296;
> But when I try to run a query that uses this config, it throws the
> following error:
>    Error: SYSTEM ERROR: NumberFormatException: For input string:
> "4294967296"
> So, is it possible to assign a higher value to this parameter?
> --
> Regards,
> Shuporno Choudhury
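
The NumberFormatException quoted above is consistent with the stored value being parsed as a 32-bit int somewhere on the query path. The following is a minimal sketch of that behavior in plain Java, not Drill's actual code; the class and helper names are hypothetical, and where exactly Drill performs the parse is an assumption.

```java
public class BlockSizeParseDemo {
    // Returns true if the text parses as a 32-bit signed int.
    // Hypothetical helper for illustration only.
    static boolean parsesAsInt(String text) {
        try {
            Integer.parseInt(text);
            return true;
        } catch (NumberFormatException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String accepted = "2147483647";  // Integer.MAX_VALUE, 2^31 - 1
        String requested = "4294967296"; // 2^32, the 4 GB value from this thread

        System.out.println(accepted + " parses as int: " + parsesAsInt(accepted));
        System.out.println(requested + " parses as int: " + parsesAsInt(requested));

        // The same text fits comfortably in a 64-bit long.
        System.out.println(requested + " parsed as long: " + Long.parseLong(requested));

        try {
            Integer.parseInt(requested);
        } catch (NumberFormatException e) {
            // Print whatever message the JDK produces for the out-of-range input.
            System.out.println("NumberFormatException: " + e.getMessage());
        }
    }
}
```

If this is indeed what happens, it would explain why ALTER SYSTEM accepts the value (the option is declared as LONG) while the query fails later, when the value is read back as an int.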
