drill-user mailing list archives

From Abhishek Girish <abhishek.gir...@gmail.com>
Subject Re: Regarding drill jdbc with big file
Date Fri, 28 Aug 2015 15:29:14 GMT
This looks similar to DRILL-2882
<https://issues.apache.org/jira/browse/DRILL-2882>.

On Fri, Aug 28, 2015 at 7:56 AM, Andries Engelbrecht <
aengelbrecht@maprtech.com> wrote:

> I also commented on the JIRA.
>
> How much memory is available on the system for Drill?
>
> Also see what happens when you increase the planner query memory on the
> node, since the files are large and will execute in a single thread. Depending
> on the use case, it is normally better to keep JSON files in the 128-256 MB
> range, as that allows for better execution with more threads than a single
> large file.
>
> See what the query memory per node is set to and increase it to see if that
> resolves your problem.
> The parameter is planner.memory.max_query_memory_per_node.
> Query sys.options to see its current value and use ALTER SYSTEM to modify it.
> https://drill.apache.org/docs/configuring-drill-memory/
> https://drill.apache.org/docs/alter-system/
> https://drill.apache.org/docs/configuration-options-introduction/
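
[Editor's note: the check-and-raise steps above can be sketched over JDBC as follows. This is a minimal illustration, assuming the Drill JDBC driver is on the classpath and a Drillbit is reachable at localhost:31010; the connection string and the 4 GB value are examples only, not recommendations.]

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class QueryMemoryCheck {
    public static void main(String[] args) throws Exception {
        // Illustrative connection string; adjust the host/port for your Drillbit.
        try (Connection conn = DriverManager.getConnection(
                 "jdbc:drill:drillbit=localhost:31010");
             Statement stmt = conn.createStatement()) {

            // Query sys.options to see the current setting.
            ResultSet rs = stmt.executeQuery(
                "SELECT name, num_val FROM sys.options " +
                "WHERE name = 'planner.memory.max_query_memory_per_node'");
            while (rs.next()) {
                System.out.println(rs.getString("name") + " = "
                    + rs.getLong("num_val"));
            }

            // Raise it with ALTER SYSTEM (value in bytes; 4 GB is only an example).
            stmt.execute(
                "ALTER SYSTEM SET `planner.memory.max_query_memory_per_node` = 4294967296");
        }
    }
}
```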
>
> —Andries
>
>
> > On Aug 28, 2015, at 7:00 AM, rahul challapalli <
> challapallirahul@gmail.com> wrote:
> >
> > Can you search for the error id in the logs and post the stack trace?
> >
> > It looks like an overflow bug to me.
> >
> > - Rahul
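
[Editor's note: the "overflow bug" reading fits the error value. -2147483648 is Integer.MIN_VALUE, exactly what a signed 32-bit size computation produces when a buffer capacity doubles past 2 GB. A minimal illustration of the arithmetic, not Drill's actual code:]

```java
public class OverflowDemo {
    public static void main(String[] args) {
        // A buffer that doubles its capacity each time it fills up:
        int capacity = 1 << 30;      // 1,073,741,824 slots
        int doubled = capacity * 2;  // wraps around in 32-bit arithmetic
        System.out.println(doubled); // prints -2147483648, i.e. Integer.MIN_VALUE
    }
}
```

A negative capacity then trips a precondition check like the one in the error ("initialCapacity: -2147483648 (expectd: 0+)").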
> > On Aug 28, 2015 6:47 AM, "Kunal Ghosh" <kunal.g@icedq.com> wrote:
> >
> >> Hi,
> >>
> >> I am new to Apache Drill. I have configured Apache Drill on a machine
> >> running CentOS.
> >>
> >> "DRILL_MAX_DIRECT_MEMORY" = 25g
> >> "DRILL_HEAP" = 4g
> >>
> >> I have a 600 MB and a 3 GB JSON file [sample file attached]. When I fire
> >> a query on a relatively small file everything works fine, but when I fire
> >> the same query on the 600 MB and 3 GB files it gives the following error.
> >>
> >> Query -
> >> select tbl5.product_id product_id,tbl5.gender gender,tbl5.item_number
> >> item_number,tbl5.price price,tbl5.description
> >> description,tbl5.color_swatch.image image,tbl5.color_swatch.color color from
> >> (select tbl4.product_id product_id,tbl4.gender gender,tbl4.item_number
> >> item_number,tbl4.price price,tbl4.size.description
> >> description,FLATTEN(tbl4.size.color_swatch) color_swatch from
> >> (select tbl3.product_id product_id,tbl3.catalog_item.gender
> >> gender,tbl3.catalog_item.item_number item_number,tbl3.catalog_item.price
> >> price,FLATTEN(tbl3.catalog_item.size) size from
> >> (select tbl2.product.product_id as
> >> product_id,FLATTEN(tbl2.product.catalog_item) as catalog_item from
> >> (select FLATTEN(tbl1.catalog.product) product from dfs.root.`demo.json`
> >> tbl1) tbl2) tbl3) tbl4) tbl5
> >>
> >>
> --------------------------------------------------------------------------------------------------
> >> Error -
> >>
> >> SYSTEM ERROR: IllegalArgumentException: initialCapacity: -2147483648
> >> (expectd: 0+)
> >>
> >> Fragment 0:0
> >>
> >> [Error Id: 60cf1b95-762d-4a0d-8cae-a2db418d4ea9 on sinhagad:31010]
> >>
> >>
> >>
> --------------------------------------------------------------------------------------------------
> >>
> >> 1) Am I doing something wrong or missing something (perhaps because I am
> >> not running a cluster)?
> >>
> >> Please guide me through this.
> >>
> >> Thanks & Regards
> >>
> >> Kunal Ghosh
> >>
>
>
