drill-user mailing list archives

From Andries Engelbrecht <aengelbre...@maprtech.com>
Subject Re: Regarding drill jdbc with big file
Date Fri, 28 Aug 2015 14:56:28 GMT
I also commented on the JIRA.

How much memory is available on the system for Drill?

Also see what happens when you increase the planner query memory on the node: the files
are large, and each one will execute in a single thread. It is normally better to keep JSON
files in the 128-256 MB range, depending on the use case, since several smaller files allow
execution with more threads than a single large file does.

Check what the query memory per node is set to and increase it to see if that resolves your problem.
The parameter is planner.memory.max_query_memory_per_node.
Query sys.options to see its current value and use ALTER SYSTEM to modify it.
https://drill.apache.org/docs/configuring-drill-memory/
https://drill.apache.org/docs/alter-system/
https://drill.apache.org/docs/configuration-options-introduction/
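
As a rough sketch of the two steps above (the 4 GB figure is only an illustrative value and not a recommendation; choose one that fits the direct memory actually available to Drill):

```sql
-- Show the current setting. SELECT * is used so that no particular
-- column layout of sys.options has to be assumed.
SELECT *
FROM sys.options
WHERE name = 'planner.memory.max_query_memory_per_node';

-- Raise the limit system-wide. The value is given in bytes;
-- 4294967296 bytes = 4 GB (example value only).
ALTER SYSTEM SET `planner.memory.max_query_memory_per_node` = 4294967296;
```

ALTER SESSION SET takes the same form if you would rather try the change on a single connection first.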

—Andries


> On Aug 28, 2015, at 7:00 AM, rahul challapalli <challapallirahul@gmail.com> wrote:
> 
> Can you search for the error id in the logs and post the stack trace?
> 
> It looks like an overflow bug to me.
> 
> - Rahul
> On Aug 28, 2015 6:47 AM, "Kunal Ghosh" <kunal.g@icedq.com> wrote:
> 
>> Hi,
>> 
>> I am new to Apache Drill. I have configured Apache Drill on a machine running
>> CentOS.
>> 
>> "DRILL_MAX_DIRECT_MEMORY" = 25g
>> "DRILL_HEAP" = 4g
>> 
>> I have a 600 MB and a 3 GB JSON file [sample file attached]. When I run the
>> query on a relatively small file everything works fine, but when I run the same
>> query on the 600 MB and 3 GB files it gives the following error.
>> 
>> Query -
>> select tbl5.product_id product_id,tbl5.gender gender,tbl5.item_number
>> item_number,tbl5.price price,tbl5.description
>> description,tbl5.color_swatch.image image,tbl5.color_swatch.color color from
>> (select tbl4.product_id product_id,tbl4.gender gender,tbl4.item_number
>> item_number,tbl4.price price,tbl4.size.description
>> description,FLATTEN(tbl4.size.color_swatch) color_swatch from
>> (select tbl3.product_id product_id,tbl3.catalog_item.gender
>> gender,tbl3.catalog_item.item_number item_number,tbl3.catalog_item.price
>> price,FLATTEN(tbl3.catalog_item.size) size from
>> (select tbl2.product.product_id as
>> product_id,FLATTEN(tbl2.product.catalog_item) as catalog_item from
>> (select FLATTEN(tbl1.catalog.product) product from dfs.root.`demo.json`
>> tbl1) tbl2) tbl3) tbl4) tbl5
>> 
>> --------------------------------------------------------------------------------------------------
>> Error -
>> 
>> SYSTEM ERROR: IllegalArgumentException: initialCapacity: -2147483648
>> (expectd: 0+)
>> 
>> Fragment 0:0
>> 
>> [Error Id: 60cf1b95-762d-4a0d-8cae-a2db418d4ea9 on sinhagad:31010]
>> 
>> 
>> --------------------------------------------------------------------------------------------------
>> 
>> 1) Am I doing something wrong or missing something (probably because I am
>> not using a cluster?).
>> 
>> Please guide me through this.
>> 
>> Thanks & Regards
>> 
>> Kunal Ghosh
>> 

