kudu-user mailing list archives

From Abhi Basu <9000r...@gmail.com>
Subject Re: CDH 5.5 - Kudu error not enough space remaining in buffer for op
Date Wed, 18 May 2016 14:45:55 GMT
Thanks for the updates. I will give both options a try and report back.

If you are interested in testing with such datasets, I can help.

Thanks,

Abhi

On Wed, May 18, 2016 at 6:25 AM, Todd Lipcon <todd@cloudera.com> wrote:

> Hi Abhi,
>
> Will is right that the error is client-side, and it's probably happening
> because your rows are so wide. Impala typically batches 1000 rows at a
> time when inserting into Kudu, so if each of your rows is 7-8KB, that will
> overflow the max buffer size Will mentioned. This seems quite probable
> if your data is 1000 columns of doubles or int64s (which are 8 bytes each).
>
> I don't think his suggested workaround will help, but you can try running
> 'set batch_size=500' before running the create-table or insert query (see
> the sketch below).
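
To make the arithmetic concrete: 1101 columns x 8 bytes is roughly 8.8 KB
per row, so a 1000-row batch is ~8.8 MB and overflows the 7 MiB client-side
buffer, while a 500-row batch is ~4.4 MB and fits. Here is a minimal sketch
in impala-shell, assuming hypothetical table/column names (chr22, sample_id)
and the newer STORED AS KUDU syntax (the CDH 5.5 beta used a
storage-handler TBLPROPERTIES form instead):

    -- BATCH_SIZE is a session-scoped Impala query option; set it before the CTAS.
    -- 500 rows/batch x ~8.8 KB/row ~= 4.4 MB, under the 7 MiB buffer cap.
    SET BATCH_SIZE=500;

    -- Hypothetical: copy the wide Impala/HDFS table chr22 into a Kudu table.
    CREATE TABLE chr22_kudu
    PRIMARY KEY (sample_id)
    PARTITION BY HASH (sample_id) PARTITIONS 4
    STORED AS KUDU
    AS SELECT * FROM chr22;
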
>
> In terms of max supported columns, most of the workloads we are focusing
> on are more like typical data-warehouse tables, on the order of a couple
> hundred columns. Crossing into the 1000+ range enters "uncharted territory"
> where it's much more likely you'll hit problems like this and quite
> possibly others as well. Will be interested to hear your experiences,
> though you should probably be prepared for some rough edges.
>
> -Todd
>
> On Tue, May 17, 2016 at 8:32 PM, William Berkeley <wdberkeley@cloudera.com>
> wrote:
>
>> Hi Abhi.
>>
>> I believe that error is actually coming from the client, not the server.
>> See e.g.
>> https://github.com/apache/incubator-kudu/blob/master/src/kudu/client/batcher.cc#L787
>> (NB: that link is to the master branch, not the exact release you are
>> using).
>>
>> If you look around there, you'll see that the max is set by something
>> called max_buffer_size_, which appears to be hardcoded to 7 * 1024 * 1024
>> bytes = 7 MiB (and this is consistent with 6.96 + 0.046 > 7).
>>
>> I think the simple workaround would be to split the CTAS into a CTAS plus
>> an INSERT ... SELECT. Pick a condition that bipartitions the table, so you
>> don't get errors from trying to insert rows twice (see the sketch below).
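
Here is a sketch of that split, with the same hypothetical names as above
and a hypothetical pos column used to bipartition the rows (any predicate
whose two halves cover the table exactly once will do):

    -- First half: create the Kudu table from one side of the bipartition.
    CREATE TABLE chr22_kudu
    PRIMARY KEY (sample_id)
    PARTITION BY HASH (sample_id) PARTITIONS 4
    STORED AS KUDU
    AS SELECT * FROM chr22 WHERE pos < 25000000;

    -- Second half: insert the complementary rows, so no row is written twice.
    INSERT INTO chr22_kudu
    SELECT * FROM chr22 WHERE pos >= 25000000;

Note Todd's caveat above, though: each statement still batches rows on
insert, so this split alone may not get a single batch under the buffer cap.
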
>>
>> -Will
>>
>> On Tue, May 17, 2016 at 4:45 PM, Abhi Basu <9000revs@gmail.com> wrote:
>>
>>> What is the limit on the number of columns in Kudu?
>>>
>>> I am using the 1000 Genomes dataset, specifically the chr22 table, which
>>> has 500,000 rows x 1101 columns. This table has been built in
>>> Impala/HDFS. I am trying to create a new Kudu table as a select from that
>>> table. I get the following error:
>>>
>>> Error while applying Kudu session.: Incomplete: not enough space
>>> remaining in buffer for op (required 46.7K, 6.96M already used)
>>>
>>> When looking at http://pcsd-cdh2.local.com:8051/mem-trackers, I see the
>>> following. What configuration needs to be tweaked?
>>>
>>>
>>> Memory usage by subsystem
>>> Id                                        Parent                                    Limit    Current consumption  Peak consumption
>>> root                                      none                                      50.12G   4.97M                6.08M
>>> block_cache-sharded_lru_cache             root                                      none     937.9K               937.9K
>>> code_cache-sharded_lru_cache              root                                      none     1B                   1B
>>> server                                    root                                      none     2.3K                 201.4K
>>> tablet-00000000000000000000000000000000   server                                    none     530B                 200.1K
>>> MemRowSet-6                               tablet-00000000000000000000000000000000   none     265B                 265B
>>> txn_tracker                               tablet-00000000000000000000000000000000   64.00M   0B                   28.5K
>>> DeltaMemStores                            tablet-00000000000000000000000000000000   none     265B                 87.8K
>>> log_block_manager                         server                                    none     1.8K                 2.7K
>>>
>>> Thanks,
>>> --
>>> Abhi Basu
>>>
>>
>>
>
>
> --
> Todd Lipcon
> Software Engineer, Cloudera
>



-- 
Abhi Basu
