kudu-user mailing list archives

From Todd Lipcon <t...@cloudera.com>
Subject Re: CDH 5.5 - Kudu error not enough space remaining in buffer for op
Date Wed, 18 May 2016 17:47:37 GMT
What are the types of your 1000 columns? Maybe an even smaller batch size
is necessary.

-Todd

On Wed, May 18, 2016 at 10:41 AM, Abhi Basu <9000revs@gmail.com> wrote:

> I have tried with batch_size=500 and still get the same error. For your
> reference, attached is some info that may help diagnose.
>
> Error: Error while applying Kudu session.: Incomplete: not enough space
> remaining in buffer for op (required 46.7K, 7.00M already used
>
>
> Config settings:
>
> Kudu Tablet Server Block Cache Capacity   1 GB
> Kudu Tablet Server Hard Memory Limit  16 GB
>
>
> On Wed, May 18, 2016 at 8:26 AM, William Berkeley <wdberkeley@cloudera.com
> > wrote:
>
>> Both options are more or less the same idea: the point is that you need
>> fewer rows going in per batch so you don't go over the batch size limit.
>> Follow what Todd said, as he explained it more clearly and suggested a
>> better way.
>>
>> -Will
>>
>> On Wed, May 18, 2016 at 10:45 AM, Abhi Basu <9000revs@gmail.com> wrote:
>>
>>> Thanks for the updates. I will give both options a try and report back.
>>>
>>> If you are interested in testing with such datasets, I can help.
>>>
>>> Thanks,
>>>
>>> Abhi
>>>
>>> On Wed, May 18, 2016 at 6:25 AM, Todd Lipcon <todd@cloudera.com> wrote:
>>>
>>>> Hi Abhi,
>>>>
>>>> Will is right that the error is client-side, and probably happening
>>>> because your rows are so wide. Impala typically will batch 1000 rows at a
>>>> time when inserting into Kudu, so if each of your rows is 7-8KB, that will
>>>> overflow the max buffer size that Will mentioned. This seems quite probable
>>>> if your data is 1000 columns of doubles or int64s (which are 8 bytes each).
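[Editor's note: a back-of-envelope check of Todd's sizing argument, using the column count from the thread and assuming 8-byte values throughout:]

```python
# Why a 1000-row batch of ~1100-column rows overflows the client's
# 7 MiB mutation buffer, assuming every column is an 8-byte value
# (double/int64). Real ops carry extra per-op overhead on top of this.

COLUMNS = 1101          # chr22 table width, from the thread
BYTES_PER_VALUE = 8     # double / int64
ROWS_PER_BATCH = 1000   # Impala's typical insert batch size
BUFFER_LIMIT = 7 * 1024 * 1024  # 7 MiB client buffer

row_bytes = COLUMNS * BYTES_PER_VALUE       # ~8.6 KiB per row
batch_bytes = row_bytes * ROWS_PER_BATCH    # ~8.4 MiB per 1000-row batch

print(f"row ≈ {row_bytes / 1024:.1f} KiB")
print(f"batch ≈ {batch_bytes / (1024 * 1024):.2f} MiB")
print("overflows 7 MiB:", batch_bytes > BUFFER_LIMIT)
```

Note the error message reports ~46.7K required per op, several times this raw-value estimate, which is why Todd asks about the column types: string columns or per-op overhead would inflate each row well beyond 8 bytes per column.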
>>>>
>>>> I don't think his suggested workaround will help, but you can try
>>>> running 'set batch_size=500' before running the create table or insert
>>>> query.
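[Editor's note: in Impala the option is a session-level setting issued before the statement; a sketch, with table names taken from the thread and the Kudu DDL details omitted:]

```sql
-- Lower the rows-per-batch for Kudu writes in this session,
-- then run the CTAS or insert as usual.
SET BATCH_SIZE=500;

INSERT INTO chr22_kudu SELECT * FROM chr22;
```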
>>>>
>>>> In terms of max supported columns, most of the workloads we are
>>>> focusing on are more like typical data-warehouse tables, on the order of a
>>>> couple hundred columns. Crossing into the 1000+ range enters "uncharted
>>>> territory", where it's much more likely you'll hit problems like this and
>>>> quite possibly others as well. I'll be interested to hear your experiences,
>>>> though you should probably be prepared for some rough edges.
>>>>
>>>> -Todd
>>>>
>>>> On Tue, May 17, 2016 at 8:32 PM, William Berkeley <
>>>> wdberkeley@cloudera.com> wrote:
>>>>
>>>>> Hi Abhi.
>>>>>
>>>>> I believe that error is actually coming from the client, not the
>>>>> server. See e.g.
>>>>> https://github.com/apache/incubator-kudu/blob/master/src/kudu/client/batcher.cc#L787
>>>>> (NB: that link is to the master branch, not the exact release you are using).
>>>>>
>>>>> If you look around there, you'll see that the max is set by something
>>>>> called max_buffer_size_, which appears to be hardcoded to 7 * 1024 * 1024
>>>>> bytes = 7MiB (and this is consistent with 6.96 + 0.0467 > 7).
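[Editor's note: a quick check of that arithmetic, plugging in the sizes from the error message:]

```python
# The op in the error doesn't fit in the remaining buffer space:
# 6.96M already used + 46.7K required exceeds the 7 MiB hard cap.
buffer_limit = 7 * 1024 * 1024   # max_buffer_size_, hardcoded
already_used = 6.96 * 1024 * 1024  # "6.96M already used"
required = 46.7 * 1024             # "required 46.7K"

print("op fits:", already_used + required <= buffer_limit)
```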
>>>>>
>>>>> I think the simple workaround would be to split the CTAS into a CTAS +
>>>>> an insert-as-select. Pick a condition that bipartitions the table, so you
>>>>> don't get errors from trying to double-insert rows.
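[Editor's note: concretely, the split might look like the sketch below; the `pos` column and the split value are hypothetical — any predicate that divides the rows into two disjoint halves works:]

```sql
-- Sketch of Will's workaround: build the Kudu table from one half of
-- the source, then insert the other half, so each statement moves
-- fewer rows. The predicate on `pos` is hypothetical.
CREATE TABLE chr22_kudu AS
  SELECT * FROM chr22 WHERE pos < 25000000;

INSERT INTO chr22_kudu
  SELECT * FROM chr22 WHERE pos >= 25000000;
```

(Todd notes below that he doesn't expect this to help, since each statement still batches 1000 rows at a time; lowering batch_size attacks the actual limit.)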
>>>>>
>>>>> -Will
>>>>>
>>>>> On Tue, May 17, 2016 at 4:45 PM, Abhi Basu <9000revs@gmail.com>
wrote:
>>>>>
>>>>>> What is the limit of columns in Kudu?
>>>>>>
>>>>>> I am using the 1000 Genomes dataset, specifically the chr22 table, which
>>>>>> has 500,000 rows x 1101 columns. This table has been built in Impala/HDFS.
>>>>>> I am trying to create a new Kudu table as select from that table. I get
>>>>>> the following error:
>>>>>>
>>>>>> Error while applying Kudu session.: Incomplete: not enough space
>>>>>> remaining in buffer for op (required 46.7K, 6.96M already used
>>>>>>
>>>>>> When looking at http://pcsd-cdh2.local.com:8051/mem-trackers, I see
>>>>>> the following. What configuration needs to be tweaked?
>>>>>>
>>>>>>
>>>>>> Memory usage by subsystem:
>>>>>>
>>>>>> Id                                       Parent                                   Limit   Current  Peak
>>>>>> root                                     none                                     50.12G  4.97M    6.08M
>>>>>> block_cache-sharded_lru_cache            root                                     none    937.9K   937.9K
>>>>>> code_cache-sharded_lru_cache             root                                     none    1B       1B
>>>>>> server                                   root                                     none    2.3K     201.4K
>>>>>> tablet-00000000000000000000000000000000  server                                   none    530B     200.1K
>>>>>> MemRowSet-6                              tablet-00000000000000000000000000000000  none    265B     265B
>>>>>> txn_tracker                              tablet-00000000000000000000000000000000  64.00M  0B       28.5K
>>>>>> DeltaMemStores                           tablet-00000000000000000000000000000000  none    265B     87.8K
>>>>>> log_block_manager                        server                                   none    1.8K     2.7K
>>>>>>
>>>>>> Thanks,
>>>>>> --
>>>>>> Abhi Basu
>>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>> --
>>>> Todd Lipcon
>>>> Software Engineer, Cloudera
>>>>
>>>
>>>
>>>
>>> --
>>> Abhi Basu
>>>
>>
>>
>
>
> --
> Abhi Basu
>



-- 
Todd Lipcon
Software Engineer, Cloudera
