kudu-user mailing list archives

From Todd Lipcon <t...@cloudera.com>
Subject Re: Bad insert performance of java kudu-client
Date Mon, 24 Apr 2017 19:29:58 GMT
I tried to reproduce this locally using your code and couldn't. I get
around 100K inserts/second for 1.0, 1.1, 1.2, and 1.3 clients (against a
1.4-SNAPSHOT cluster).

Is it always reproducible for you? E.g., if you switch back to the earlier
client and try another set of runs, do you get the same results?

-Todd

On Mon, Apr 24, 2017 at 10:56 AM, Todd Lipcon <todd@cloudera.com> wrote:

> I vaguely recall some bug in earlier versions of the Java client where
> 'shutdown' wouldn't properly block on the data being flushed. So it's
> possible in 1.0.x and below, you're not actually measuring the full amount
> of time to write all the data, whereas when the bug is fixed, you are.
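
To illustrate the measurement pitfall described above, here is a minimal
stdlib-only simulation. BufferingWriter, serverReady, and both shutdown
methods are hypothetical stand-ins invented for this sketch; none of it is
the Kudu client API. It only shows that a shutdown which returns without
waiting can leave rows unwritten at the moment a timer would stop.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for a client session that buffers writes.
// ASSUMPTION: this models the behaviour discussed in the thread; it is NOT Kudu code.
final class BufferingWriter {
    private final ExecutorService flusher = Executors.newSingleThreadExecutor();
    private final AtomicInteger flushed = new AtomicInteger();
    private final CountDownLatch serverReady;

    BufferingWriter(CountDownLatch serverReady) {
        this.serverReady = serverReady;
    }

    /** Queues a row; the background thread "writes" it once the server is ready. */
    void apply(int row) {
        flusher.execute(() -> {
            try {
                serverReady.await();
                flushed.incrementAndGet();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
    }

    /** Stops accepting rows but returns immediately, without waiting for the flush. */
    void shutdownWithoutWaiting() {
        flusher.shutdown();
    }

    /** Stops accepting rows and blocks until queued rows are written or the timeout expires. */
    boolean shutdownAndWait(long timeoutMs) {
        flusher.shutdown();
        try {
            return flusher.awaitTermination(timeoutMs, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    int flushedCount() {
        return flushed.get();
    }

    public static void main(String[] args) {
        CountDownLatch serverReady = new CountDownLatch(1);
        BufferingWriter writer = new BufferingWriter(serverReady);
        for (int i = 0; i < 1000; i++) {
            writer.apply(i);
        }
        writer.shutdownWithoutWaiting();
        // A timer stopped here would not include any of the actual writes.
        System.out.println("flushed after non-blocking shutdown: " + writer.flushedCount());
        serverReady.countDown();
        boolean done = writer.shutdownAndWait(5000);
        System.out.println("finished=" + done + ", flushed after blocking shutdown: " + writer.flushedCount());
    }
}
```

A benchmark that stops its timer after the non-blocking call would report
only the cost of queueing the rows, which is the effect suspected above for
the older clients.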
>
> I'll see if I can repro this locally as well using your code.
>
> -Todd
>
> On Mon, Apr 24, 2017 at 10:49 AM, David Alves <davidralves@gmail.com>
> wrote:
>
>> Hi Pavel
>>
>>   Interesting, thanks for sharing those numbers.
>>   I assume you weren't using AUTOFLUSH_BACKGROUND for the first versions
>> you tested (I don't think it was available then, iirc).
>>   Could you try without it in the latest version and see how the numbers
>> compare?
>>   We'd be happy to help track down the reason for this perf regression.
>>
>> Best
>> David
>>
>> On Mon, Apr 24, 2017 at 4:58 AM, Pavel Martynov <mr.xkurt@gmail.com>
>> wrote:
>>
>>> Hi, I could not achieve a high insertion speed, so I started to experiment
>>> with https://github.com/cloudera/kudu-examples/tree/master/java/insert-loadgen.
>>> My slightly modified code (table re-creation on startup plus duration
>>> measurement): https://gist.github.com/xkrt/9405a2eeb98a56288b7c5a7d817097b4.
>>> On every run I changed the kudu-client version. Results:
>>>
>>> kudu-client-ver  perf
>>> 0.10             Duration: 626 ms, 79872 inserts/sec
>>> 1.0.0            Duration: 622 ms, 80385 inserts/sec
>>> 1.0.1            Duration: 630 ms, 79365 inserts/sec
>>> 1.1.0            Duration: 11703 ms, 4272 inserts/sec
>>> 1.3.1            Duration: 12317 ms, 4059 inserts/sec
>>>
>>> As you can see, there was a large degradation between 1.0.1 and 1.1.0
>>> (about 20 times).
>>> What could be the problem, and how can I fix it? (I am actually interested in
>>> kudu-spark, so using kudu-client 1.0.1 is probably not the right solution?)
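
The throughput column in the table above can be cross-checked from the
durations. The sketch below assumes each run inserted 50,000 rows; that
count is not stated in the thread and is only inferred because it reproduces
every reported figure. ThroughputCheck and its method names are made up for
this illustration.

```java
// Sketch: recompute inserts/sec from the reported durations.
// ASSUMPTION: 50,000 rows per run (inferred from the numbers, not stated in the thread).
final class ThroughputCheck {
    static final long ASSUMED_ROWS = 50_000L;

    /** Whole inserts per second for `rows` rows written in `durationMs` milliseconds. */
    static long insertsPerSecond(long rows, long durationMs) {
        return rows * 1000L / durationMs;
    }

    /** How many times longer the slow run took than the fast run. */
    static double slowdown(long slowMs, long fastMs) {
        return (double) slowMs / fastMs;
    }

    public static void main(String[] args) {
        String[] versions = {"0.10", "1.0.0", "1.0.1", "1.1.0", "1.3.1"};
        long[] durationsMs = {626, 622, 630, 11703, 12317};
        for (int i = 0; i < versions.length; i++) {
            System.out.printf("%-6s %6d ms %6d inserts/sec%n",
                    versions[i], durationsMs[i],
                    insertsPerSecond(ASSUMED_ROWS, durationsMs[i]));
        }
        // Ratio of the 1.1.0 duration to the 1.0.1 duration.
        System.out.printf("1.0.1 -> 1.1.0 slowdown: %.1fx%n", slowdown(11703, 630));
    }
}
```

Under that assumption the ratio between the 1.1.0 and 1.0.1 durations comes
out a little under 19, consistent with the "about 20 times" estimate.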
>>>
>>> My test cluster: 3 hosts with master and tserver on each (3 masters and
>>> 3 tservers overall).
>>> No extra settings; the only flags used:
>>> fs_wal_dir
>>> fs_data_dirs
>>> master_addresses
>>> tserver_master_addrs
>>>
>>>
>>> --
>>> with best regards, Pavel Martynov
>>>
>>
>>
>
>
> --
> Todd Lipcon
> Software Engineer, Cloudera
>



-- 
Todd Lipcon
Software Engineer, Cloudera
