hbase-user mailing list archives

From rahul gidwani <rahul.gidw...@gmail.com>
Subject Re: Can't improve low random-read speed
Date Fri, 16 Mar 2018 17:39:22 GMT
Do you have Bloom filters turned on? Are you hitting the block cache or
bucket cache?

On Fri, Mar 16, 2018 at 8:36 AM, Sergey Sova <sergey.sova42@gmail.com>
wrote:

> Hi. I've been investigating low random-read speed for a few days and am
> still stuck. I've already read a few mailing list threads, but they didn't
> help, though my setup is a bit different.
>
> HBase setup:
> 3 nodes, 8GB heap for region servers, 1 master
> Table pre-split into 120 regions (partitioning), ~120M rows.
> Each row has 2 CFs: one contains data varying from 50KB to 4MB, the other is
> smaller, maybe around 10KB. Rows can differ significantly in size. Both CFs
> use GZ compression, with block sizes of 1.3MB and 32KB respectively.
> HDDs in RAID 1, 3TB SATA 6Gb/sec 7200 RPM enterprise
> Network speed between nodes is 1Gbit/sec
>
> Access pattern: get 50 random rows at once.
>
> I run the tests in 10 threads from a single application, testing both column
> families, for a total of ~200 requests, ~20 per thread.
> The bigger CF takes more time to load; I don't see other differences.
>
> Results for big CF:
>
> Loaded data in 102 sec
> Received 341065 KB  (341 MB)
> Bandwidth: 3340 KB/Sec
> Avg time: 7234 ms
>
> If I perform a scan (caching=100) over this table from a local machine, I get
> these results:
> Read 10000 items total
> Finished in 94162 ms
> KBytes read: 199451
> Avg speed: 2117 KB/s
> Not sure if these help, because it's a different load pattern, but maybe
> they clarify something.
> To me, it looks strange that the scan speed is as low as the random-read speed.
>
> My thoughts:
> 1. The access pattern is not good for HDDs - 50 random reads at once, in 10
> threads, but I can't say for sure how bad it is.
> 2. Row size is rather big. I've read several posts about similar problems; all
> had small rows. The general advice was to tune the block size so HBase
> wouldn't read too much data from disk, which I don't think applies to my case.
>
> Other observations: RegionServer GC logs look OK, and RegionServer CPU
> profiling does not show anything strange.
>
> So, can someone give some hints or directions?
> Thanks in advance.
>
> Sergey
>
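
For anyone reading this thread later, the figures quoted above can be sanity-checked with a little arithmetic. The sketch below is not part of either message. It recomputes the two reported throughputs and converts them to rows per second so they can be set next to a rough random-I/O budget for one 7200 RPM disk. Two inputs are assumptions rather than figures from the thread: the total of 10000 rows fetched by gets (taken as ~200 requests of 50 rows each) and the 8.5 ms average seek time (a typical datasheet-style value, not measured on this hardware).

```python
# Figures copied from the quoted message.
get_total_kb = 341065      # "Received 341065 KB"
get_seconds = 102          # "Loaded data in 102 sec"
scan_total_kb = 199451     # "KBytes read: 199451"
scan_seconds = 94.162      # "Finished in 94162 ms"
scan_rows = 10000          # "Read 10000 items total"

# Assumption: ~200 requests of 50 rows each, per the described access pattern.
get_rows = 200 * 50


def rate(amount, seconds):
    """Return amount per second."""
    return amount / seconds


get_kb_per_s = rate(get_total_kb, get_seconds)
scan_kb_per_s = rate(scan_total_kb, scan_seconds)
get_rows_per_s = rate(get_rows, get_seconds)
scan_rows_per_s = rate(scan_rows, scan_seconds)

# Rough random-I/O budget for a single 7200 RPM disk.
# Average rotational latency is half a revolution.
rotational_latency_ms = 60_000 / 7200 / 2
# Assumption: average seek time, not a measured value for these drives.
assumed_seek_ms = 8.5
disk_random_iops = 1000 / (rotational_latency_ms + assumed_seek_ms)

print(f"gets:  {get_kb_per_s:.0f} KB/s, {get_rows_per_s:.0f} rows/s")
print(f"scan:  {scan_kb_per_s:.0f} KB/s, {scan_rows_per_s:.0f} rows/s")
print(f"disk:  ~{disk_random_iops:.0f} random IOPS per spindle (rough estimate)")
```

If the assumptions hold, comparing rows per second with the per-spindle estimate gives a quick sense of whether the test is bound by seeks rather than by the 1Gbit/sec network. It says nothing about Bloom filters or cache hit rates, which still have to be checked on the cluster itself.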
