hbase-user mailing list archives

From Esteban Gutierrez <este...@cloudera.com>
Subject Re: how to optimize for heavy writes scenario
Date Fri, 17 Mar 2017 17:11:10 GMT

It's a little hard to tell. Assuming that you have tuned the number of
regions and already looked into common perf issues such as networking or
problems with HDFS, you should probably consider trying HBase 1.2 (see
HBASE-15146) and distros with other fixes like HBASE-17072 and HBASE-16146
to start with. Under some workloads that should give you another 10-20% of
perf gain or even more. Depending on the number of client connections per RS
you might also want to tune GC.

Hope that helps,

Cloudera, Inc.

On Fri, Mar 17, 2017 at 9:31 AM, Hef <hef.online@gmail.com> wrote:

> Hi group,
> I'm using HBase to store a large amount of time series data; the use case
> is heavier on writes than reads. My application tops out at 600k write
> requests per second and I can't tune it for better TPS.
> Hardware:
> I have 6 Region Servers, each with 128G memory, 12 HDDs, and 2 cores with
> 24 threads.
> Schema:
> The schema for this time series data is similar to OpenTSDB's: the data
> points of the same metric within an hour are stored in one row, and there
> can be a maximum of 3600 columns per row.
> Each cell is about 70 bytes in size, including the rowkey, column
> qualifier, column family and value.
> HBase config:
> CDH 5.6 HBase 1.0.0
> 100G memory for each RegionServer
> hbase.hstore.compactionThreshold = 50
> hbase.hstore.blockingStoreFiles = 100
> hbase.hregion.majorcompaction disable
> hbase.client.write.buffer = 20MB
> hbase.regionserver.handler.count = 100
> hbase.hregion.memstore.flush.size = 128MB
> HBase Client:
> writes go through BufferedMutator, 100000 per batch
> Input volume:
> The input data throughput is more than 2 million/sec from Kafka.
> My writer applications are distributed; however I scale them up, the
> total write throughput won't get larger than 600K/sec.
> The servers have 20% CPU usage and 5.6 wa.
> GC doesn't look good though; it shows a lot of 10s+ pauses.
> In my opinion, 1M/s of input data will result in only 70 MByte/s of write
> throughput to the cluster, which is quite a small amount for 6 region
> servers. The performance should not be this bad.
> Does anybody have an idea why the performance stops at 600K/s?
> Is there anything I have to tune to increase the HBase write throughput?
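
The back-of-the-envelope estimate in the quoted message (1M cells/s at about
70 bytes per cell is about 70 MByte/s) can be checked with a small sketch.
The function and constant names below are hypothetical and chosen only for
illustration; the figures are the ones stated in the message. The result is
raw cell payload only and does not account for WAL writes, HDFS replication,
or compaction, so actual disk and network traffic would be higher.

```python
# Rough write-volume estimate using the figures quoted in the message.
# All names here are illustrative; nothing below is an HBase API.

CELL_BYTES = 70             # approx. cell size: rowkey + qualifier + family + value
REGION_SERVERS = 6          # region servers in the cluster
MAX_COLUMNS_PER_ROW = 3600  # one data point per second, one row per hour


def write_mb_per_sec(cells_per_sec, cell_bytes=CELL_BYTES):
    """Raw payload written per second, in MByte/s (1 MByte = 10**6 bytes)."""
    return cells_per_sec * cell_bytes / 1_000_000


def per_server_mb_per_sec(cells_per_sec, servers=REGION_SERVERS):
    """Payload per region server, assuming writes are spread evenly."""
    return write_mb_per_sec(cells_per_sec) / servers


def max_row_bytes(columns=MAX_COLUMNS_PER_ROW, cell_bytes=CELL_BYTES):
    """Upper bound on the payload of one fully populated hourly row."""
    return columns * cell_bytes


if __name__ == "__main__":
    rates = [
        ("observed ceiling", 600_000),
        ("poster's example", 1_000_000),
        ("Kafka input", 2_000_000),
    ]
    for label, rate in rates:
        total = write_mb_per_sec(rate)
        each = per_server_mb_per_sec(rate)
        print(f"{label}: {total:.1f} MByte/s total, {each:.2f} MByte/s per server")
    print(f"full hourly row: {max_row_bytes()} bytes")
```

Under these assumptions the observed 600K/s ceiling corresponds to roughly
42 MByte/s of payload across the cluster, which is consistent with the
poster's point that raw data volume alone is unlikely to be the limit.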
