hbase-user mailing list archives

From Juhani Connolly <juh...@ninja.co.jp>
Subject Re: hbase performance
Date Fri, 02 Apr 2010 08:58:38 GMT
Your results seem very low, but your system specs are also quite

On 04/02/2010 04:46 PM, Chen Bangzhong wrote:
> Hi, All
> I am benchmarking hbase. My HDFS clusters includes 4 servers (Dell 860, with
> 2 GB RAM). One NameNode, one JobTracker, 2 DataNodes.
> My HBase Cluster also comprise 4 servers too. One Master, 2 region and one
> ZooKeeper. (Dell 860, with 2 GB RAM)
While I'm far from being an authority on the matter, running
datanodes and regionservers together on the same machines should help
performance. Try turning your 2 datanodes + 2 regionservers into 4
servers that each run both a datanode and a regionserver.
> I runned the org.apache.hadoop.PerformanceEvaluation on the ZooKeeper
> server. the ROW_LENGTH was changed from 1000 to ROW_LENGTH = 100*1024;
> So each value will be 100k in size.
> hadoop version is 0.20.2, hbase version is 0.20.3. dfs.replication set to 1.
Setting replication to 1 isn't going to give results that are very
indicative of a "real" application, making it questionable as a
benchmark. If you intend to run on a single replica at release, you'll
be at high risk of data loss.
> The following is the command line:
> bin/hbase org.apache.hadoop.hbase.PerformanceEvaluation --nomapred
> --rows=10000 randomWrite 20.
> It tooks about one hour to complete the test(3468628 ms), about 60 writes
> per second. It seems the performance is disappointing.
> Is there anything I can do to make hbase perform better under 100k size? I
> didn't try the method mentioned in the performance wiki yet, because I
> thought 60writes/sec is too low.
Do you mean *over* 100k size?
2GB of RAM is pretty low and you'd likely get significantly better
performance with more, though at this scale it probably isn't the
main problem.
> If the value size is 1k, hbase performs much better. 200000 sequencewrite
> tooks about 16 seconds, about 12500 writes/per second.
Comparing sequencewrite performance with randomwrite isn't a helpful
indicator. Do you have randomWrite results for 1k values? The way your
performance degrades with the size of the records suggests you may
have a bottleneck at network transfer. What's rack locality like and how
much bandwidth do you have between the servers?
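
To put the two quoted runs on the same footing, here is a rough sketch that converts them to payload bytes per second. It assumes --rows is counted per client (so 20 clients x 10000 rows), that the 1k run used the original ROW_LENGTH of 1000 bytes, and it ignores keys and RPC overhead, so treat the output as a ballpark only.

```python
# Figures quoted in this thread. The per-client row count and the byte
# sizes of the values are assumptions, not something measured here.
CLIENTS = 20
ROWS_PER_CLIENT = 10_000           # assumption: --rows applies to each client
ELAPSED_MS_100K = 3_468_628        # randomWrite run with 100k values
VALUE_BYTES_100K = 100 * 1024      # ROW_LENGTH as modified by the poster

ROWS_1K = 200_000                  # sequential write run with 1k values
ELAPSED_S_1K = 16
VALUE_BYTES_1K = 1000              # assumption: the original ROW_LENGTH


def throughput(rows, value_bytes, elapsed_s):
    """Return (writes per second, payload megabytes per second)."""
    writes_per_s = rows / elapsed_s
    mb_per_s = writes_per_s * value_bytes / 1e6
    return writes_per_s, mb_per_s


big = throughput(CLIENTS * ROWS_PER_CLIENT, VALUE_BYTES_100K,
                 ELAPSED_MS_100K / 1000)
small = throughput(ROWS_1K, VALUE_BYTES_1K, ELAPSED_S_1K)

print(f"100k values: {big[0]:.1f} writes/s, {big[1]:.1f} MB/s")
print(f"1k values:   {small[0]:.1f} writes/s, {small[1]:.1f} MB/s")
```

Under those assumptions the 100k run works out to roughly 58 writes/s and about 5.9 MB/s of payload, against about 12.5 MB/s for the 1k run. The two runs use different write patterns, so the byte rates are only a starting point for comparing against the bandwidth between the servers.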
> Now I am trying to benchmark using two clients on 2 servers, no result yet.
You're already running 20 clients on your first server with the
PerformanceEvaluation. Do you mean you intend to run 20 on each?

Hopefully someone with better knowledge can give a better answer, but my
guess is that you have a network transfer bottleneck. Try doing further
tests with randomWrite and decreasing value sizes and see if the time
correlates with the total amount of data written.
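
One possible way to check that correlation once you have a few runs is sketched below. The helper names (pearson, summarize) are my own, and the run list is placeholder input, not measurements; substitute your own (value size, total rows, elapsed seconds) per run.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("inputs must not be constant")
    return cov / sqrt(var_x * var_y)


def summarize(runs):
    """runs: list of (value_bytes, total_rows, elapsed_seconds).

    Returns (correlation of total bytes vs elapsed time,
             list of payload bytes per second for each run).
    """
    total_bytes = [value_bytes * rows for value_bytes, rows, _ in runs]
    elapsed = [seconds for _, _, seconds in runs]
    rates = [b / s for b, s in zip(total_bytes, elapsed)]
    return pearson(total_bytes, elapsed), rates


# PLACEHOLDER numbers for illustration only, not real benchmark results.
example_runs = [
    (1_000, 200_000, 40.0),
    (10_000, 200_000, 400.0),
    (100_000, 200_000, 4000.0),
]
corr, rates = summarize(example_runs)
print(f"correlation: {corr:.3f}")
print("bytes/s per run:", [round(r) for r in rates])
```

If the correlation is close to 1 and the bytes/s figure stays roughly flat as the value size changes, that would be consistent with a transfer limit. If bytes/s climbs as the values get larger, per-write overhead is more likely to dominate.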
