hbase-user mailing list archives

From Chen Bangzhong <bangzh...@gmail.com>
Subject Re: hbase performance
Date Fri, 02 Apr 2010 09:04:09 GMT
On Fri, Apr 2, 2010 at 4:58 PM, Juhani Connolly <juhani@ninja.co.jp> wrote:

> Your results seem very low, but your system specs are also quite
> moderate.
>
> On 04/02/2010 04:46 PM, Chen Bangzhong wrote:
> > Hi, All
> >
> > I am benchmarking HBase. My HDFS cluster includes 4 servers (Dell 860,
> > with 2 GB RAM each): one NameNode, one JobTracker, and 2 DataNodes.
> >
> > My HBase cluster also comprises 4 servers: one Master, 2 RegionServers,
> > and one ZooKeeper node (Dell 860, with 2 GB RAM each).
> >
> While I'm far from being an authority on the matter, running
> DataNodes and RegionServers together should help performance.
> Try turning your 2 DataNodes + 2 RegionServers into 4 servers each
> running both a DataNode and a RegionServer.
>

I will try to run datanode and region server on the same server.
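For reference, co-locating the daemons is mostly a matter of listing the same
machines in both slave files. A minimal sketch, assuming the four data-serving
boxes are named node1-node4 (the hostnames are placeholders, not from this
thread):

    ${HADOOP_HOME}/conf/slaves (DataNode hosts):
        node1
        node2
        node3
        node4

    ${HBASE_HOME}/conf/regionservers (RegionServer hosts, the same machines):
        node1
        node2
        node3
        node4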


> > I ran org.apache.hadoop.hbase.PerformanceEvaluation on the ZooKeeper
> > server. ROW_LENGTH was changed from 1000 to ROW_LENGTH = 100*1024, so
> > each value will be 100 KB in size.
> >
> > Hadoop version is 0.20.2, HBase version is 0.20.3. dfs.replication is
> > set to 1.
> >
> Setting replication to 1 isn't going to give results that are very
> indicative of a "real" application, making it questionable as a
> benchmark. If you intend to run on a single replica at release, you'll
> be at high risk of data loss.
>

Since I have only 2 data nodes, I set replication to 1. In production, it
will be set to 3.
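For reference, that production change is just the usual dfs.replication entry
in hdfs-site.xml (the value below simply reflects the plan stated above):

    <property>
      <name>dfs.replication</name>
      <value>3</value>
    </property>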


> > The following is the command line:
> >
> > bin/hbase org.apache.hadoop.hbase.PerformanceEvaluation --nomapred
> > --rows=10000 randomWrite 20
> >
> > It took about one hour to complete the test (3468628 ms), about 60
> > writes per second. The performance seems disappointing.
> >
> > Is there anything I can do to make HBase perform better with 100 KB
> > values? I didn't try the methods mentioned in the performance wiki yet,
> > because I thought 60 writes/sec is too low.
> >
> >
> Do you mean *over* 100k size?
> 2 GB of RAM is pretty low and you'd likely get significantly better
> performance with more, though at this scale it probably isn't a
> significant problem.
>

The value size is exactly 100 KB.
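For what it's worth, here is a minimal sketch of what a single 100 KB write
looks like through the 0.20 client API, with the client-side write buffer
enabled so puts get batched. The table, family, and qualifier names are
placeholders, not the ones PerformanceEvaluation uses:

    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.client.HTable;
    import org.apache.hadoop.hbase.client.Put;
    import org.apache.hadoop.hbase.util.Bytes;

    public class BigValuePut {
      public static void main(String[] args) throws Exception {
        // 0.20-style client setup; picks up hbase-site.xml from the classpath.
        HBaseConfiguration conf = new HBaseConfiguration();
        HTable table = new HTable(conf, "testtable");  // placeholder table name
        table.setAutoFlush(false);                     // buffer puts on the client
        table.setWriteBufferSize(12 * 1024 * 1024);    // room for ~100 puts of 100 KB

        byte[] value = new byte[100 * 1024];           // 100 KB payload, as in the test
        Put put = new Put(Bytes.toBytes("row-0000001"));
        put.add(Bytes.toBytes("f"), Bytes.toBytes("q"), value);  // placeholder family/qualifier
        table.put(put);                                // queued in the write buffer

        table.flushCommits();                          // push buffered writes to the cluster
      }
    }

(Whether batching helps much at 100 KB per value is an open question; with
values this large each put is already a sizeable RPC.)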


> > If the value size is 1 KB, HBase performs much better: 200000
> > sequentialWrite rows took about 16 seconds, about 12500 writes per
> > second.
> >
> >
> Comparing sequentialWrite performance with randomWrite isn't a helpful
> indicator. Do you have randomWrite results for 1k values? The way your
> performance degrades with the size of the records suggests you may
> have a bottleneck in network transfer. What's rack locality like, and
> how much bandwidth do you have between the servers?
> > Now I am trying to benchmark using two clients on 2 servers; no
> > results yet.
> >
> >
>

For 1 KB values, the sequentialWrite and randomWrite performance is about the
same. All my servers are under one switch; I don't know the switch bandwidth
yet.
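A rough back-of-the-envelope check on the network theory, using only the
numbers already in this thread:

    total payload : 20 clients x 10000 rows x 100 KB  ≈ 20 GB
    elapsed time  : 3468628 ms                        ≈ 3469 s
    payload rate  : 20 GB / 3469 s                    ≈ 5.9 MB/s ≈ 47 Mbit/s

That is far below gigabit speed, but close to half of a 100 Mbit/s link before
counting WAL, RPC, and TCP overhead, so the switch and NIC speeds are worth
confirming.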


> You're already running 20 clients on your first server with the
> PerformanceEvaluation. Do you mean you intend to run 20 on each?
>

In fact, it is 20 threads on one machine.
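For the two-machine test: as far as I understand the PerformanceEvaluation
options, --rows is the row count per client, so splitting the 20 clients
across the two boxes would mean running this on each machine (10 clients per
box keeps the total at 200000 rows):

    bin/hbase org.apache.hadoop.hbase.PerformanceEvaluation --nomapred --rows=10000 randomWrite 10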

>
> Hopefully someone with better knowledge can give a better answer, but my
> guess is that you have a network transfer bottleneck. Try doing further
> tests with randomWrite and decreasing value sizes, and see if the time
> correlates to the total amount of data written.
>
>
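Since ROW_LENGTH is hard-coded in PerformanceEvaluation (as in the edit
described above), each value size in such a sweep is a one-line change plus a
rebuild; a sketch of one step, with 10 KB picked purely as an example size:

    // PerformanceEvaluation.java: the same constant that was set to 100*1024 above
    ROW_LENGTH = 10 * 1024;

    # rebuild, then re-run as before and compare bytes/sec rather than rows/sec:
    bin/hbase org.apache.hadoop.hbase.PerformanceEvaluation --nomapred --rows=10000 randomWrite 20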
