hbase-user mailing list archives

From Tim Robertson <timrobertson...@gmail.com>
Subject PerformanceEvaluation scan - how to read the results?
Date Wed, 25 Jan 2012 14:21:01 GMT
Hi all,

I am trying to sanity-check our setup, using PerformanceEvaluation
as a baseline.

To do this, I ran the following to load it up:
$HADOOP_HOME/bin/hadoop org.apache.hadoop.hbase.PerformanceEvaluation randomWrite 5
This gave me 32 regions across 2 of our 3 region servers (we have HDFS
across 17 nodes, but only 3 machines running region servers).

And then the following to scan:
$HADOOP_HOME/bin/hadoop org.apache.hadoop.hbase.PerformanceEvaluation scan 5

The output of the scan is:
12/01/25 15:11:02 INFO mapred.JobClient:     ROWS=5242850
12/01/25 15:11:02 INFO mapred.JobClient:     ELAPSED_TIME=1624832
(job took 52 secs in reality)

Can anyone explain how I am meant to interpret these numbers,
please?  ROWS / ELAPSED_TIME comes to about 3.2 rows per <timeunit>.
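
For reference, the arithmetic behind that estimate can be sketched in
Python. The job output does not state the unit of ELAPSED_TIME, so the
millisecond reading below is only an assumption, as is the idea that the
counter may be summed over several map tasks. The variable names are
made up for illustration.

```python
# Counter values copied from the job output above.
rows = 5_242_850
elapsed_time_counter = 1_624_832   # unit not stated in the output
wall_clock_seconds = 52            # observed job duration

# Ratio of the two counters, in rows per unknown time unit.
rows_per_counter_unit = rows / elapsed_time_counter

# Throughput measured against the observed wall-clock time.
rows_per_wall_second = rows / wall_clock_seconds

# ASSUMPTION: if the counter were in milliseconds, this is its value in seconds.
# It is far larger than 52 s, so it would have to be a sum across tasks.
counter_seconds_if_ms = elapsed_time_counter / 1000

print(f"rows per counter unit:     {rows_per_counter_unit:.2f}")
print(f"rows per wall-clock sec:   {rows_per_wall_second:.0f}")
print(f"counter as seconds (if ms): {counter_seconds_if_ms:.1f}")
```

The wall-clock figure is the only one here that does not depend on
guessing the counter's unit.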

[I am trying to benchmark because our real data of 340M rows (215G on
HDFS) takes 60 mins to scan, which seems like a lot]
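
As a rough point of comparison, the same kind of calculation for the
real table can be sketched as follows. It assumes that "215G" means
215 GiB and that the 60 minutes is wall-clock time for the whole scan.
Both are readings of the figures above, not measured values.

```python
# Figures quoted above for the production table.
real_rows = 340_000_000
real_size_gib = 215          # ASSUMPTION: "215G" read as GiB
scan_minutes = 60            # ASSUMPTION: wall-clock time for the full scan

scan_seconds = scan_minutes * 60

# Aggregate scan throughput in rows and in MiB per second.
real_rows_per_second = real_rows / scan_seconds
real_mib_per_second = real_size_gib * 1024 / scan_seconds

# Average on-disk bytes per row, useful when comparing against the test table.
avg_bytes_per_row = real_size_gib * 1024**3 / real_rows

print(f"rows/s:    {real_rows_per_second:.0f}")
print(f"MiB/s:     {real_mib_per_second:.1f}")
print(f"bytes/row: {avg_bytes_per_row:.0f}")
```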

Thanks for any pointers you might provide to help benchmark scanning,
