hbase-user mailing list archives

From Jonathan Gray <jg...@facebook.com>
Subject RE: how can I check the I/O influence HBase to HDFS
Date Tue, 06 Apr 2010 22:15:06 GMT
Can you explain more about what information you are trying to find out?

You had an existing HDFS and you want to measure the additional impact of adding HBase?  Is
that in terms of reads/writes/IOPS or data size?

If you have a steady-state set of metrics for HDFS w/o HBase, can you not just monitor those
metrics w/ HBase running and calculate the deltas?
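
A minimal sketch of that delta calculation, assuming you have already collected the same metrics in both states. The metric names below are hypothetical placeholders, not actual HDFS metric names:

```python
def metric_deltas(baseline, with_hbase):
    """Return with_hbase - baseline for every metric present in both snapshots.

    Both arguments are plain dicts of metric name -> numeric value.
    The metric names are whatever you collect; none are assumed here.
    """
    shared = baseline.keys() & with_hbase.keys()
    return {name: with_hbase[name] - baseline[name] for name in shared}


if __name__ == "__main__":
    # Hypothetical steady-state values, for illustration only.
    hdfs_only = {"bytes_read": 100, "bytes_written": 50}
    hdfs_plus_hbase = {"bytes_read": 180, "bytes_written": 90}
    print(metric_deltas(hdfs_only, hdfs_plus_hbase))
```

The deltas are only meaningful if the non-HBase workload is comparable across the two measurement windows.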

Also, to what end are you trying to figure this out?  I'm very much interested in what courses
of action you might take given the different information you could find out about HBase's
influence on your cluster.


> -----Original Message-----
> From: steven zhuang [mailto:steven.zhuang.1984@gmail.com]
> Sent: Tuesday, April 06, 2010 8:34 AM
> To: hbase-user@hadoop.apache.org
> Subject: how can I check the I/O influence HBase to HDFS
> Hi there,
>               I have a problem checking the influence HBase has on HDFS.
>               I have a Hadoop cluster with 30+ datanodes, and an HBase
> cluster built on it, with 18 regionservers residing on 18 of the datanodes.
>               We have observed that HDFS I/O increases a lot when we run
> import or query operations on HBase tables, but we don't know how much HBase
> impacts HDFS, so now I have to dig into this.
>               My idea is as follows:
>                  1. grep the regionserver logs for the file information of
> HBase tables, which should mainly be store files' names and their sizes, and
> sum the sizes up.
>                  2. grep the datanodes' logs for the HDFS_READ/HDFS_WRITE
> entries, and calculate the total I/O bytes.
>                  3. get the ratio of HBase I/O to HDFS I/O.
>                My concern is whether the above idea is right. Is there
> anything missing, or a better way to do this?
>                And to make it more convincing, I want to have the block
> info for each HTable, not just for the files under each table's directory,
> but also for those store files which were later removed by major compaction,
> since in the datanode log all I can see is the block ID. Any pointer or hint
> is really appreciated.
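
For reference, steps 2 and 3 of the quoted plan could be sketched roughly as below. This is only an illustration under stated assumptions: it assumes the datanode log lines contain a fragment of the form `bytes: <n>, op: HDFS_READ` (or `HDFS_WRITE`), which should be checked against the actual logs for your Hadoop version. The function names are made up for this example, and the HBase byte count from step 1 is taken as an input rather than computed.

```python
import re
from collections import defaultdict

# Assumed log fragment: "bytes: 1234, op: HDFS_READ". Adjust the pattern
# if your datanode logs use a different layout.
TRACE = re.compile(r"bytes: (\d+), op: (HDFS_READ|HDFS_WRITE)\b")


def sum_io_bytes(lines):
    """Sum the logged byte counts per operation type (step 2)."""
    totals = defaultdict(int)
    for line in lines:
        match = TRACE.search(line)
        if match:
            nbytes, op = match.groups()
            totals[op] += int(nbytes)
    return dict(totals)


def io_ratio(hbase_bytes, hdfs_bytes):
    """Ratio of HBase I/O to total HDFS I/O (step 3); None if no HDFS I/O."""
    if hdfs_bytes == 0:
        return None
    return hbase_bytes / hdfs_bytes


if __name__ == "__main__":
    # Usage: python sum_io.py datanode.log  (file name is a placeholder)
    import sys
    with open(sys.argv[1]) as log:
        print(sum_io_bytes(log))
```

Note that this counts all HDFS traffic the datanode logged, so it gives the denominator for step 3 only. It does not by itself tell which blocks belong to HBase store files.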
