hbase-user mailing list archives

From h <hb...@patientcentral.com>
Subject We're seeing problems loading data into HBase using MR
Date Tue, 22 Mar 2011 17:59:38 GMT
Hey everyone,

I've got a situation where my data loads to HBase are failing.

The data is sent to an isolated HBase cluster from a different hadoop cluster.  What I see is
that the performance is pretty bad (around 40k burst, 1k average inserts/sec, with about 200
byte payloads).  If I write a standalone java client to hit the cluster I can get
a sustained 40k ops/sec insert, or 80k ops/sec if I run a second client in a different window.

The network is all gigE, with a 4GB heap on each region server. Nothing outside of the HBase
system is running on the cluster.

From the MR side we see that the job eventually gets to 50% and then fails with no status
updates in 600 seconds.  If we write a simple java MR job that shoves about 10GB of data
through 20 reducers, it also chokes and dies.
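
The 600-second failure is consistent with Hadoop's task timeout (mapred.task.timeout,
600000 ms by default), which kills a task that reports no progress. Below is a minimal
sketch of the usual reducer-side pattern: buffer records, write them in batches, and
report progress after each batch. It is not the job described above. BatchingWriter and
its methods are hypothetical names; the sink stands in for a batched HBase put and the
progress callback stands in for the MR context's progress() call.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/** Hypothetical helper: buffers records, flushes them in batches, pings progress. */
public class BatchingWriter<T> {
    private final int batchSize;
    private final Consumer<List<T>> sink; // assumption: stand-in for a batched HBase put
    private final Runnable progress;      // assumption: stand-in for context.progress()
    private final List<T> buffer = new ArrayList<>();
    private int flushes = 0;

    public BatchingWriter(int batchSize, Consumer<List<T>> sink, Runnable progress) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        this.batchSize = batchSize;
        this.sink = sink;
        this.progress = progress;
    }

    /** Adds one record and flushes once the buffer reaches batchSize. */
    public void write(T record) {
        buffer.add(record);
        if (buffer.size() >= batchSize) {
            flush();
        }
    }

    /** Sends any buffered records to the sink, then reports progress. */
    public void flush() {
        if (buffer.isEmpty()) {
            return;
        }
        sink.accept(new ArrayList<>(buffer)); // hand the sink its own copy
        buffer.clear();
        flushes++;
        progress.run();
    }

    /** Number of batches handed to the sink so far. */
    public int flushCount() {
        return flushes;
    }
}
```

Progress is only reported after a batch completes, so this would not help if a single
write to the cluster blocks for longer than the timeout. In that case the stall itself
is the thing to investigate.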

Is there anything that we should be looking at?  As a point of reference, at 0.26 we could
push 250k ops/sec, with the same jobs averaging in the 150's.  We also applied the META
MEMSTORE_FLUSHSIZE fix (http://hbase.apache.org/book/upgrading.html).

Any help is greatly appreciated!

