hbase-user mailing list archives

From "Billy Pearson" <sa...@pearsonwholesale.com>
Subject Thrift Performance little odd
Date Mon, 02 Jun 2008 01:09:43 GMT
Maybe someone here can explain this to me.

I am running a bulk import of large column values, averaging about 15KB per 
record (web page source).
I have one region server with only 1 region; no splits yet.
A second server runs the Thrift server and, on the same machine, a 
single-threaded import process.

At the start I see about 60-80 records inserted per 3 secs, as reported by 
the master's GUI.
Once I hit my 64MB memcache limit on the region server, it blocks and 
flushes the column.
Immediately after that I see an insert rate of about 600-700 per 3 secs, 
according to the master's GUI, and this lasts until I am done inserting, 
slowing down only for further flushes every 20-25 secs before speeding 
along again.
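
As a rough sanity check on the figures above, here is a back-of-the-envelope sketch in Python. It assumes exactly 15KB per record and 1MB = 1024KB, and it ignores row keys and any per-cell overhead in the memcache, so the results are only approximate. The function names are made up for illustration.

```python
RECORD_KB = 15      # average record size quoted in the post
MEMCACHE_MB = 64    # memcache flush limit quoted in the post
WINDOW_SECS = 3     # reporting window of the master GUI


def throughput_kb_per_sec(records_per_window, record_kb=RECORD_KB,
                          window_secs=WINDOW_SECS):
    """Convert 'records per reporting window' into KB written per second."""
    return records_per_window * record_kb / window_secs


def secs_to_fill_memcache(records_per_window, memcache_mb=MEMCACHE_MB,
                          record_kb=RECORD_KB, window_secs=WINDOW_SECS):
    """Estimate how long a steady insert rate takes to fill the memcache."""
    rate = throughput_kb_per_sec(records_per_window, record_kb, window_secs)
    return memcache_mb * 1024 / rate


if __name__ == "__main__":
    for label, rate in [("slow, low", 60), ("slow, high", 80),
                        ("fast, low", 600), ("fast, high", 700)]:
        print(f"{label}: {throughput_kb_per_sec(rate):.0f} KB/s, "
              f"{secs_to_fill_memcache(rate):.1f} s to fill "
              f"{MEMCACHE_MB}MB")
```

Under these assumptions, 600-700 records per 3 secs fills 64MB in roughly 19-22 secs, which is in line with the 20-25 secs between flushes. The initial 60-80 records per 3 secs would take about 3 minutes to reach the first flush.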

Any idea why it starts slow and jumps to such a higher insert rate after 
the memcache flush?
Again, this is all single-threaded, so no MR job or anything like that. I 
have run this and seen it happen each time, with the flushes happening at 
different times in the import and the same result each time, so that rules 
out smaller data in the second half.
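
One way to cross-check the master GUI numbers is to count inserts per 3-sec window on the client side as well. The sketch below is a minimal, generic version of that idea. `insert_record` is a hypothetical placeholder for whatever call the import process makes for each record; it is not an actual HBase or Thrift API name.

```python
import time


def timed_import(records, insert_record, window_secs=3.0,
                 clock=time.monotonic):
    """Insert every record and return (elapsed_secs, count) per window.

    insert_record is a hypothetical callable supplied by the caller; it
    stands in for the real per-record insert made by the import process.
    """
    windows = []
    start = window_start = clock()
    count = 0
    for record in records:
        insert_record(record)
        count += 1
        now = clock()
        # Close the current window once window_secs have elapsed.
        if now - window_start >= window_secs:
            windows.append((now - start, count))
            window_start = now
            count = 0
    # Report any records left over in a final, partial window.
    if count:
        windows.append((clock() - start, count))
    return windows
```

If the per-window counts seen by the client show the same jump after the first flush, the change in rate is real and not an artifact of how the GUI reports requests.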

So I am wondering if this is something related to the region server or the 
Thrift server.

hadoop 0.17.0, r652576
hbase 0.2.0-dev, r654653

Billy Pearson
