hbase-user mailing list archives

From "Buttler, David" <buttl...@llnl.gov>
Subject RE: Inserting Random Data into HBASE
Date Thu, 02 Dec 2010 00:20:32 GMT
Any reason you are not doing this in a m/r job?  You could split your key space into sections
and have each mapper populate only the ids in its own section. That eliminates the possibility
of overwriting rows and ending up with fewer than the number you want.
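
A minimal sketch of the key-space split, assuming the ids are simply the numbers 0 to N-1 and each mapper is handed one contiguous section. `KeySpaceSplitter` and `Section` are hypothetical names used for illustration; they are not part of HBase or Hadoop.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (not an HBase or Hadoop class): splits the id space
// [0, totalRecords) into contiguous, non-overlapping sections, one per mapper.
final class KeySpaceSplitter {

    /** A half-open id range [start, end). */
    static final class Section {
        final long start;
        final long end;

        Section(long start, long end) {
            this.start = start;
            this.end = end;
        }

        long size() {
            return end - start;
        }
    }

    static List<Section> split(long totalRecords, int mappers) {
        if (totalRecords < 0 || mappers <= 0) {
            throw new IllegalArgumentException("need totalRecords >= 0 and mappers > 0");
        }
        List<Section> sections = new ArrayList<>(mappers);
        long base = totalRecords / mappers;
        long remainder = totalRecords % mappers;
        long start = 0;
        for (int i = 0; i < mappers; i++) {
            // The first `remainder` sections take one extra id so the sizes
            // add up to exactly totalRecords.
            long size = base + (i < remainder ? 1 : 0);
            sections.add(new Section(start, start + size));
            start += size;
        }
        return sections;
    }
}
```

Since the sections never overlap and together cover every id exactly once, a mapper that writes each id in its section once cannot overwrite another mapper's rows, and the total comes out to exactly the requested count.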

Import speed depends on how parallel you can make both the client(s) and HBase, which in turn
depends on the number of machines and the DFS replication factor.
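
If a full m/r job is more than you need, the client side can also be made parallel with plain threads. This is only a sketch under assumptions: `ParallelLoader` is a hypothetical name, and the `writer` callback stands in for whatever code actually builds and sends a Put for a given id. With a real HBase client each worker would need its own table handle, since HTable is not thread-safe.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.LongConsumer;

// Hypothetical sketch: runs `workers` threads, each writing its own
// disjoint range of ids through the supplied callback.
final class ParallelLoader {

    /** Returns the number of ids handed to the writer. */
    static long load(long totalRecords, int workers, LongConsumer writer) {
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        try {
            List<Future<Long>> results = new ArrayList<>();
            for (int w = 0; w < workers; w++) {
                // Worker w owns the half-open range [start, end).
                final long start = totalRecords * w / workers;
                final long end = totalRecords * (w + 1) / workers;
                results.add(pool.submit(() -> {
                    for (long id = start; id < end; id++) {
                        writer.accept(id); // stand-in for the real write
                    }
                    return end - start;
                }));
            }
            long written = 0;
            for (Future<Long> f : results) {
                written += f.get(); // wait for each worker to finish
            }
            return written;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException("load failed", e);
        } finally {
            pool.shutdown();
        }
    }
}
```

The callback is called from several threads at once, so anything it shares between workers has to be thread-safe.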

-----Original Message-----
From: rajgopalv [mailto:raja.fire@gmail.com] 
Sent: Wednesday, December 01, 2010 6:48 AM
To: hbase-user@hadoop.apache.org
Subject: Inserting Random Data into HBASE

I have to test how long HBase takes to store 100 million records.

So I wrote a simple Java program which:

1 : generates a random key with 10 columns per key, and a random value for each of the 10 columns
2 : makes a Put object out of these and stores it in an ArrayList
3 : when the ArrayList's size reaches 5000, calls table.put(listOfPuts)
4 : repeats until 100 million records have been put.
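
The batching in steps 2 to 4 can be sketched without any HBase dependency as follows. `BatchingWriter` is a hypothetical name, and the `sink` callback is an assumption standing in for the `table.put(listOfPuts)` call; the real program would pass a list of Put objects to the table there.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical helper: buffers records and hands them to a sink in
// fixed-size batches. The sink stands in for table.put(listOfPuts).
final class BatchingWriter<T> {
    private final int batchSize;
    private final Consumer<List<T>> sink;
    private List<T> buffer;

    BatchingWriter(int batchSize, Consumer<List<T>> sink) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        this.batchSize = batchSize;
        this.sink = sink;
        this.buffer = new ArrayList<>(batchSize);
    }

    /** Buffers one record and flushes once the batch is full. */
    void add(T record) {
        buffer.add(record);
        if (buffer.size() >= batchSize) {
            flush();
        }
    }

    /** Sends whatever is buffered; call once at the end for the last partial batch. */
    void flush() {
        if (buffer.isEmpty()) {
            return;
        }
        sink.accept(buffer);
        // Start a fresh list so the sink may keep the one it was given.
        buffer = new ArrayList<>(batchSize);
    }
}
```

The final `flush()` matters whenever the record count is not a multiple of the batch size; otherwise the last few records never reach the table.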

And I run this as a single-threaded Java program.

Am I doing it right? Is there any other way of importing large data for
testing? [For now I'm not considering bulk data import, loadtable.rb, etc.
Apart from these, is there any other way?]

View this message in context: http://old.nabble.com/Inserting-Random-Data-into-HBASE-tp30349594p30349594.html
Sent from the HBase User mailing list archive at Nabble.com.
