Don't use the Thrift gateway for bulk import. Use the Java API, and be sure to turn
off auto-flushing and use a reasonably sizable commit buffer; 1-12 MB is probably
ideal. I can push a 20-node cluster past 180k inserts/sec using this.

On Sat, Jun 6, 2009 at 5:51 PM, llpind wrote:
>
> Thanks Ryan, well done.
>
> I have no experience using the Thrift gateway, could you please provide some
> actual code here or in your blog post? I'd love to see how your method
> compares with mine.
>
> Last night I was able to do ~58 million records in ~1.6 hours using the
> HBase Java API directly. But with this new data, I'm seeing much slower
> times. After reading around, it appears it's because my row key now changes
> often, whereas before it was constant for some time (more columns). Thanks
> again. :)
>
>
> Ryan Rawson wrote:
> >
> > Have a look at:
> >
> > http://ryantwopointoh.blogspot.com/2009/01/performance-of-hbase-importing.html
> >
> > -ryan
> >
> >
> > On Sat, Jun 6, 2009 at 4:55 PM, llpind wrote:
> >
> >> I'm doing an insert operation using the Java API.
> >>
> >> When inserting data where the row key changes often, it seems the inserts
> >> go really slow.
> >>
> >> Is there another method for doing inserts of this type? (instead of
> >> BatchUpdate).
> >>
> >> Thanks
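
The thread itself does not include the code Ryan was using, so the following is only
a minimal sketch of the approach he describes (Java client, auto-flush off, multi-MB
write buffer), written against the classic HTable client API rather than the
BatchUpdate API mentioned above. The table name "mytable", column family "cf", and
the generated row keys and values are illustrative placeholders; newer HBase releases
express the same buffered-write pattern with BufferedMutator instead.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.util.Bytes;

public class BufferedInsertSketch {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    // "mytable" is a placeholder table name, not from the thread.
    HTable table = new HTable(conf, "mytable");

    // Disable auto-flush so each put() is buffered on the client
    // instead of triggering its own RPC to the region server.
    table.setAutoFlush(false);
    // Use a sizable commit buffer; the thread suggests 1-12 MB.
    table.setWriteBufferSize(12 * 1024 * 1024);

    for (long i = 0; i < 1000000; i++) {
      Put put = new Put(Bytes.toBytes("row-" + i));
      put.add(Bytes.toBytes("cf"), Bytes.toBytes("qual"), Bytes.toBytes("value-" + i));
      table.put(put); // buffered locally until the write buffer fills
    }

    table.flushCommits(); // push any puts still sitting in the buffer
    table.close();
  }
}

With auto-flush left on (the default), every put is sent individually, which is why
inserts feel slow regardless of how the row key behaves; buffering amortizes the RPC
cost across many puts.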