From: Ryan Rawson
To: hbase-user@hadoop.apache.org
Subject: Re: Frequent changing rowkey - HBase insert
Date: Sat, 6 Jun 2009 17:59:04 -0700
Message-ID: <78568af10906061759l505a4526x7aebc7cf99651f5a@mail.gmail.com>
In-Reply-To: <23906943.post@talk.nabble.com>
References: <23906724.post@talk.nabble.com> <78568af10906061727r6050b505i1b314d50d7238de9@mail.gmail.com> <23906943.post@talk.nabble.com>

Don't use the thrift gateway for bulk import. Use the Java API, and be sure
to turn off auto flushing and use a reasonably sizable commit buffer. 1-12MB
is probably ideal.

I can push a 20 node cluster past 180k inserts/sec using this.

On Sat, Jun 6, 2009 at 5:51 PM, llpind wrote:

> Thanks Ryan, well done.
>
> I have no experience using Thrift gateway, could you please provide some
> actual code here or in your blog post? I'd love to see how your method
> compares with mine.
>
> Last night I was able to do ~58 million records in ~1.6 hours using the
> HBase Java API directly. But with this new data, I'm seeing much slower
> times. After reading around, it appears it's because my row key now changes
> often, whereas before it was constant for some time (more columns). Thanks
> again. :)
>
> Ryan Rawson wrote:
> >
> > Have a look at:
> >
> > http://ryantwopointoh.blogspot.com/2009/01/performance-of-hbase-importing.html
> >
> > -ryan
> >
> > On Sat, Jun 6, 2009 at 4:55 PM, llpind wrote:
> >
> >> I'm doing an insert operation using the Java API.
> >>
> >> When inserting data where the rowkey changes often, it seems the inserts
> >> go really slow.
> >>
> >> Is there another method for doing inserts of this type? (instead of
> >> BatchUpdate).
> >>
> >> Thanks
> >> --
> >> View this message in context:
> >> http://www.nabble.com/Frequent-changing-rowkey---HBase-insert-tp23906724p23906724.html
> >> Sent from the HBase User mailing list archive at Nabble.com.
> >
> --
> View this message in context:
> http://www.nabble.com/Frequent-changing-rowkey---HBase-insert-tp23906724p23906943.html
> Sent from the HBase User mailing list archive at Nabble.com.
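
A rough illustration of the advice at the top of this message (turn off auto
flushing and let inserts pile up in a commit buffer of a few MB). This is not
HBase code: WriteBufferSketch, Row and BatchSender are hypothetical names made
up for the sketch, and the "send" is just a callback standing in for one round
trip to a server. In the HBase client of that era the corresponding knobs
were, as far as I recall, HTable.setAutoFlush(false),
HTable.setWriteBufferSize(bytes) and HTable.flushCommits(); treat those names
as an assumption and check them against the version you run.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical client-side write buffer; illustrates the idea only, not HBase's implementation. */
final class WriteBufferSketch {

    /** One pending insert: row key plus value (made-up shape, not an HBase type). */
    static final class Row {
        final byte[] key;
        final byte[] value;

        Row(byte[] key, byte[] value) {
            this.key = key;
            this.value = value;
        }

        long size() {
            return key.length + value.length;
        }
    }

    /** Stand-in for the network call that ships one batch to a server. */
    interface BatchSender {
        void send(List<Row> batch);
    }

    private final long bufferLimitBytes;
    private final BatchSender sender;
    private final List<Row> pending = new ArrayList<>();
    private long pendingBytes = 0;

    WriteBufferSketch(long bufferLimitBytes, BatchSender sender) {
        this.bufferLimitBytes = bufferLimitBytes;
        this.sender = sender;
    }

    /** Queue a row locally; only talk to the "server" once the buffer is full. */
    void put(byte[] key, byte[] value) {
        Row row = new Row(key, value);
        pending.add(row);
        pendingBytes += row.size();
        if (pendingBytes >= bufferLimitBytes) {
            flush();
        }
    }

    /** Ship whatever is buffered. Must be called once at the end of the import. */
    void flush() {
        if (pending.isEmpty()) {
            return;
        }
        sender.send(new ArrayList<>(pending));
        pending.clear();
        pendingBytes = 0;
    }

    public static void main(String[] args) {
        int[] roundTrips = {0};
        // 2 MB buffer, inside the 1-12MB range suggested above.
        WriteBufferSketch buffer =
                new WriteBufferSketch(2L * 1024 * 1024, batch -> roundTrips[0]++);
        for (int i = 0; i < 100_000; i++) {
            // A different row key on every insert, as in the original question.
            byte[] key = ("row-" + i).getBytes(StandardCharsets.UTF_8);
            buffer.put(key, new byte[100]);
        }
        buffer.flush(); // do not forget the tail of the buffer
        System.out.println("rows: 100000, round trips: " + roundTrips[0]);
    }
}
```

The point of the sketch is that the number of round trips depends on the
buffer size rather than on how often the row key changes. With auto flushing
left on, every insert would be its own round trip.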