hbase-user mailing list archives

From Han Liu <ha...@andrew.cmu.edu>
Subject Re: Server-side write buffer configuration
Date Mon, 09 Aug 2010 20:07:32 GMT
I see. 0.89 is still a developer release and I hear that it is not stable. But it sounds
really tempting because it boosts performance by a lot. Can I trust it if my final goal is
to insert about 100TB of data? What could be the possible issues? Also, when should I expect
to see a stable release?
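
For scale, here is a rough back-of-the-envelope estimate of what 100TB means at the single-client rate quoted further down this thread (15 GB in about 26 minutes). It is only a sketch: it assumes the rate stays constant, treats GB/TB as binary units, and the class and method names are made up for illustration. Under those assumptions it works out to a bit under 10 MB/s and roughly four months of loading, which is why splitting the work across several clients matters.

```java
// Back-of-the-envelope load-time estimate. The input numbers come from this
// thread; the class and method names are hypothetical, for illustration only.
final class LoadEstimate {

    // Throughput in MiB/s, given gibibytes loaded and minutes elapsed.
    static double mibPerSecond(double gib, double minutes) {
        return gib * 1024.0 / (minutes * 60.0);
    }

    // Days needed to load `tib` tebibytes at a constant rate in MiB/s.
    static double daysToLoad(double tib, double mibPerSec) {
        double totalMib = tib * 1024.0 * 1024.0;
        return totalMib / mibPerSec / 86400.0;
    }

    public static void main(String[] args) {
        double rate = mibPerSecond(15, 26);   // 15 GB in 26 minutes
        System.out.printf("rate: %.2f MiB/s%n", rate);
        System.out.printf("100 TB at that rate: %.0f days%n", daysToLoad(100, rate));
    }
}
```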

So in order to do multiple-client insertion I basically just need to create multiple HTable
objects to handle the insertions? Do I need to do multi-threading manually? 
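
To make the question concrete, here is a minimal sketch of the "one writer per thread" pattern, assuming the input can be split evenly between workers. `RowSink`, `CountingSink` and `ParallelLoader` are hypothetical stand-ins so the example runs without a cluster. In a real client each worker would, as far as I understand the 0.20 API, own its own HTable with `setAutoFlush(false)` and a write buffer size, and call `flushCommits()` at the end, since a single HTable is not meant to be shared by threads that write concurrently.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical stand-in for a per-thread HTable with auto-flush turned off.
interface RowSink {
    void put(String row);   // buffer one row on the client side
    void flush();           // send everything buffered so far
}

// Toy in-memory sink: buffers rows and "sends" them in batches by counting them.
final class CountingSink implements RowSink {
    private final List<String> buffer = new ArrayList<>();
    private final int maxBuffered;
    private final AtomicLong delivered;

    CountingSink(int maxBuffered, AtomicLong delivered) {
        this.maxBuffered = maxBuffered;
        this.delivered = delivered;
    }

    public void put(String row) {
        buffer.add(row);
        if (buffer.size() >= maxBuffered) {
            flush();   // buffer is full, send a batch
        }
    }

    public void flush() {
        delivered.addAndGet(buffer.size());
        buffer.clear();
    }
}

final class ParallelLoader {

    // Splits rows round-robin across `clients` threads. Each thread creates and
    // owns its own sink, so no sink is ever shared between threads.
    static long load(List<String> rows, int clients, int maxBuffered) {
        AtomicLong delivered = new AtomicLong();
        ExecutorService pool = Executors.newFixedThreadPool(clients);
        try {
            List<Future<?>> tasks = new ArrayList<>();
            for (int c = 0; c < clients; c++) {
                final int offset = c;
                tasks.add(pool.submit(() -> {
                    RowSink sink = new CountingSink(maxBuffered, delivered);
                    for (int i = offset; i < rows.size(); i += clients) {
                        sink.put(rows.get(i));
                    }
                    sink.flush();   // final flush for the partially filled buffer
                }));
            }
            for (Future<?> task : tasks) {
                try {
                    task.get();     // wait for the worker and surface any failure
                } catch (Exception e) {
                    throw new RuntimeException(e);
                }
            }
        } finally {
            pool.shutdown();
        }
        return delivered.get();
    }

    public static void main(String[] args) {
        List<String> rows = new ArrayList<>();
        for (int i = 0; i < 1003; i++) {
            rows.add("row-" + i);
        }
        System.out.println("delivered: " + load(rows, 4, 10));
    }
}
```

The same split works across separate processes or machines: each one takes its own slice of the input and runs the same loading code.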

And finally, a possibly stupid question: how do I check the total number of regions?


Thanks. :)


On Aug 9, 2010, at 3:51 PM, Jean-Daniel Cryans wrote:

> HBASE-2066 was committed, and it takes effect automatically when you use
> the write buffer, starting with version 0.89. For example, this release contains it:
> http://hbase.apache.org/docs/r0.89.20100621/
> 
> Using more than one client basically means starting more of them, the same
> way you started the first one. Your input data can then be split
> between the clients, using either MapReduce or your homegrown solution
> (we imported the stumbles with an MR job).
> 
> J-D
> 
> On Mon, Aug 9, 2010 at 12:44 PM, Han Liu <hanl1@andrew.cmu.edu> wrote:
>> Thanks for your reply J-D!
>> 
>> Could you explain more about the HBASE-2066 scheme? For example, how did you do the
>> first 3 steps described on that page?
>> 
>> Also, is there any documentation that describes using multiple clients with HBase?
>> 
>> 
>> On Aug 9, 2010, at 2:37 PM, Jean-Daniel Cryans wrote:
>> 
>>> Those are pretty powerful machines, I would expect more performance. You
>>> could try using the same settings that we do here, check out Ryan's
>>> presentation, page 16:
>>> http://people.apache.org/~jdcryans/HUG8/HUG8-rawson.pdf
>>> 
>>> Google "IO wait" to learn about it.
>>> 
>>> Multi-clients will be faster unless you are already maxing out the
>>> machines (betting $100 you're not), it's like asking if doing parallel
>>> processing will be faster than sequential processing.
>>> 
>>> J-D
>>> 
>>> On Mon, Aug 9, 2010 at 11:21 AM, Han Liu <hanl1@andrew.cmu.edu> wrote:
>>>> Thanks for the reply J-D.
>>>> In-lined.  :)
>>>> 
>>>> 
>>>> On Aug 9, 2010, at 1:57 PM, Jean-Daniel Cryans wrote:
>>>> 
>>>>> Hard to tell if it's decent performance. How do you define "decent"?
>>>> I consider it decent if it is roughly the best performance one can get using
>>>> my schema on my machines.
>>>>> What kind of hardware are we talking about?
>>>> One machine for HBase master and 6 regionservers. Specs of each of these
>>>> machines:
>>>> 16 GB Ram
>>>> 4 1TB 7200RPM SATA Drives
>>>> 10 Gb Network: 1x Qlogic QLE3142-Cu-CK
>>>> CPU: 2x quad-core E5440 (2.83GHz, 12MB L2 cache, 1333 MHz FSB)
>>>>> Which version are you
>>>>> using?
>>>> 0.20.4
>>>>> How much memory was given to HBase?
>>>> 6 GB
>>>>> 
>>>>> Also did you set the write buffer on the client side on HTable?
>>>> Yes I set it to be "1024*1024*12" bytes
>>>>> Did
>>>>> you also turn off auto-flushing?
>>>> Yes it's turned off
>>>>> Do you monitor your cluster? If so,
>>>>> do you see lots of IO wait?
>>>> I didn't.. What do IO waits indicate?
>>>>> 
>>>>> And finally, do you use a single client or multiple ones?
>>>>> 
>>>> Single client. Will multiple clients boost performance?
>>>>> :)
>>>>> 
>>>> :) :)
>>>> 
>>>> Thanks a lot.
>>>>> J-D
>>>>> 
>>>>> On Mon, Aug 9, 2010 at 10:46 AM, Han Liu <hanl1@andrew.cmu.edu> wrote:
>>>>>> Hi Guys,
>>>>>> 
>>>>>> I know on the client side of HBase there's a configuration "hbase.client.write.buffer".
>>>>>> I wonder if there's a similar configuration on the region server side that I can tweak to
>>>>>> adjust performance?
>>>>>> 
>>>>>> Also, as of right now I have managed to insert 15 GB of data into a 6-regionserver
>>>>>> HBase database in roughly 26 minutes using the "table.put(Put p)" approach. Generally, is this
>>>>>> decent performance?
>>>>>> 
>>>>>> Any advice would be appreciated. Thanks a lot in advance.
>>>>>> --
>>>>>> Han Liu
>>>>>> SCS & HCI Institute
>>>>>> Undergrad. Class of 2012
>>>>>> Carnegie Mellon University
>>>>> 
>>>> 
>>> 
>> 
> 

--
Han Liu
SCS & HCI Institute
Undergrad. Class of 2012 
Carnegie Mellon University




