hbase-user mailing list archives

From Andrew Nguyen <andrew-lists-hb...@ucsfcti.org>
Subject Re: HBase minimum block size for sequential access
Date Tue, 27 Jul 2010 20:26:15 GMT
Never mind my last question.  I thought BLOCKSIZE was a table attribute, but it is
specific to a column family.
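
For reference, a minimal sketch of the column-family form of the command.
'familyname' is a hypothetical placeholder, and the disable/enable steps are
an assumption about what this HBase version requires before a schema change:

```
# Assumption: the table must be offline while its schema is altered.
disable 'tablename'
# BLOCKSIZE is set per column family, so the family is named explicitly.
alter 'tablename', {NAME => 'familyname', BLOCKSIZE => 1048576}
enable 'tablename'
# Check that the family now reports the new block size.
describe 'tablename'
# Rewrite existing store files with the new block size (per J-D's advice below).
major_compact 'tablename'
```

Existing data keeps the old block size until the major compaction has
rewritten the store files.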


On Jul 27, 2010, at 1:19 PM, Andrew Nguyen wrote:

> I just attempted to change the blocksize and it doesn't seem to be taking.
> I am doing the following in the shell:
> 
>> alter 'tablename', {METHOD=>'table_att', BLOCKSIZE=>1048576}
> 
> It returns without any errors but when I 'describe' the table, the setting
> is unchanged.  Is there another place for me to make such changes to the
> table?
> 
> Thanks!
> 
> 
> On Jul 27, 2010, at 10:39 AM, Jean-Daniel Cryans wrote:
> 
>> I would try using the smallest keys you can (row key, family name,
>> qualifier) as they are all stored with each value and you want to
>> retrieve millions of them very quickly. Also do the usual like LZO
>> compressing the families, and reading as few families as possible at
>> the same time (eg if your table has 3 families, you shouldn't read
>> more than 1 at a time).
>> 
>> J-D
>> 
>> On Tue, Jul 27, 2010 at 10:30 AM, Andrew Nguyen
>> <andrew-lists-hbase@ucsfcti.org> wrote:
>>> Perfect thanks, I will run some experiments and keep you posted.
>>> 
>>> Aside from just getting elapsed time on scans of various sizes, are there
>>> any other tips on what sorts of measurements to perform?  Also, since I'm
>>> doing the experiments with various block sizes anyways, any requests for
>>> other types of benchmarks?
>>> 
>>> Thanks,
>>> Andrew
>>> 
>>> On Jul 27, 2010, at 10:13 AM, Jean-Daniel Cryans wrote:
>>> 
>>>>> Thanks for the heads up.  Do you know what happens if I set this value
>>>>> larger than 5MB?  We will always be scanning the data, and always in
>>>>> large blocks.  I have yet to calculate the typical size of a single scan
>>>>> but imagine that it will usually be larger than 1MB.
>>>> 
>>>> I never tried that, hard to tell, but always eager to hear about
>>>> others' experiences :)
>>>> 
>>>>> 
>>>>> Also, is there any way to change the block size with data already in
>>>>> HBase?  Our current import process is very slow (preprocessing of the
>>>>> data) and we don't have the resources to store the preprocessed data.
>>>> 
>>>> After altering the table, issue a major compaction on it and
>>>> everything will be re-written with the new block size.
>>>> 
>>>> J-D
>>> 
>>> 
> 

