hbase-user mailing list archives

From "Billy Pearson" <sa...@pearsonwholesale.com>
Subject Re: Region server memory requirements
Date Mon, 22 Dec 2008 19:40:32 GMT
Hey guys, there is a variable in Hadoop that can help without having to
change the index interval: io.map.index.skip.
It can be changed to lower memory usage without having to wait until the
map files are compacted again, and you can change it as needed.
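
To give a rough feel for the effect, here is a minimal sketch of the arithmetic. It assumes the writer emits one index entry per io.map.index.interval keys and that the reader keeps roughly one of every (io.map.index.skip + 1) entries when it loads the index. The function names and the key count are made up for illustration, and the rounding is approximate.

```python
import math


def index_entries_written(num_keys: int, index_interval: int = 128) -> int:
    """Approximate number of entries in the on-disk MapFile index.

    Assumes one index entry is written per `index_interval` keys.
    """
    return math.ceil(num_keys / index_interval)


def index_entries_loaded(num_keys: int, index_interval: int = 128,
                         index_skip: int = 0) -> int:
    """Approximate number of index entries held in memory by a reader.

    Assumes the reader keeps about one of every (index_skip + 1) entries,
    so the on-disk index is untouched and only the loaded part shrinks.
    """
    written = index_entries_written(num_keys, index_interval)
    return math.ceil(written / (index_skip + 1))


if __name__ == "__main__":
    keys = 1_000_000  # hypothetical map file size, for illustration only
    for skip in (0, 3, 7):
        loaded = index_entries_loaded(keys, index_skip=skip)
        print(f"io.map.index.skip={skip}: about {loaded} index entries in memory")
```

The trade-off is that a sparser in-memory index means more keys to scan past on each lookup, so the skip value is a memory-versus-seek-cost knob rather than a free win.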

"stack" <stack@duboce.net> wrote in message 
news:494C6E27.7030105@duboce.net...
> Andrew Purtell wrote:
>> Based on this, leaving the Hadoop default (128) might be the
>> way to go.
>
> Sounds good.  I made HBASE-1070 to do above for TRUNK and branch.
>
>> Later, maybe it would make sense to dynamically set the index
>> interval based on the distribution of cell sizes in the mapfile
>> at some future time, according to some parameterized formula
>> that could be adjusted with config variable(s). This
>> could be done during compaction. Would make sense to also
>> consider the distribution of key lengths. Or there could be
>> other similar tricks implemented to keep index sizes down.
>
> Made HBASE-1071.  We should be able to do it at flush time, not just
> when compacting, since we have a count of keys and could keep a running
> tally, on memcache insert, of notable attributes of the keys, so we have
> these to plug in to the formula at flush time.
>
>> In my opinion, for 0.20.0, MapFile should be brought local so
>> we can begin hacking it over time into a new file format. I
>> was thinking that designing and/or implementing a wholly new
>> format such as TFile would block I/O improvements for a long
>> time.
>>
> I don't know.  This would be the safer tack for sure, but let's at least
> keep open the possibility of our moving to a completely new file format
> in the 0.20.0 timeframe.
>
> St.Ack
> 


