hbase-user mailing list archives

From Doug Meil <doug.m...@explorysmedical.com>
Subject Re: HBase Region move() and Data Locality
Date Mon, 05 Mar 2012 16:40:33 GMT

This doesn't address your question on move(), but regarding locality, see
8.7.3 in here...


.. it's not just major compactions: any write of a storefile affects
locality (flush, minor compaction, major compaction).
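
To illustrate the point, here is a toy model in Java. It is not HBase code and uses no HBase API: the class LocalitySketch, its StoreFile type, and every method name are made up for this sketch. It assumes that a newly written storefile always gets a replica on the local DataNode, and that after a move none of the old files happen to have a replica on the new host (in a real cluster some might).

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only: tracks, per storefile, its size and whether the hosting
// region server has a local HDFS replica of it.
final class LocalitySketch {

    static final class StoreFile {
        final long bytes;
        final boolean local;

        StoreFile(long bytes, boolean local) {
            this.bytes = bytes;
            this.local = local;
        }
    }

    // Fraction of the region's storefile bytes that can be read locally.
    static double locality(List<StoreFile> files) {
        long total = 0;
        long local = 0;
        for (StoreFile f : files) {
            total += f.bytes;
            if (f.local) {
                local += f.bytes;
            }
        }
        return total == 0 ? 1.0 : (double) local / total;
    }

    // Assumption: moving the region leaves the files where they are in HDFS,
    // so from the new host's point of view every existing file is remote.
    static List<StoreFile> afterMove(List<StoreFile> files) {
        List<StoreFile> out = new ArrayList<>();
        for (StoreFile f : files) {
            out.add(new StoreFile(f.bytes, false));
        }
        return out;
    }

    // A flush writes one new storefile, modeled here as local.
    static List<StoreFile> afterFlush(List<StoreFile> files, long bytes) {
        List<StoreFile> out = new ArrayList<>(files);
        out.add(new StoreFile(bytes, true));
        return out;
    }

    // A compaction rewrites files [from, to) into one new local storefile.
    // Selecting every file models a major compaction, a subset a minor one.
    static List<StoreFile> afterCompaction(List<StoreFile> files, int from, int to) {
        List<StoreFile> out = new ArrayList<>(files.subList(0, from));
        long merged = 0;
        for (StoreFile f : files.subList(from, to)) {
            merged += f.bytes;
        }
        out.add(new StoreFile(merged, true));
        out.addAll(files.subList(to, files.size()));
        return out;
    }
}
```

In this model a region with two files of 100 and 300 bytes drops to locality 0 right after a move, reaches 0.2 after a 100-byte flush, 0.8 after a minor compaction of the last two files, and 1.0 after a compaction of everything.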

On 3/5/12 11:02 AM, "Bryan Beaudreault" <bbeaudreault@hubspot.com> wrote:

>Hey all,
>We are running on cdh3u2 (soon to upgrade to 3u3), and we notice that
>regions are balanced solely based on the number of regions per region
>server, with no regard for horizontal scaling of tables.  This was mostly
>fine with a small number of regions, but as our cluster reaches thousands
>of regions we are often finding an entire table (or a large part of one) on a
>single region server.  This seems suboptimal.
>We were looking into options for this, and noticed that it is fixed in
>(possibly 0.92?), but we want to stick with CDH for now.  With that in
>mind, we needed alternatives, and found the HBaseAdmin move(byte[],
>byte[]).  The documentation doesn't mention it, but I'm wondering if using
>this function ruins locality.  Without the locality problem, I was thinking
>of creating a utility that allowed us to scramble the regions of a table and
>then call balance(), which would hopefully result in a better spread of
>regions for a table.  However, I don't want to ruin our performance by
>ruining the locality.
>The HBase book mentions that locality is achieved through major
>compactions.  If I have the opportunity to take some downtime, would it be
>feasible to scramble all of the regions, run balance() to make sure all
>regionservers have about the same number, then a major compaction to fix
