hbase-user mailing list archives

From Matt Wheeler <matt.whee...@explorysmedical.com>
Subject createTable with specified region splits: works great
Date Tue, 15 Feb 2011 18:52:40 GMT
Pre-creating regions with the byte[][] overload of createTable roughly doubled the performance
of our main index table generation.  Our keys start with hashes of the original record IDs,
so the data is distributed evenly across all regions.  The keys are ASCII strings that begin
with the hash value in hexadecimal, so we specify the split keys as zero-padded ASCII strings
of equal length.
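
A minimal sketch of how such split keys could be generated, for illustration only. The class
and method names (SplitKeys, hexSplitKeys) and the width parameter are hypothetical, and the
lowercase hex digits are an assumption that has to match whatever the real row keys use. The
resulting array is the kind of value the byte[][] overload of createTable takes as its split
keys.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper: builds evenly spaced, zero-padded hex split keys.
class SplitKeys {

    // Returns regionCount - 1 split keys, each exactly `width` ASCII hex digits,
    // dividing the key space [0, 16^width) into regionCount equal ranges.
    static byte[][] hexSplitKeys(int regionCount, int width) {
        if (width < 1 || width > 8) {
            throw new IllegalArgumentException("width must be between 1 and 8");
        }
        long keySpace = 1L << (4 * width);   // 16^width distinct prefixes
        if (regionCount < 1 || regionCount > keySpace) {
            throw new IllegalArgumentException("regionCount out of range for width");
        }
        byte[][] splits = new byte[regionCount - 1][];
        for (int i = 1; i < regionCount; i++) {
            long boundary = i * keySpace / regionCount;
            // Zero-pad so every split key has the same length as the hash prefix.
            String key = String.format("%0" + width + "x", boundary);
            splits[i - 1] = key.getBytes(StandardCharsets.US_ASCII);
        }
        return splits;
    }
}
```

For example, hexSplitKeys(4, 2) yields the three keys "40", "80" and "c0", which bound four
regions. Because the row keys start with a uniformly distributed hash, equal-width ranges
should receive roughly equal amounts of data.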

We try to select an initial region count that avoids any region splits during the index
MR job without creating more regions than the table needs.  Performance suffered when we
created the table with about 3 times more regions than necessary.
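
One way to pick a starting region count can be sketched as below. This is an assumption for
illustration, not the procedure described above: it divides an estimated final table size by
the region split threshold (the hbase.hregion.max.filesize setting), leaving some headroom.
The names RegionCount, estimate and fillFraction are hypothetical.

```java
// Hypothetical sizing helper: estimates how many regions to pre-create.
class RegionCount {

    // expectedTableBytes: rough estimate of the table size after the job
    // maxRegionBytes:     split threshold (assumed to mirror hbase.hregion.max.filesize)
    // fillFraction:       target fill of each region, in (0, 1], to leave headroom
    static int estimate(long expectedTableBytes, long maxRegionBytes, double fillFraction) {
        if (expectedTableBytes < 0 || maxRegionBytes <= 0
                || fillFraction <= 0.0 || fillFraction > 1.0) {
            throw new IllegalArgumentException("invalid sizing parameters");
        }
        double usableBytesPerRegion = maxRegionBytes * fillFraction;
        int regions = (int) Math.ceil(expectedTableBytes / usableBytesPerRegion);
        return Math.max(1, regions);   // a table always has at least one region
    }
}
```

A fillFraction near 1.0 keeps the region count low but leaves little room before a split is
triggered; a much smaller value drifts toward the over-provisioned case that hurt performance.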

- matt
