hbase-user mailing list archives

From Henning Blohm <henning.bl...@zfabrik.de>
Subject parallel scanning?
Date Mon, 25 Jan 2016 18:29:32 GMT

I am looking for advice on an HBase mass data access optimization problem.

In our application all data records stored in HBase have a time 
dimension (as inverted time) and a GUID in the row key. Retrieving a 
record requires issuing a scan with the GUID as prefix.
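
For illustration, here is a minimal sketch of such a key layout and prefix lookup. It does not use the HBase client API: a sorted in-memory TreeMap stands in for the table, and the key format (GUID, a "|" separator, then a zero-padded inverted timestamp) as well as the names PrefixScanSketch, rowKey and scanPrefix are assumptions made for the example, not the actual schema.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

final class PrefixScanSketch {
    // Stand-in for a table: rows are kept sorted by row key, as in HBase.
    private final TreeMap<String, String> rows = new TreeMap<>();

    // Assumed layout: GUID first, then the inverted time, so that all
    // versions of a record are adjacent and the newest one sorts first.
    // Assumes a non-negative timestamp.
    static String rowKey(String guid, long timestampMillis) {
        long inverted = Long.MAX_VALUE - timestampMillis;
        // Zero-pad to 19 digits so string order matches numeric order.
        return guid + "|" + String.format("%019d", inverted);
    }

    void put(String guid, long timestampMillis, String value) {
        rows.put(rowKey(guid, timestampMillis), value);
    }

    // Prefix "scan": start at the first key >= prefix and stop as soon as
    // a key no longer starts with the prefix or the limit is reached.
    List<String> scanPrefix(String guid, int limit) {
        String prefix = guid + "|";
        List<String> out = new ArrayList<>();
        for (Map.Entry<String, String> e : rows.tailMap(prefix, true).entrySet()) {
            if (out.size() >= limit || !e.getKey().startsWith(prefix)) {
                break;
            }
            out.add(e.getValue());
        }
        return out;
    }
}
```

With this layout, fetching the current version of a record is a prefix scan limited to one row, which is why every single lookup costs a scan rather than a Get.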

To get to an entry (there are various access paths) we use a simple 
secondary index that also has a time dimension in the row key and so 
needs a scan as well.

For mass updates I am currently seeking ways to improve lookup performance.

I found various discussions and issues on multi-scans (as in multi-Get, 
multi-Delete) but none of them was really helpful in sorting out the most 
promising direction.

Currently I am experimenting with simply parallelizing lookups in chunks 
from the client. That reduces elapsed wait time a bit. It seems, though, 
that avoiding roundtrips altogether by "scanning in parallel 
server-side" should show much better improvements.
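
The client-side chunking described above can be sketched roughly as follows. This is only an illustration using the standard java.util.concurrent classes: the lookup function is a hypothetical stand-in for the per-GUID scan, and the names ChunkedLookupSketch and lookupAll, the chunk size and the thread count are assumptions, not part of any HBase API.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Function;

final class ChunkedLookupSketch {
    // Splits the GUIDs into chunks and resolves each chunk on its own
    // worker thread. "lookup" is a placeholder for one prefix scan.
    static <V> Map<String, V> lookupAll(List<String> guids, int chunkSize,
                                        int threads, Function<String, V> lookup) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Map<String, V>>> futures = new ArrayList<>();
            for (int from = 0; from < guids.size(); from += chunkSize) {
                int to = Math.min(from + chunkSize, guids.size());
                List<String> chunk = guids.subList(from, to);
                futures.add(pool.submit(() -> {
                    Map<String, V> part = new LinkedHashMap<>();
                    for (String guid : chunk) {
                        part.put(guid, lookup.apply(guid));
                    }
                    return part;
                }));
            }
            // Merge in submission order so results follow the input order.
            Map<String, V> result = new LinkedHashMap<>();
            for (Future<Map<String, V>> f : futures) {
                result.putAll(f.get());
            }
            return result;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for lookups", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("lookup failed", e.getCause());
        } finally {
            pool.shutdown();
        }
    }
}
```

Each lookup inside a chunk is still a separate roundtrip in this scheme; the threads only overlap the waiting, which fits the modest gain observed.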

Is there anything like that already available that I should look into?


Henning Blohm

*ZFabrik Software GmbH & Co. KG*

T: 	+49 6227 3984255
F: 	+49 6227 3984254
M: 	+49 1781891820

Lammstrasse 2 69190 Walldorf

henning.blohm@zfabrik.de <mailto:henning.blohm@zfabrik.de>
Linkedin <http://www.linkedin.com/pub/henning-blohm/0/7b5/628>
ZFabrik <http://www.zfabrik.de>
Blog <http://www.z2-environment.net/blog>
Z2-Environment <http://www.z2-environment.eu>
Z2 Wiki <http://redmine.z2-environment.net>
