hbase-user mailing list archives

From Udo Offermann <udo.offerm...@zfabrik.de>
Subject incomplete hbase exports due to scan timeouts
Date Mon, 26 Aug 2019 09:04:17 GMT
Hi everybody, 

We are running 6 data nodes (plus one master node; HBase 1.0.0-cdh5.6.0) in both a production
and a test environment. Each month we export the deltas of the previous month from the production
system (using org.apache.hadoop.hbase.mapreduce.Export) and import them into the test system.
From time to time we run RowCounter and an analytics map-reduce job we wrote ourselves
to check that the restore is complete.
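For context, the monthly export/import and the subsequent check use the standard HBase map-reduce tools roughly as follows (the table name, HDFS paths, and the timestamp range below are placeholders, not our actual values):

```sh
# Export only the deltas of the previous month from the production cluster
# (versions=1, starttime/endtime as epoch milliseconds).
hbase org.apache.hadoop.hbase.mapreduce.Export \
  my_table /backup/my_table_2019-07 1 1561932000000 1564610400000

# Import the resulting sequence files into the test cluster.
hbase org.apache.hadoop.hbase.mapreduce.Import \
  my_table /backup/my_table_2019-07

# Spot-check the restore by counting rows.
hbase org.apache.hadoop.hbase.mapreduce.RowCounter my_table
```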

Now we see that the Export/Import has been broken since April 2019. After a lot of investigation
and testing we found that the bug described in https://github.com/hortonworks-spark/shc/issues/174
causes the problems.

After increasing the timeouts (client scanner and RPC timeout) from 1 minute to 10 minutes, the row
counts in the test system look good again (we counted the rows for one month both via RowCounter
and via a scan in the hbase shell).
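Concretely, the two settings we raised are the standard HBase 1.x client-side timeouts in hbase-site.xml (the 10-minute value below is just the number we chose, not a recommendation):

```xml
<!-- hbase-site.xml: raise scanner lease and RPC timeouts to 10 minutes -->
<property>
  <name>hbase.client.scanner.timeout.period</name>
  <value>600000</value> <!-- ms; how long a scanner lease may stay idle -->
</property>
<property>
  <name>hbase.rpc.timeout</name>
  <value>600000</value> <!-- ms; timeout for a single client RPC -->
</property>
```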

Now we are about to roll out the same change on the production system.

But the question remains: what causes the long timeouts? Some of our tests revealed
scanner timeouts after 60 seconds (the default setting). But 60 seconds is nearly an eternity
for a machine, so we assume that something is wrong. How can we find out what?
The HBase locality factor is 1.0, or close to 1.0, for most of the regions.

My questions are: Is it possible that "silent timeouts" cause incomplete exports?

Is it usual for scans to take longer than 1 minute, even though the exports apparently were
all fine up to April?
How can one identify regions that are in trouble?

Thank you and best regards
