hbase-user mailing list archives

From Ondřej Stašek <ondrej.sta...@firma.seznam.cz>
Subject Problems with scan after lot of Puts
Date Tue, 29 May 2012 13:13:03 GMT
My program writes changes to an HBase table by issuing a large number of Puts 
(autoFlush turned off, commits flushed at the end) and afterwards uses a 
ResultScanner over the whole table to read all rows and act upon them. My 
problem is that on several occasions the scan did not return the expected 
rows. Either the scan did not start at the beginning of the table, or 
somewhere during the scan I got old data (not the values written by the 
preceding Puts).

I have even written a simple test application to simulate this behavior:
1. write 1M simple numbered rows to a table
2. scan through table to test output, delete every 10th row
3. scan again after delete
4. repeat until error found
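
The loop above can be sketched roughly as follows. This is not the actual 
TestPutScan code: it is a minimal stand-in that replaces the HBase table 
with an in-memory TreeMap so that it runs without a cluster. The class name 
PutScanSketch, the helper names, and the "value <row> <run>" layout (guessed 
from the log output) are all assumptions made for illustration.

```java
import java.util.Iterator;
import java.util.Map;
import java.util.TreeMap;

public class PutScanSketch {
    // Zero-padded keys, so lexicographic order matches numeric order.
    static String rowKey(int i) { return String.format("%07d", i); }

    // Assumed value layout: "value <row number> <run number>".
    static String value(int i, int run) {
        return String.format("value %07d %07d", i, run);
    }

    // Step 1: write rows 1..n for this run (TreeMap stands in for the table).
    static void putAll(TreeMap<String, String> table, int n, int run) {
        for (int i = 1; i <= n; i++) {
            table.put(rowKey(i), value(i, run));
        }
    }

    // Steps 2 and 3: scan in key order and compare each row with the
    // expected one. rowsDeleted: every 10th row is expected to be missing.
    // deleteEveryTenth: remove every 10th row while scanning.
    // Returns null when the scan matches, otherwise an error message.
    static String scanAndCheck(TreeMap<String, String> table, int n, int run,
                               boolean rowsDeleted, boolean deleteEveryTenth) {
        Iterator<Map.Entry<String, String>> it = table.entrySet().iterator();
        for (int i = 1; i <= n; i++) {
            if (rowsDeleted && i % 10 == 0) {
                continue; // this row was deleted by the previous scan
            }
            String expected = value(i, run);
            if (!it.hasNext()) {
                return "Expected value: " + expected + ", got: end of scan";
            }
            String got = it.next().getValue();
            if (!expected.equals(got)) {
                return "Expected value: " + expected + ", got: " + got;
            }
            if (deleteEveryTenth && i % 10 == 0) {
                it.remove();
            }
        }
        return it.hasNext() ? "Unexpected extra row: " + it.next().getKey() : null;
    }

    // One run: put, scan + delete every 10th row, scan again.
    static String runOnce(TreeMap<String, String> table, int n, int run) {
        putAll(table, n, run);
        String err = scanAndCheck(table, n, run, false, true);
        if (err != null) {
            return err;
        }
        return scanAndCheck(table, n, run, true, false);
    }

    public static void main(String[] args) {
        TreeMap<String, String> table = new TreeMap<>();
        // Step 4: repeat until an error is found (capped at 3 runs here).
        for (int run = 1; run <= 3; run++) {
            String err = runOnce(table, 1000, run);
            System.out.println("Run " + run + ": " + (err == null ? "ok" : err));
            if (err != null) {
                break;
            }
        }
    }
}
```

With an in-memory map every run passes; against the real table the same 
comparison is what reports the mismatch shown in the sample output.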

Sample output:

12/05/29 00:32:12 INFO hbase.TestPutScan: Run 342 put 1000000 rows
12/05/29 00:32:35 INFO hbase.TestPutScan: Run 342 scan + del every 10th row
12/05/29 00:33:29 INFO hbase.TestPutScan: Run 342 scan
12/05/29 00:33:29 ERROR hbase.TestPutScan: Expected value: value 0000001 
0000342, got: value 0281999 0000342

This means that the program expected to get the first row but got the 281999th.

This test ran on a "minicluster" of 2 regionservers running Cloudera's 
cdh3u4 distribution.

Today I got 3 errors like that, and from the regionserver's log it seems 
that at the same time the HBase balancer issued a reassign command for this 
table's region (the table has only 1 region).

Any pointers on what to check, or what to send you to help resolve this?


Ondrej Stasek
