Hi!
I am using HBase 0.94.1 on a distributed cluster of 20 nodes.
When I run count on a table in the HBase shell, I get a count of
2152416 rows.
When I count the same table with the RowCounter MapReduce job, I get the
value below:
org.apache.hadoop.hbase.mapreduce.RowCounter$RowCounterMapper$Counters
13/02/10 00:05:06 INFO mapred.JobClient: ROWS=1389991
The same thing happens when I use Pig to count rows or run other operations
on the table. The two results are inconsistent.
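To put the gap in numbers, here is a quick sketch of the arithmetic on the two counts quoted above. The variable names are mine and purely illustrative; only the two numbers come from the actual runs.

```python
# Counts copied from the post; variable names are illustrative only.
shell_count = 2152416       # result of `count` in the HBase shell
rowcounter_count = 1389991  # ROWS counter reported by the RowCounter job

# Rows seen by the shell count but not by the MapReduce job.
missing = shell_count - rowcounter_count
missing_pct = 100.0 * missing / shell_count

print(f"rows missing from the MapReduce count: {missing}")
print(f"share of the shell count: {missing_pct:.1f}%")
```

So the MapReduce job is missing roughly a third of the rows the shell reports, which looks like far more than a rounding or timing difference.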
During the MapReduce job, I noticed that 5 tasks were killed. When I traced
them back to the TaskTracker logs on the node, I found entries similar to
the ones below.
2013-02-09_23:58:58.40665 13/02/09 23:58:58 INFO mapred.TaskTracker: JVM with ID: jvm_201302090035_0015_m_1905604998 given task: attempt_201302090035_0015_m_000012_1
2013-02-09_23:59:03.57016 13/02/09 23:59:03 INFO mapred.TaskTracker: Received KillTaskAction for task: attempt_201302090035_0015_m_000012_1
2013-02-09_23:59:03.57034 13/02/09 23:59:03 INFO mapred.TaskTracker: About to purge task: attempt_201302090035_0015_m_000012_1
2013-02-09_23:59:03.61003 13/02/09 23:59:03 INFO util.ProcessTree: Killing process group9745 with signal TERM. Exit code 0
I have also run the 'hbck' tool, but it reports no inconsistencies.
Can you please suggest why the counts differ and how I can correct
this?
Thanks,
--
Kiran Chitturi