hbase-user mailing list archives

From Saad Mufti <saad.mu...@gmail.com>
Subject Re: Re:Got Duplicate Records for the Same Row Key from a Snapshot
Date Tue, 22 May 2018 17:58:57 GMT
I am not clear on how your snapshot even succeeds if this is the case. The
snapshot procedure includes a consistency check at the end and throws an
exception on problems like this. I would run hbck on your table to check for
consistency errors. It also has repair options, but you have to be careful
with those. Running it in check-only mode doesn't change anything and will
give you useful feedback.
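
To make the suspected failure mode concrete, here is a minimal Python sketch
of the kind of key-range overlap a consistency check looks for. It is not
hbck's implementation and does not talk to HBase; the Region class, the
region names and the key values are all hypothetical. It only models the
situation described in the quoted message: an offline parent region that
still covers the same key range as its online daughters. (The read-only
check itself would be something like `hbase hbck <table>` with no repair
options; the table name is a placeholder.)

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Region:
    """Hypothetical stand-in for a region's metadata (not an HBase class)."""
    name: str
    start_key: bytes        # inclusive; b"" means start of table
    end_key: bytes          # exclusive; b"" means end of table
    offline: bool = False


def _ends_later(a: Region, b: Region) -> bool:
    """True if region a extends further right than region b."""
    if b.end_key == b"":
        return False        # b already runs to the end of the table
    return a.end_key == b"" or a.end_key > b.end_key


def find_overlaps(regions: List[Region]) -> List[Tuple[str, str]]:
    """Return pairs of region names whose key ranges overlap."""
    overlaps = []
    widest = None           # region with the furthest end key seen so far
    for r in sorted(regions, key=lambda reg: reg.start_key):
        if widest is not None and (
            widest.end_key == b"" or r.start_key < widest.end_key
        ):
            overlaps.append((widest.name, r.name))
        if widest is None or _ends_later(r, widest):
            widest = r
    return overlaps


def regions_containing(row: bytes, regions: List[Region]) -> List[str]:
    """Names of all regions whose key range contains the given row key."""
    return [
        r.name
        for r in regions
        if r.start_key <= row and (r.end_key == b"" or row < r.end_key)
    ]


# Assumed layout: one offline parent plus the two daughters that replaced it.
regions = [
    Region("parent", b"", b"", offline=True),
    Region("daughter_a", b"", b"m"),
    Region("daughter_b", b"m", b""),
]
online_only = [r for r in regions if not r.offline]

print(find_overlaps(regions))               # overlaps when offline regions count
print(find_overlaps(online_only))           # none once they are filtered out
print(regions_containing(b"k", regions))    # row is served by two regions
```

If every region in the list is scanned, including the offline one, a row key
that falls in the overlapping range is read once per region, which would match
the duplicates you are seeing.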

Hope this helps.

----
Saad


On Fri, May 18, 2018 at 3:56 AM, shanghaihyj <shanghaihyj@163.com> wrote:

> We find that the metadata of offline regions is included in the snapshot.
>
>
> When we query a table, offline regions are not considered.
> When we query a snapshot of this table, offline regions are included.
> These offline regions refer to the same data in HDFS.  That is why
> duplicate records are returned from the snapshot.
>
>
> Any suggestions on how to handle this gracefully?
>
>
>
> At 2018-05-17 19:04:17, "shanghaihyj" <shanghaihyj@163.com> wrote:
> >We are loading data from the HBase table or its snapshot with hbase-rdd
> >(https://github.com/unicredit/hbase-rdd). It uses TableInputFormat /
> >TableSnapshotInputFormat as the underlying input format.
> >The scanner has max versions set to 1.
> >
> >
> >
> >At 2018-05-17 15:35:08, "shanghaihyj" <shanghaihyj@163.com> wrote:
> >
> >When we query a table by a particular row key, HBase returns only one
> >row, which is expected.
> >However, when we query a snapshot of that same table by the same row
> >key, five duplicate rows are returned. Why?
> >
> >
> >
> >
> >In the log of the master server, we see some snapshot-related error:
> >===================== ERROR START =====================
> >ERROR [master:sh-bs-3-b8-namenode-17-208:60000.archivedHFileCleaner] snapshot.SnapshotHFileCleaner: Exception while checking if files were valid, keeping them just in case.
> >./hbase-root-master-sh-bs-3-b8-namenode-17-208.log.7:org.apache.hadoop.hbase.snapshot.CorruptedSnapshotException: Couldn't read snapshot info from:hdfs://master1.hh:8020/hbase/.hbase-snapshot/.tmp/hb_anchor_original_total_7days_stat_1526423587063/.snapshotinfo
> >./hbase-root-master-sh-bs-3-b8-namenode-17-208.log.7-   at org.apache.hadoop.hbase.snapshot.SnapshotDescriptionUtils.readSnapshotInfo(SnapshotDescriptionUtils.java:325)
> >./hbase-root-master-sh-bs-3-b8-namenode-17-208.log.7-   at org.apache.hadoop.hbase.snapshot.SnapshotReferenceUtil.getHFileNames(SnapshotReferenceUtil.java:328)
> >./hbase-root-master-sh-bs-3-b8-namenode-17-208.log.7-   at org.apache.hadoop.hbase.master.snapshot.SnapshotHFileCleaner$1.filesUnderSnapshot(SnapshotHFileCleaner.java:85)
> >./hbase-root-master-sh-bs-3-b8-namenode-17-208.log.7-   at org.apache.hadoop.hbase.master.snapshot.SnapshotFileCache.getSnapshotsInProgress(SnapshotFileCache.java:303)
> >./hbase-root-master-sh-bs-3-b8-namenode-17-208.log.7-   at org.apache.hadoop.hbase.master.snapshot.SnapshotFileCache.getUnreferencedFiles(SnapshotFileCache.java:194)
> >./hbase-root-master-sh-bs-3-b8-namenode-17-208.log.7-   at org.apache.hadoop.hbase.master.snapshot.SnapshotHFileCleaner.getDeletableFiles(SnapshotHFileCleaner.java:62)
> >./hbase-root-master-sh-bs-3-b8-namenode-17-208.log.7-   at org.apache.hadoop.hbase.master.cleaner.CleanerChore.checkAndDeleteFiles(CleanerChore.java:233)
> >./hbase-root-master-sh-bs-3-b8-namenode-17-208.log.7-   at org.apache.hadoop.hbase.master.cleaner.CleanerChore.checkAndDeleteEntries(CleanerChore.java:157)
> >...
> >===================== ERROR END =====================
> >And we found a related issue for this error:
> >https://issues.apache.org/jira/browse/HBASE-16464?attachmentSortBy=fileName
> >
> >
> >However, there is no proof that the error in the log is related to our
> problem of having duplicate records from a snapshot.
> >Our HBase version is 0.98.18-hadoop2.
> >
> >
> >Could you give us a hint as to why we are getting duplicate records from
> >the snapshot?
