hbase-user mailing list archives

From Jungdae Kim <kjd9...@gmail.com>
Subject Re: Replication current progress at position -1
Date Fri, 08 Nov 2019 03:01:11 GMT
Hello, Alexander

HBase 1.4.x has some issues with updating the position of WALs being
replicated.
One of them is that old WALs pile up when a region server hosts no
regions of the replicated table, or when no mutations arrive for a while.

I'm not sure from your logs whether you have the same issue.
If you do, you will find many old WALs in the HDFS oldWALs directory
({hbase.rootdir}/oldWALs) and in the ZooKeeper replication queues
({znodeParent}/replication/rs/{rs}/{peer}/). You can work around the
issue by assigning a region of a replicated table to that region server.
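
As a rough sketch of where to look, the small Python helper below builds
the two locations for one region server and peer. The function name is
hypothetical, and the example values (root dir, server name, peer id) are
taken from your log lines; the znode parent "/hbase" is only the default
value of zookeeper.znode.parent and may differ on your cluster.

```python
import posixpath


def replication_check_paths(hbase_rootdir, znode_parent, region_server, peer_id):
    """Return (oldWALs directory, replication queue znode) for one peer.

    Hypothetical helper: it only formats the two paths, it does not
    contact HDFS or ZooKeeper.
    """
    # {hbase.rootdir}/oldWALs
    old_wals_dir = hbase_rootdir.rstrip("/") + "/oldWALs"
    # {znodeParent}/replication/rs/{rs}/{peer}
    queue_znode = posixpath.join(
        znode_parent, "replication", "rs", region_server, peer_id
    )
    return old_wals_dir, queue_znode


if __name__ == "__main__":
    old_wals, queue = replication_check_paths(
        "hdfs://prodfashion01/hbase",                  # assumed hbase.rootdir
        "/hbase",                                      # assumed znode parent (default)
        "hbase07.prod.hbcluster,60020,1570464952456",  # server name from the logs
        "lp_analytics",                                # peer id from the status output
    )
    print(old_wals)
    print(queue)
```

You can then list the first path with "hdfs dfs -ls" and the second one
with "ls" inside "hbase zkcli". A long and growing list in either place
would be consistent with the issue described above.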

This issue has already been reported (
https://issues.apache.org/jira/browse/HBASE-22784) and resolved.
Unfortunately, the patch introduced other issues, such as region
servers aborting (https://issues.apache.org/jira/browse/HBASE-23169).

I'm working on these issues in
https://issues.apache.org/jira/browse/HBASE-23205 (not merged yet)

I hope this will be helpful to you.

On Thu, Nov 7, 2019 at 12:30 AM Alexander Batyrshin <0x62ash@gmail.com>
wrote:

>  Hello all,
> Sometimes we observe that replication is not working on HBase-1.4.10
>
>     hbase07.prod.hbcluster:
>        SOURCE: PeerID=lp_analytics, AgeOfLastShippedOp=0,
> SizeOfLogQueue=1, TimeStampsOfLastShippedOp=Thu Jan 01 03:00:00 MSK 1970,
> Replication Lag=1573052815347
>        SINK  : AgeOfLastAppliedOp=0, TimeStampsOfLastAppliedOp=Mon Oct 07
> 19:15:54 MSK 2019
>
> At logs:
>
> 2019-11-06 18:10:54,252 INFO  [hbase07:60020Replication Statistics #0]
> regionserver.Replication: Normal source for cluster lp_analytics: Total
> replicated edits: 0, current progress:
> walGroup [hbase07.prod.hbcluster%2C60020%2C1570464952456]: currently
> replicating from:
> hdfs://prodfashion01/hbase/WALs/hbase07.prod.hbcluster,60020,1570464952456/hbase07.prod.hbcluster%2C60020%2C1570464952456.1573051524020
> at position: -1
> 2019-11-06 18:15:54,252 INFO  [hbase07:60020Replication Statistics #0]
> regionserver.Replication: Normal source for cluster lp_analytics: Total
> replicated edits: 0, current progress:
> walGroup [hbase07.prod.hbcluster%2C60020%2C1570464952456]: currently
> replicating from:
> hdfs://prodfashion01/hbase/WALs/hbase07.prod.hbcluster,60020,1570464952456/hbase07.prod.hbcluster%2C60020%2C1570464952456.1573051524020
> at position: -1
>
> I can’t find any errors or anything else that could help me diagnose why
> replication is not working on this node. On other nodes replication works
> like a charm.
> Any ideas what is wrong?
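
To spot this pattern across many region server logs, a minimal Python
sketch like the one below could scan the "Replication Statistics" lines
and report every WAL that shows position -1 in at least two reports. The
function name and the threshold are assumptions, and the regular
expression is written only against the log format quoted above. A
repeated -1 for the same WAL suggests that no progress was recorded; on
its own it does not prove the source is stuck.

```python
import re
from collections import defaultdict

# Matches one "walGroup [...]: currently replicating from: <wal> at position: N"
# entry, as it appears in the quoted Replication Statistics log lines.
ENTRY = re.compile(
    r"walGroup\s+\[(?P<group>[^\]]+)\]:\s+currently\s+replicating\s+from:\s+"
    r"(?P<wal>\S+)\s+at\s+position:\s+(?P<pos>-?\d+)"
)


def stuck_wal_groups(log_text, min_reports=2):
    """Return (walGroup, wal) pairs seen at position -1 in >= min_reports reports."""
    # Drop mail quote markers and rejoin wrapped lines into one string.
    cleaned = " ".join(line.lstrip("> ").strip() for line in log_text.splitlines())
    counts = defaultdict(int)
    for match in ENTRY.finditer(cleaned):
        if int(match.group("pos")) == -1:
            counts[(match.group("group"), match.group("wal"))] += 1
    return [key for key, seen in counts.items() if seen >= min_reports]


if __name__ == "__main__":
    import sys

    for group, wal in stuck_wal_groups(sys.stdin.read()):
        print(f"{group}: no recorded progress on {wal}")
```

Fed the two log reports quoted above, it would report the single WAL
group on hbase07.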
