hbase-user mailing list archives

From Wellington Chevreuil <wellington.chevre...@gmail.com>
Subject Re: TableSnapshotInputFormat failing to delete files under recovered.edits
Date Tue, 18 Jun 2019 17:10:42 GMT
Thanks for clarifying. So, given the region had already been open for a while,
I guess those were just empty recovered.edits dirs under the region dir, and
my previous assumption does not really apply here. I also checked
further into TableSnapshotInputFormat, and realised it actually performs a
copy of the table dir to a temporary *restoreDir* that should be passed as a
parameter to the *TableSnapshotInputFormat.setInput* initialisation method:

https://github.com/apache/hbase/blob/branch-1.4/hbase-server/src/main/java/org/apache/hadoop/hbase/mapreduce/TableSnapshotInputFormat.java#L212

Note the method comment on this *restoreDir* param:

> *restoreDir a temporary directory to restore the snapshot into. Current
> user should have write permissions to this directory, and this should
> not be a subdirectory of rootdir. After the job is finished, restoreDir
> can be deleted.*

Here's the point where snapshot data gets copied to *restoreDir*:

https://github.com/apache/hbase/blob/branch-1.4/hbase-server/src/main/java/org/apache/hadoop/hbase/mapreduce/TableSnapshotInputFormatImpl.java#L509
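For reference, a minimal sketch of wiring that up (the snapshot name, restore path, and surrounding job setup are hypothetical; it needs an HBase 1.x client classpath and a live cluster to actually run):

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.mapreduce.TableSnapshotInputFormat;
import org.apache.hadoop.mapreduce.Job;

public class SnapshotReadJobSetup {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    Job job = Job.getInstance(conf, "read-from-snapshot");
    // Per the javadoc: writable by the current user, NOT under hbase.rootdir,
    // and safe to delete once the job is done.
    Path restoreDir = new Path("/tmp/snapshot-restore-" + System.currentTimeMillis());
    TableSnapshotInputFormat.setInput(job, "my_snapshot", restoreDir);
    job.setInputFormatClass(TableSnapshotInputFormat.class);
    // ... mapper/output configuration as usual, then job.waitForCompletion(true);
  }
}
```

*setInput* is where the copy into *restoreDir* happens, which is why the job user needs write access to that directory rather than to the live table dir.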

So as long as we follow the javadoc advice, our concerns about potential data
loss are not valid. I guess the problem here is that when the table dir is
recreated/copied to *restoreDir*, the original ownership/permissions are
preserved for the subdirs, such as the regions' recovered.edits.
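That preservation effect can be illustrated with plain java.nio on a local file system (a self-contained sketch; local temp dirs stand in for the HDFS paths, and note that in HDFS it is the preserved *ownership*, not just the mode bits, that blocks a different job user from deleting):

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.nio.file.attribute.PosixFilePermission;
import java.nio.file.attribute.PosixFilePermissions;
import java.util.Set;

public class PermCopyDemo {
    public static void main(String[] args) throws Exception {
        // Stand-in for a region dir whose recovered.edits subdir the job
        // user cannot write to (owned by the hbase user on a real cluster).
        Path regionDir = Files.createTempDirectory("regiondir");
        Path edits = Files.createDirectory(regionDir.resolve("recovered.edits"));
        Set<PosixFilePermission> readOnly = PosixFilePermissions.fromString("r-xr-xr-x");
        Files.setPosixFilePermissions(edits, readOnly);

        // Copy the way a restore does, carrying attributes along.
        Path restoreDir = Files.createTempDirectory("restoredir");
        Path copied = restoreDir.resolve("recovered.edits");
        Files.copy(edits, copied, StandardCopyOption.COPY_ATTRIBUTES);

        // The restrictive bits survive the copy, so a user who could not
        // write under the original cannot write under the copy either.
        System.out.println(PosixFilePermissions.toString(
            Files.getPosixFilePermissions(copied)));  // "r-xr-xr-x" on POSIX file systems
    }
}
```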


On Tue, 18 Jun 2019 at 01:03, Jacob LeBlanc <
jacob.leblanc@microfocus.com> wrote:

> First of all, thanks for the reply! I appreciate the time taken addressing
> our issues.
>
> > It seems the mentioned "hiccup" caused RS(es) crash(es), as you got RITs
> and recovered edits under these regions dirs.
>
> To give more context, I was making changes to increase snapshot timeout on
> region servers and did a graceful restart, so I didn't mean to crash
> anything, but it seems like I did this to too many region servers at once
> (did about half the cluster) which seemed to result in some number of
> regions getting stuck in transition. This was attempted on a live
> production cluster, so the hope was to do this without downtime, but it
> resulted in an outage to our application instead. Unfortunately, master and
> region server logs have since rolled and aged out, so I don't have them
> anymore.
>
> > The fact there was a "recovered" dir under some regions dirs means that
> when the snapshot was taken, crashed RS(es) WAL(s) had been split, but not
> completely replayed yet.
>
> The snapshot was taken many days later. File timestamps under the
> recovered.edits directory were from June 6th and the snapshot from the
> pastebin was taken on June 14th, but actually snapshots were taken many
> times with the same result (ETL jobs are launched at least daily in oozie).
> Do you mean that if a snapshot was taken before the region was fully
> recovered, it could result in this state even if the snapshot was
> subsequently deleted?
>
> > Would you know which specific hbase version this is?
>
> It is EMR 5.22, which runs HBase 1.4.9 (with some Amazon-specific edits,
> maybe? I noticed line numbers in HRegion.java in the stack trace don't
> quite line up with those in the 1.4.9 tag in github).
>
> > Could your job restore the snapshot into a temp table and then read from
> this temp table using TableInputFormat, instead?
>
> Maybe we could do this, but it will take us some effort to make the
> changes, test, release, etc... Of course we'd rather not jump through hoops
> like this.
>
> > In this case, it's finding a "recovered" folder under the region dirs, so it
> will replay the edits there. Looks like a problem with
> TableSnapshotInputFormat; it seems weird that it tries to delete edits in a
> non-staging dir (your path suggests it's trying to delete the actual edits
> folder), which could cause data loss if it succeeded in deleting the edits
> before RSes actually replayed them.
>
> I agree that this "seems weird", though admittedly I am not intimately
> familiar with all of the inner workings of the hbase code. The potential data
> loss is what I'm wondering about - would data loss have occurred if we
> happened to execute our job under a user that had delete permissions in
> HDFS directories? Or did the edits actually get replayed when regions were
> stuck in transition and the files just didn't get cleaned up? Is this
> something for which I should file a defect in JIRA?
>
> Thanks again,
>
> --Jacob LeBlanc
>
>
> -----Original Message-----
> From: Wellington Chevreuil [mailto:wellington.chevreuil@gmail.com]
> Sent: Monday, June 17, 2019 3:55 PM
> To: user@hbase.apache.org
> Subject: Re: TableSnapshotInputFormat failing to delete files under
> recovered.edits
>
> It seems the mentioned "hiccup" caused RS(es) crash(es), as you got RITs
> and recovered edits under these regions dirs. The fact there was a
> "recovered" dir under some regions dirs means that when the snapshot was
> taken, crashed RS(es) WAL(s) had been split, but not completely replayed
> yet.
>
> Since you are facing an error when reading from the table snapshot, and the
> stack trace shows TableSnapshotInputFormat is using the "HRegion.openHRegion"
> code path to read the snapshotted data, it will basically do the same as an
> RS would when trying to assign a region. In this case, it's finding a
> "recovered" folder under the region dirs, so it will replay the edits there.
> Looks like a problem with TableSnapshotInputFormat; it seems weird that it
> tries to delete edits in a non-staging dir (your path suggests it's trying
> to delete the actual edits folder), which could cause data loss if it
> succeeded in deleting the edits before RSes actually replayed them. Would
> you know which specific hbase version this is? Could your job restore the
> snapshot into a temp table and then read from this temp table using
> TableInputFormat, instead?
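A hedged sketch of that temp-table workaround (snapshot, table, and mapper names are hypothetical; it requires an HBase 1.x client classpath and a running cluster):

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Admin;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;

public class SnapshotCloneWorkaround {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    TableName tempTable = TableName.valueOf("etl_tmp_mytable");  // hypothetical name
    try (Connection conn = ConnectionFactory.createConnection(conf);
         Admin admin = conn.getAdmin()) {
      // Clone the snapshot server-side into a throwaway table; this creates
      // references to the snapshotted HFiles rather than copying data.
      admin.cloneSnapshot("my_snapshot", tempTable);
    }
    // The job then reads tempTable with TableInputFormat, e.g.:
    // TableMapReduceUtil.initTableMapperJob(tempTable.getNameAsString(),
    //     new Scan(), MyMapper.class, MyKey.class, MyValue.class, job);
    // When the job finishes: admin.disableTable(tempTable); admin.deleteTable(tempTable);
  }
}
```

Because the clone is a regular table served by region servers, the job user never touches region dirs directly, sidestepping the recovered.edits permission issue.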
>
> On Mon, 17 Jun 2019 at 17:22, Jacob LeBlanc <
> jacob.leblanc@microfocus.com> wrote:
>
> > Hi,
> >
> > We periodically execute Spark jobs to run ETL from some of our HBase
> > tables to another data repository. The Spark jobs read data by taking
> > a snapshot and then using the TableSnapshotInputFormat class. Lately
> > we've been having some failures because when the jobs try to read the
> > data, it is trying to delete files under the recovered.edits directory
> > for some regions and the user under which we run the jobs doesn't have
> > permissions to do that. Pastebin of the error and stack trace from one
> > of our job logs is
> > here:
> > https://pastebin.com/MAhVc9JB
> >
> > This started happening after upgrading to EMR 5.22, where the
> > recovered.edits directory is collocated with the WALs in HDFS, whereas it
> > used to be in S3-backed EMRFS.
> >
> > I have two questions regarding this:
> >
> >
> > 1) First off, why are these files under the recovered.edits directory?
> > The timestamp of the files coincides with a hiccup we had with our
> > cluster where I had to use "hbase hbck -fixAssignments" to fix regions
> > that were stuck in transition. But that command seemed to work just
> > fine and all regions were assigned and there have since been no
> > inconsistencies. Does this mean the WALs were not replayed correctly?
> > Does "hbase hbck -fixAssignments" not recover regions properly?
> >
> > 2) Why is our job trying to delete these files? I don't know enough
> > to say for sure, but it seems like using TableSnapshotInputFormat to
> > read snapshot data should not be trying to recover or delete edits.
> >
> > I've fixed the problems by running "assign '<region>'" in hbase shell
> > for every region that had files under the recovered.edits directory
> > and those files seemed to be cleaned up when the assignment completed.
> > But I'd like to understand this better especially if something is
> > interfering with replaying edits from WALs (also making sure our ETL
> > jobs don't start failing would be nice).
> >
> > Thanks!
> >
> > --Jacob LeBlanc
> >
> >
>
