hbase-user mailing list archives

From Stack <st...@duboce.net>
Subject Re: Hbase 2.2.5 Unable to load .regioninfo from table
Date Mon, 05 Oct 2020 23:39:37 GMT
On Thu, Sep 24, 2020 at 3:32 AM Martin Braun <martin.braun@zfabrik.de>
wrote:

> Hello all,
>
> I dug a bit deeper into this:
>
> The WALPlayer is not able to replay recovered.edits files. The source code at
> http://hbase.apache.org/2.2/devapidocs/src-html/org/apache/hadoop/hbase/mapreduce/WALInputFormat.html
>
> seems to expect a start time encoded in the filename:
>
>            long fileStartTime = Long.parseLong(name.substring(idx+1));
>            if (fileStartTime <= endTime) {
>              LOG.info("Found: " + file);
>              result.add(file);
>            }
>          } catch (NumberFormatException x) {
>            idx = 0;
>
> But the files in recovered.edits are named differently (just numbers
> like 00000000000000195).
>
> I have also found this issue:
>
> https://issues.apache.org/jira/browse/HBASE-22976
> [HBCK2] Add RecoveredEditsPlayer
>
> But what can I do now to fix this and replay the WAL files in the
> recovered edits?
>
>
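To make the naming mismatch concrete, here is a minimal sketch of the check quoted above. It is not the HBase source: the class and method names are made up for illustration, and it assumes that `idx` is the position of the last '.' in the full path string, which the quoted fragment does not show. The WAL-style name is hypothetical; the recovered.edits path is the one from this thread.

```java
import java.util.OptionalLong;

/** Toy reproduction of the quoted file-name check; not HBase code. */
class WalNameSketch {

    /**
     * Parses the text after the last '.' as a start time.
     * ASSUMPTION: idx is the index of the last '.' in the full path string.
     * Returns empty when there is no usable suffix or it is not a number.
     */
    static OptionalLong startTimeFromName(String name) {
        int idx = name.lastIndexOf('.');
        if (idx <= 0) {
            return OptionalLong.empty();
        }
        try {
            return OptionalLong.of(Long.parseLong(name.substring(idx + 1)));
        } catch (NumberFormatException x) {
            // Same exception the quoted code catches before skipping the file.
            return OptionalLong.empty();
        }
    }

    public static void main(String[] args) {
        // Hypothetical WAL-style name: numeric suffix after the last '.'.
        String walStyle = "hdfs://localhost:9000/hbase/WALs/example-host.1600000000000";
        // Path from this thread: the last '.' is the one inside "recovered.edits".
        String recovered = "hdfs://localhost:9000/hbase/data/default/tt_ix_parent_item/"
            + "ae1553c4d6140110c51c535ba1dbc1a0/recovered.edits/0000000000000000285";

        System.out.println(startTimeFromName(walStyle));   // a parsed start time
        System.out.println(startTimeFromName(recovered));  // empty: suffix is not a number
    }
}
```

Under that assumption, the suffix for a recovered.edits file is "edits/0000000000000000285", which is not a number, so the file would be skipped.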

Regarding WALPlayer not emitting any status while it runs: HBASE-25109,
"Add MR Counters to WALPlayer; currently hard to tell if it is doing anything",
went in last week:
<https://issues.apache.org/jira/browse/HBASE-25109>

Let me comment on HBASE-22976...

S




> Any ideas?
>
> best,
> Martin
>
> > On 22. Sep 2020, at 18:38, Sean Busbey <sean.busbey@gmail.com> wrote:
> >
> > hurm. following the instructions from the reference guide works for
> > me. Is there a specific reason you're passing the
> > '--internal-classpath' flag? Do other hadoop jobs work?
> >
> > what if you submit it as a proper MR job? unfortunately the ref guide
> > is thin on explaining this atm, but it looks like:
> >
> > HADOOP_CLASSPATH="${HBASE_CONF_DIR}:$("${HBASE_HOME}/bin/hbase"
> > mapredcp)" yarn jar
> > "${HBASE_HOME}/lib/shaded-clients/hbase-shaded-mapreduce-2.2.5.jar"
> > WALPlayer some/path/to/wals/ 'some:example'
> >
> > On Tue, Sep 22, 2020 at 10:24 AM Martin Braun <martin.braun@zfabrik.de> wrote:
> >>
> >> Hello Sean,
> >>
> >> Thank you for your quick response!
> >>
> >> Replaying the WAL files would be OK; however, I am struggling to use the
> >> WALPlayer:
> >>
> >>
> >> hbase --internal-classpath org.apache.hadoop.hbase.mapreduce.WALPlayer hdfs://localhost:9000/hbase/data/default/tt_ix_parent_item/ae1553c4d6140110c51c535ba1dbc1a0/recovered.edits tt_ix_parent_item
> >> Exception in thread "main" java.lang.NoClassDefFoundError: org/codehaus/jackson/map/JsonMappingException
> >>        at org.apache.hadoop.mapreduce.Job.getJobSubmitter(Job.java:1325)
> >>        at org.apache.hadoop.mapreduce.Job.submit(Job.java:1336)
> >>        at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1359)
> >>        at org.apache.hadoop.hbase.mapreduce.WALPlayer.run(WALPlayer.java:428)
> >>        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
> >>        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:90)
> >>        at org.apache.hadoop.hbase.mapreduce.WALPlayer.main(WALPlayer.java:417)
> >> Caused by: java.lang.ClassNotFoundException: org.codehaus.jackson.map.JsonMappingException
> >>        at java.net.URLClassLoader.findClass(URLClassLoader.java:382)
> >>        at java.lang.ClassLoader.loadClass(ClassLoader.java:418)
> >>        at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:352)
> >>        at java.lang.ClassLoader.loadClass(ClassLoader.java:351)
> >>        ... 7 more
> >>
> >> Could you provide some hints how to use the WALPlayer correctly?
> >>
> >>
> >>
> >> best,
> >> Martin
> >>
> >>> On 22. Sep 2020, at 16:52, Sean Busbey <sean.busbey@gmail.com> wrote:
> >>>
> >>> Bulk loading works with HFiles. recovered.edits files are
> >>> formatted the same as WAL files rather than as HFiles. For WAL files
> >>> you can use the WALPlayer to ensure those edits are all present in
> >>> the table.
> >>>
> >>> IIRC there is an unknown sequence of events that can result in the
> >>> recovered edits sticking around for a region after they've already
> >>> been recovered. Presuming your use case will work for having the same
> >>> edit played multiple times (basically if you do not mess about with
> >>> cell level timestamps or keeping multiple versions around) then it
> >>> should be fine to sideline those edits and then replay them using the
> >>> wal player.
> >>>
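As a rough illustration of why replaying the same edit more than once can be harmless, here is a toy model of timestamped cells. It is not the HBase client API: the class and its methods are made up, and it assumes that edits are replayed with their original timestamps and that reads return only the newest version.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

/** Toy model of timestamped cells; not the HBase client API. */
class ReplaySketch {
    // row -> (timestamp -> value); a put with the same row and timestamp overwrites.
    private final Map<String, TreeMap<Long, String>> cells = new HashMap<>();

    void put(String row, long timestamp, String value) {
        cells.computeIfAbsent(row, r -> new TreeMap<>()).put(timestamp, value);
    }

    /** Returns only the newest version, or null if the row is unknown. */
    String get(String row) {
        TreeMap<Long, String> versions = cells.get(row);
        return versions == null ? null : versions.lastEntry().getValue();
    }

    public static void main(String[] args) {
        ReplaySketch table = new ReplaySketch();
        table.put("row1", 100L, "old"); // edit that also sits in a stray recovered.edits file
        table.put("row1", 200L, "new"); // later write that is already in the table
        String before = table.get("row1");

        // Replay the stray edit twice, keeping its original timestamp.
        table.put("row1", 100L, "old");
        table.put("row1", 100L, "old");

        // The newest version is unchanged by the replay.
        System.out.println(before.equals(table.get("row1")));
    }
}
```

In this model the guarantee breaks down in exactly the cases named above: if the application sets its own cell timestamps or reads older versions, a replayed edit can become visible again.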
> >>> If your use case isn't fine with that, then you can use the wal pretty
> >>> printer to examine the edits that are there and check to ensure the
> >>> cells are already in the table in a current region.
> >>>
> >>> sounds like we should update the troubleshooting tips to include some
> >>> coverage of stray recovered.edits files.
> >>>
> >>> On Tue, Sep 22, 2020 at 8:58 AM Martin Braun <martin.braun@zfabrik.de> wrote:
> >>>>
> >>>> Hello all,
> >>>>
> >>>> I have an issue with HBase 2.2.5 (and hadoop-2.8.5): after a full-disk
> >>>> event I have 38 inconsistencies. When I run
> >>>>
> >>>> hbase --internal-classpath hbck
> >>>>
> >>>> I get a bunch of these errors:
> >>>>
> >>>> ERROR: Orphan region in HDFS: Unable to load .regioninfo from table tt_ix_bizStep_inserting in hdfs dir hdfs://localhost:9000/hbase/data/default/tt_ix_bizStep_inserting/8a1acb499bf454b072daeee5960daa73! It may be an invalid format or version file.  Treating as an orphaned regiondir.
> >>>> ERROR: Orphan region in HDFS: Unable to load .regioninfo from table tt_ix_bizStep_inserting in hdfs dir hdfs://localhost:9000/hbase/data/default/tt_ix_bizStep_inserting/8f64025b68958ebddeb812297facdfc6! It may be an invalid format or version file.  Treating as an orphaned regiondir.
> >>>>
> >>>>
> >>>> When looking into these directories I see that there is indeed no
> >>>> .regioninfo file:
> >>>>
> >>>> hdfs dfs -ls -R hdfs://localhost:9000/hbase/data/default/tt_ix_parent_item/ae1553c4d6140110c51c535ba1dbc1a0
> >>>>
> >>>> drwxr-xr-x   - jenkins supergroup          0 2020-09-21 11:23 hdfs://localhost:9000/hbase/data/default/tt_ix_parent_item/ae1553c4d6140110c51c535ba1dbc1a0/recovered.edits
> >>>> -rw-r--r--   3 jenkins supergroup      74133 2020-09-21 11:11 hdfs://localhost:9000/hbase/data/default/tt_ix_parent_item/ae1553c4d6140110c51c535ba1dbc1a0/recovered.edits/0000000000000000285
> >>>> -rw-r--r--   3 jenkins supergroup      74413 2020-09-16 19:03 hdfs://localhost:9000/hbase/data/default/tt_ix_parent_item/ae1553c4d6140110c51c535ba1dbc1a0/recovered.edits/0000000000000000286
> >>>> -rw-r--r--   3 jenkins supergroup      74693 2020-09-16 19:05 hdfs://localhost:9000/hbase/data/default/tt_ix_parent_item/ae1553c4d6140110c51c535ba1dbc1a0/recovered.edits/0000000000000000287
> >>>> -rw-r--r--   3 jenkins supergroup      79427 2020-09-16 18:27 hdfs://localhost:9000/hbase/data/default/tt_ix_parent_item/ae1553c4d6140110c51c535ba1dbc1a0/recovered.edits/0000000000000000305
> >>>>
> >>>>
> >>>> The hbck2 manual from the hbase-operator-tools tells me, for orphan
> >>>> data, to read
> >>>> http://hbase.apache.org/book.html#arch.bulk.load.complete.strays, chapter
> >>>> "72.4.1. 'Adopting' Stray Data".
> >>>>
> >>>> However, this seems to be a different case: running completebulkload on the
> >>>> named directories seems to do nothing.
> >>>>
> >>>> A scan 'hbase:meta', {COLUMN=>'info:regioninfo'} does not show any
> >>>> errors.
> >>>>
> >>>>
> >>>> How can I resolve these inconsistencies of the missing .regioninfo?
> >>>>
> >>>> TIA
> >>>>
> >>>> best,
> >>>> Martin
> >>>>
> >>>
> >>>
> >>> --
> >>> Sean
> >>
> >> martin.braun@zfabrik.de
> >> T:      +49 6227 3984255
> >> F:      +49 6227 3984254
> >> ZFabrik Software GmbH & Co. KG
> >> Lammstrasse 2, 69190 Walldorf
> >>
> >> Handelsregister: Amtsgericht Mannheim HRA 702598
> >> Persönlich haftende Gesellschafterin: ZFabrik Verwaltungs GmbH, Sitz
> Walldorf
> >> Geschäftsführer: Dr. H. Blohm u. Udo Offermann
> >> Handelsregister: Amtsgericht Mannheim HRB 723699
> >>
> >>
> >>
> >
> >
> > --
> > Sean
>
>
>
>
>
>
