Hey Andrey:
Testing HBase 2.0.0, I ran against Hadoop 2.7.x and 2.8.3. I just went back
to my test cluster and upgraded to Hadoop 2.8.4 and indeed, I see the master
stuck initializing, waiting on the assign of hbase:meta.
2018-07-06 08:39:21,787 INFO [PEWorker-10] procedure.RecoverMetaProcedure: pid=5, state=RUNNABLE:RECOVER_META_ASSIGN_REGIONS; RecoverMetaProcedure failedMetaServer=null, splitWal=true; Retaining meta assignment to server=ve0538.X.Y.Z.com,16020,1530891551115
2018-07-06 08:39:21,789 INFO [PEWorker-10] procedure2.ProcedureExecutor: Initialized subprocedures=[{pid=6, ppid=5, state=RUNNABLE:REGION_TRANSITION_QUEUE; AssignProcedure table=hbase:meta, region=1588230740, target=ve0538.X.Y.Z.com,16020,1530891551115}]
2018-07-06 08:39:21,847 INFO [PEWorker-4] procedure.MasterProcedureScheduler: pid=6, ppid=5, state=RUNNABLE:REGION_TRANSITION_QUEUE; AssignProcedure table=hbase:meta, region=1588230740, target=ve0538.X.Y.Z.com,16020,1530891551115 checking lock on 1588230740
When I go to the RegionServer that was assigned hbase:meta and look at its
logs, I see this:
2018-07-06 08:28:18,304 ERROR [RS-EventLoopGroup-1-7] asyncfs.FanOutOneBlockAsyncDFSOutputSaslHelper: Couldn't properly initialize access to HDFS internals. Please update your WAL Provider to not make use of the 'asyncfs' provider. See HBASE-16110 for more information.
java.lang.NoSuchMethodException: org.apache.hadoop.hdfs.DFSClient.decryptEncryptedDataEncryptionKey(org.apache.hadoop.fs.FileEncryptionInfo)
  at java.lang.Class.getDeclaredMethod(Class.java:2130)
  at org.apache.hadoop.hbase.io.asyncfs.FanOutOneBlockAsyncDFSOutputSaslHelper.createTransparentCryptoHelper(FanOutOneBlockAsyncDFSOutputSaslHelper.java:232)
  at org.apache.hadoop.hbase.io.asyncfs.FanOutOneBlockAsyncDFSOutputSaslHelper.<clinit>(FanOutOneBlockAsyncDFSOutputSaslHelper.java:262)
  at org.apache.hadoop.hbase.io.asyncfs.FanOutOneBlockAsyncDFSOutputHelper.initialize(FanOutOneBlockAsyncDFSOutputHelper.java:661)
  at org.apache.hadoop.hbase.io.asyncfs.FanOutOneBlockAsyncDFSOutputHelper.access$300(FanOutOneBlockAsyncDFSOutputHelper.java:118)
  at org.apache.hadoop.hbase.io.asyncfs.FanOutOneBlockAsyncDFSOutputHelper$13.operationComplete(FanOutOneBlockAsyncDFSOutputHelper.java:720)
  at org.apache.hadoop.hbase.io.asyncfs.FanOutOneBlockAsyncDFSOutputHelper$13.operationComplete(FanOutOneBlockAsyncDFSOutputHelper.java:715)
  at org.apache.hbase.thirdparty.io.netty.util.concurrent.DefaultPromise.notifyListener0(DefaultPromise.java:507)
  at org.apache.hbase.thirdparty.io.netty.util.concurrent.DefaultPromise.notifyListenersNow(DefaultPromise.java:481)
  at org.apache.hbase.thirdparty.io.netty.util.concurrent.DefaultPromise.access$000(DefaultPromise.java:34)
  at org.apache.hbase.thirdparty.io.netty.util.concurrent.DefaultPromise$1.run(DefaultPromise.java:431)
  at org.apache.hbase.thirdparty.io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:163)
  at org.apache.hbase.thirdparty.io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:403)
  at org.apache.hbase.thirdparty.io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:309)
  at org.apache.hbase.thirdparty.io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:858)
  at org.apache.hbase.thirdparty.io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:138)
  at java.lang.Thread.run(Thread.java:748)
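For context, the NoSuchMethodException above comes from asyncfs probing DFSClient via reflection for a method whose signature changed in Hadoop 2.8.4 (HDFS-12574). A minimal sketch of that kind of reflective probe, using only JDK classes (ReflectionProbe and the String example below are illustrative, not HBase code):

```java
import java.lang.reflect.Method;

public class ReflectionProbe {
    // Returns the named single-argument method if the class declares it,
    // else null -- the same check getDeclaredMethod performs, but without
    // letting NoSuchMethodException propagate.
    static Method probe(Class<?> clazz, String name, Class<?> arg) {
        try {
            return clazz.getDeclaredMethod(name, arg);
        } catch (NoSuchMethodException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        // A class that lacks the probed method yields null -- the situation
        // asyncfs hits against the Hadoop 2.8.4 DFSClient.
        System.out.println(
            probe(String.class, "decryptEncryptedDataEncryptionKey", Object.class)); // prints "null"
        // A method that does exist is found.
        System.out.println(probe(String.class, "concat", String.class) != null); // prints "true"
    }
}
```

Because the real check runs in a static initializer, one missing method permanently fails the class load and every subsequent WAL write.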
Do you see the above?
Setting the WAL writer back to FSHLog got me going again. I added the below
to the config:
<property>
  <name>hbase.wal.provider</name>
  <value>filesystem</value>
</property>
<property>
  <name>hbase.wal.meta_provider</name>
  <value>filesystem</value>
</property>
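For reference, the documented hbase.wal.provider values and what each selects; a simplified sketch (the real mapping lives in HBase's WALFactory, and the descriptions here are paraphrased, not taken from HBase source):

```java
import java.util.Map;

public class WalProviderLookup {
    // Documented hbase.wal.provider values and the writer each selects
    // (descriptions paraphrased; the authoritative mapping is in WALFactory).
    static final Map<String, String> PROVIDERS = Map.of(
        "asyncfs", "AsyncFSWAL (fan-out writer; reaches into DFSClient internals)",
        "filesystem", "FSHLog (classic writer; uses only public DFSClient APIs)",
        "multiwal", "multiple WALs per RegionServer");

    public static void main(String[] args) {
        System.out.println(PROVIDERS.get("filesystem")); // prints the FSHLog description
    }
}
```

This is why switching both properties to filesystem sidesteps the reflection failure: FSHLog does not depend on DFSClient internals.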
St.Ack
On Thu, Jul 5, 2018 at 12:39 PM Andrey Elenskiy
<andrey.elenskiy@arista.com.invalid> wrote:
> > Are there any ERROR messages in the regionservers or the master logs?
>
> Hey Sean, nothing interesting in master logs, it's just stuck initializing
> and throws 500 when trying to access via web ui:
> https://pastebin.com/mHsyhdNs
> Logs of one of the region server (sorry had to restart, but I'm fairly
> certain there were no ERRORs): https://pastebin.com/wHHVdQgH
>
> FYI, HBase 2.0.1 was working without issues with Hadoop 2.7.5. It's Hadoop
> 2.8.4 that's giving trouble, and we can't go back as the HDFS file format
> changed.
>
> > OK it is HDFS-12574, it has also been ported to 2.8.4. Let's revive
> HBASE-20244.
>
> Ha, thanks! I'll give it a try when 2.0.2 comes out.
>
> On Mon, Jul 2, 2018 at 6:10 PM, 张铎(Duo Zhang) <palomino219@gmail.com>
> wrote:
>
> > OK it is HDFS-12574, it has also been ported to 2.8.4. Let's
> > revive HBASE-20244.
> >
> > 2018-07-03 9:07 GMT+08:00 张铎(Duo Zhang) <palomino219@gmail.com>:
> >
> > > I think it is fine to just use the original hadoop jars in HBase 2.0.1
> > > to communicate with HDFS 2.8.4 or above?
> > >
> > > The async WAL hooks into the internals of DFSClient, so it is easily
> > > broken when HDFS is upgraded.
> > >
> > > I can take a look at the 2.8.4 problem, but for 3.x there is no
> > > production-ready release yet, so there is no plan to fix it yet.
> > >
> > > 2018-07-03 8:59 GMT+08:00 Sean Busbey <busbey@apache.org>:
> > >
> > >> That's just a warning. Checking on HDFS-11644, it's only present in
> > >> Hadoop 2.9+ so seeing a lack of it with HDFS in 2.8.4 is expected.
> > >> (Presuming you are deploying on top of HDFS and not e.g.
> > >> LocalFileSystem.)
> > >>
> > >> Are there any ERROR messages in the regionservers or the master logs?
> > >> Could you post them somewhere and provide a link here?
> > >>
> > >> On Mon, Jul 2, 2018 at 5:11 PM, Andrey Elenskiy
> > >> <andrey.elenskiy@arista.com.invalid> wrote:
> > >> > It's now stuck at Master Initializing and regionservers are
> > >> > complaining with:
> > >> >
> > >> > 18/07/02 21:12:20 WARN util.CommonFSUtils: Your Hadoop installation
> > >> > does not include the StreamCapabilities class from HDFS-11644, so we
> > >> > will skip checking if any FSDataOutputStreams actually support
> > >> > hflush/hsync. If you are running on top of HDFS this probably just
> > >> > means you have an older version and this can be ignored. If you are
> > >> > running on top of an alternate FileSystem implementation you should
> > >> > manually verify that hflush and hsync are implemented; otherwise you
> > >> > risk data loss and hard to diagnose errors when our assumptions are
> > >> > violated.
> > >> >
> > >> > I'm guessing HBase 2.0.1 on top of Hadoop 2.8.4 hasn't been ironed
> > >> > out completely yet (at least not with stock hadoop jars) unless I'm
> > >> > missing something.
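The WARN quoted above boils down to a classpath probe for a class that only ships with Hadoop 2.9+. A minimal sketch of that kind of check, using only the JDK (StreamCapabilitiesProbe is illustrative, not the actual CommonFSUtils code):

```java
public class StreamCapabilitiesProbe {
    // Probe the classpath for the StreamCapabilities interface added by
    // HDFS-11644 (present in Hadoop 2.9+, absent in 2.8.x).
    static boolean hasStreamCapabilities() {
        try {
            Class.forName("org.apache.hadoop.fs.StreamCapabilities");
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Prints false unless a Hadoop 2.9+ jar is on the classpath,
        // which is what triggers the WARN above on 2.8.4.
        System.out.println(hasStreamCapabilities());
    }
}
```

When the probe fails, HBase skips the hflush/hsync capability check rather than refusing to start, which is why this message is only a warning on 2.8.x.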
> > >> >
> > >> > On Mon, Jul 2, 2018 at 3:02 PM, Mich Talebzadeh <
> > >> mich.talebzadeh@gmail.com>
> > >> > wrote:
> > >> >
> > >> >> You are lucky that HBase 2.0.1 worked with Hadoop 2.8.
> > >> >>
> > >> >> I tried HBase 2.0.1 with Hadoop 3.1 and there were endless problems
> > >> >> with the RegionServer crashing because of a WAL file system issue.
> > >> >>
> > >> >> thread - Hbase hbase-2.0.1, region server does not start on Hadoop
> > >> >> 3.1
> > >> >>
> > >> >> Decided to roll back to HBase 1.2.6, which works with Hadoop 3.1.
> > >> >>
> > >> >> HTH
> > >> >>
> > >> >> Dr Mich Talebzadeh
> > >> >>
> > >> >> LinkedIn: https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
> > >> >>
> > >> >> http://talebzadehmich.wordpress.com
> > >> >>
> > >> >> *Disclaimer:* Use it at your own risk. Any and all responsibility
> > >> >> for any loss, damage or destruction of data or any other property
> > >> >> which may arise from relying on this email's technical content is
> > >> >> explicitly disclaimed. The author will in no case be liable for any
> > >> >> monetary damages arising from such loss, damage or destruction.
> > >> >>
> > >> >> On Mon, 2 Jul 2018 at 22:43, Andrey Elenskiy
> > >> >> <andrey.elenskiy@arista.com.invalid> wrote:
> > >> >>
> > >> >> > <property>
> > >> >> > <name>hbase.wal.provider</name>
> > >> >> > <value>filesystem</value>
> > >> >> > </property>
> > >> >> >
> > >> >> > Seems to fix it, but would be nice to actually try the fanout WAL
> > >> >> > with Hadoop 2.8.4.
> > >> >> >
> > >> >> > On Mon, Jul 2, 2018 at 1:03 PM, Andrey Elenskiy <
> > >> >> > andrey.elenskiy@arista.com>
> > >> >> > wrote:
> > >> >> >
> > >> >> > > Hello, we are running HBase 2.0.1 with official Hadoop 2.8.4
> > >> >> > > jars and the Hadoop 2.8.4 client
> > >> >> > > (http://central.maven.org/maven2/org/apache/hadoop/hadoop-client/2.8.4/).
> > >> >> > > Got the following exception on a regionserver, which brings it
> > >> >> > > down:
> > >> >> > >
> > >> >> > > 18/07/02 18:51:06 WARN concurrent.DefaultPromise: An exception was thrown by org.apache.hadoop.hbase.io.asyncfs.FanOutOneBlockAsyncDFSOutputHelper$13.operationComplete()
> > >> >> > > java.lang.Error: Couldn't properly initialize access to HDFS internals. Please update your WAL Provider to not make use of the 'asyncfs' provider. See HBASE-16110 for more information.
> > >> >> > >   at org.apache.hadoop.hbase.io.asyncfs.FanOutOneBlockAsyncDFSOutputSaslHelper.<clinit>(FanOutOneBlockAsyncDFSOutputSaslHelper.java:268)
> > >> >> > >   at org.apache.hadoop.hbase.io.asyncfs.FanOutOneBlockAsyncDFSOutputHelper.initialize(FanOutOneBlockAsyncDFSOutputHelper.java:661)
> > >> >> > >   at org.apache.hadoop.hbase.io.asyncfs.FanOutOneBlockAsyncDFSOutputHelper.access$300(FanOutOneBlockAsyncDFSOutputHelper.java:118)
> > >> >> > >   at org.apache.hadoop.hbase.io.asyncfs.FanOutOneBlockAsyncDFSOutputHelper$13.operationComplete(FanOutOneBlockAsyncDFSOutputHelper.java:720)
> > >> >> > >   at org.apache.hadoop.hbase.io.asyncfs.FanOutOneBlockAsyncDFSOutputHelper$13.operationComplete(FanOutOneBlockAsyncDFSOutputHelper.java:715)
> > >> >> > >   at org.apache.hbase.thirdparty.io.netty.util.concurrent.DefaultPromise.notifyListener0(DefaultPromise.java:507)
> > >> >> > >   at org.apache.hbase.thirdparty.io.netty.util.concurrent.DefaultPromise.notifyListeners0(DefaultPromise.java:500)
> > >> >> > >   at org.apache.hbase.thirdparty.io.netty.util.concurrent.DefaultPromise.notifyListenersNow(DefaultPromise.java:479)
> > >> >> > >   at org.apache.hbase.thirdparty.io.netty.util.concurrent.DefaultPromise.notifyListeners(DefaultPromise.java:420)
> > >> >> > >   at org.apache.hbase.thirdparty.io.netty.util.concurrent.DefaultPromise.trySuccess(DefaultPromise.java:104)
> > >> >> > >   at org.apache.hbase.thirdparty.io.netty.channel.DefaultChannelPromise.trySuccess(DefaultChannelPromise.java:82)
> > >> >> > >   at org.apache.hbase.thirdparty.io.netty.channel.epoll.AbstractEpollChannel$AbstractEpollUnsafe.fulfillConnectPromise(AbstractEpollChannel.java:638)
> > >> >> > >   at org.apache.hbase.thirdparty.io.netty.channel.epoll.AbstractEpollChannel$AbstractEpollUnsafe.finishConnect(AbstractEpollChannel.java:676)
> > >> >> > >   at org.apache.hbase.thirdparty.io.netty.channel.epoll.AbstractEpollChannel$AbstractEpollUnsafe.epollOutReady(AbstractEpollChannel.java:552)
> > >> >> > >   at org.apache.hbase.thirdparty.io.netty.channel.epoll.EpollEventLoop.processReady(EpollEventLoop.java:394)
> > >> >> > >   at org.apache.hbase.thirdparty.io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:304)
> > >> >> > >   at org.apache.hbase.thirdparty.io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:858)
> > >> >> > >   at org.apache.hbase.thirdparty.io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:138)
> > >> >> > >   at java.lang.Thread.run(Thread.java:748)
> > >> >> > > Caused by: java.lang.NoSuchMethodException: org.apache.hadoop.hdfs.DFSClient.decryptEncryptedDataEncryptionKey(org.apache.hadoop.fs.FileEncryptionInfo)
> > >> >> > >   at java.lang.Class.getDeclaredMethod(Class.java:2130)
> > >> >> > >   at org.apache.hadoop.hbase.io.asyncfs.FanOutOneBlockAsyncDFSOutputSaslHelper.createTransparentCryptoHelper(FanOutOneBlockAsyncDFSOutputSaslHelper.java:232)
> > >> >> > >   at org.apache.hadoop.hbase.io.asyncfs.FanOutOneBlockAsyncDFSOutputSaslHelper.<clinit>(FanOutOneBlockAsyncDFSOutputSaslHelper.java:262)
> > >> >> > >   ... 18 more
> > >> >> > >
> > >> >> > > FYI, we don't have encryption enabled. Let me know if you need
> > >> >> > > more info about our setup.
> > >> >> > >
> > >> >> >
> > >> >>
> > >>
> > >
> > >
> >
>