hbase-user mailing list archives

From Andrey Elenskiy <andrey.elens...@arista.com>
Subject Re: HBase 2.0.1 with Hadoop 2.8.4 causes NoSuchMethodException
Date Fri, 06 Jul 2018 16:25:51 GMT
Ha, so I had already set "hbase.wal.provider" to "filesystem", but didn't
realize I needed to set "hbase.wal.meta_provider" to "filesystem" as well.
Sean, I'm guessing this was the reason the master got stuck assigning the
meta region.
I had this in the logs of regionserver-3, if it's helpful:

18/07/02 22:02:21 INFO regionserver.RSRpcServices: Open hbase:meta,,1.1588230740
18/07/02 22:02:21 INFO regionserver.RSRpcServices: Receiving OPEN for the region:hbase:meta,,1.1588230740, which we are already trying to OPEN - ignoring this new request for this region.

Now everything is up and running.
Thank you for the help!
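For anyone skimming the archive, here are the two settings together as an hbase-site.xml fragment (values exactly as used in this thread):

```xml
<property>
  <name>hbase.wal.provider</name>
  <value>filesystem</value>
</property>
<property>
  <name>hbase.wal.meta_provider</name>
  <value>filesystem</value>
</property>
```

Both were needed here: setting only hbase.wal.provider still left the meta region's WAL on the asyncfs default, which is why the master stayed stuck assigning hbase:meta.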

On Fri, Jul 6, 2018 at 8:57 AM, Stack <stack@duboce.net> wrote:

> Hey Andrey:
>
> Testing 2.0.0, I ran against 2.7.x and 2.8.3. I just went back to my test
> cluster and upgraded to 2.8.4 and indeed, I see master stuck initializing
> waiting on the assign of hbase:meta.
>
>
> 2018-07-06 08:39:21,787 INFO  [PEWorker-10] procedure.RecoverMetaProcedure: pid=5, state=RUNNABLE:RECOVER_META_ASSIGN_REGIONS; RecoverMetaProcedure failedMetaServer=null, splitWal=true; Retaining meta assignment to server=ve0538.X.Y.Z.com,16020,1530891551115
> 2018-07-06 08:39:21,789 INFO  [PEWorker-10] procedure2.ProcedureExecutor: Initialized subprocedures=[{pid=6, ppid=5, state=RUNNABLE:REGION_TRANSITION_QUEUE; AssignProcedure table=hbase:meta, region=1588230740, target=ve0538.X.Y.Z.com,16020,1530891551115}]
> 2018-07-06 08:39:21,847 INFO  [PEWorker-4] procedure.MasterProcedureScheduler: pid=6, ppid=5, state=RUNNABLE:REGION_TRANSITION_QUEUE; AssignProcedure table=hbase:meta, region=1588230740, target=ve0538.X.Y.Z.com,16020,1530891551115 checking lock on 1588230740
>
> When I go to the RegionServer that was assigned hbase:meta and look at its
> logs, I see this:
>
> 2018-07-06 08:28:18,304 ERROR [RS-EventLoopGroup-1-7] asyncfs.FanOutOneBlockAsyncDFSOutputSaslHelper: Couldn't properly initialize access to HDFS internals. Please update your WAL Provider to not make use of the 'asyncfs' provider. See HBASE-16110 for more information.
> java.lang.NoSuchMethodException: org.apache.hadoop.hdfs.DFSClient.decryptEncryptedDataEncryptionKey(org.apache.hadoop.fs.FileEncryptionInfo)
>   at java.lang.Class.getDeclaredMethod(Class.java:2130)
>   at org.apache.hadoop.hbase.io.asyncfs.FanOutOneBlockAsyncDFSOutputSaslHelper.createTransparentCryptoHelper(FanOutOneBlockAsyncDFSOutputSaslHelper.java:232)
>   at org.apache.hadoop.hbase.io.asyncfs.FanOutOneBlockAsyncDFSOutputSaslHelper.<clinit>(FanOutOneBlockAsyncDFSOutputSaslHelper.java:262)
>   at org.apache.hadoop.hbase.io.asyncfs.FanOutOneBlockAsyncDFSOutputHelper.initialize(FanOutOneBlockAsyncDFSOutputHelper.java:661)
>   at org.apache.hadoop.hbase.io.asyncfs.FanOutOneBlockAsyncDFSOutputHelper.access$300(FanOutOneBlockAsyncDFSOutputHelper.java:118)
>   at org.apache.hadoop.hbase.io.asyncfs.FanOutOneBlockAsyncDFSOutputHelper$13.operationComplete(FanOutOneBlockAsyncDFSOutputHelper.java:720)
>   at org.apache.hadoop.hbase.io.asyncfs.FanOutOneBlockAsyncDFSOutputHelper$13.operationComplete(FanOutOneBlockAsyncDFSOutputHelper.java:715)
>   at org.apache.hbase.thirdparty.io.netty.util.concurrent.DefaultPromise.notifyListener0(DefaultPromise.java:507)
>   at org.apache.hbase.thirdparty.io.netty.util.concurrent.DefaultPromise.notifyListenersNow(DefaultPromise.java:481)
>   at org.apache.hbase.thirdparty.io.netty.util.concurrent.DefaultPromise.access$000(DefaultPromise.java:34)
>   at org.apache.hbase.thirdparty.io.netty.util.concurrent.DefaultPromise$1.run(DefaultPromise.java:431)
>   at org.apache.hbase.thirdparty.io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:163)
>   at org.apache.hbase.thirdparty.io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:403)
>   at org.apache.hbase.thirdparty.io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:309)
>   at org.apache.hbase.thirdparty.io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:858)
>   at org.apache.hbase.thirdparty.io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:138)
>   at java.lang.Thread.run(Thread.java:748)
>
> Do you see the above?
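The root failure above is a reflection lookup: HBase's asyncfs WAL provider resolves a private DFSClient method by name, and Hadoop 2.8.4 no longer ships that exact signature. A standalone sketch of the same lookup (class name `AsyncFsProbe` is mine, not from HBase) can tell you whether a given set of Hadoop jars will trip it:

```java
// Sketch: perform the same reflective lookup that HBase's
// FanOutOneBlockAsyncDFSOutputSaslHelper does in the trace above.
public class AsyncFsProbe {
    public static String probe() {
        try {
            Class<?> dfsClient = Class.forName("org.apache.hadoop.hdfs.DFSClient");
            Class<?> feInfo = Class.forName("org.apache.hadoop.fs.FileEncryptionInfo");
            // This is the lookup that throws NoSuchMethodException on 2.8.4.
            dfsClient.getDeclaredMethod("decryptEncryptedDataEncryptionKey", feInfo);
            return "asyncfs-compatible";
        } catch (ClassNotFoundException | NoSuchMethodException e) {
            // Either the Hadoop jars are absent or the private method is gone.
            return "asyncfs would fail: " + e;
        }
    }

    public static void main(String[] args) {
        System.out.println(probe());
    }
}
```

Run it with your Hadoop jars on the classpath (e.g. `java -cp "$(hadoop classpath):." AsyncFsProbe`); if it prints the failure branch, the asyncfs provider will hit the same NoSuchMethodException.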
>
> Setting the WAL writer back to FSHLog got me going again. I added the below
> to the config:
>
>
>   <property>
>     <name>hbase.wal.provider</name>
>     <value>filesystem</value>
>   </property>
>   <property>
>     <name>hbase.wal.meta_provider</name>
>     <value>filesystem</value>
>   </property>
>
>
>
> St.Ack
>
>
>
>
> On Thu, Jul 5, 2018 at 12:39 PM Andrey Elenskiy
> <andrey.elenskiy@arista.com.invalid> wrote:
>
> > > Are there any ERROR messages in the regionservers or the master logs?
> >
> > Hey Sean, nothing interesting in the master logs, it's just stuck
> > initializing and throws a 500 when I try to access the web UI:
> > https://pastebin.com/mHsyhdNs
> > Logs of one of the region servers (sorry, had to restart, but I'm fairly
> > certain there were no ERRORs): https://pastebin.com/wHHVdQgH
> >
> > FYI, hbase 2.0.1 was working without issues with hadoop 2.7.5. It's 2.8.4
> > that's giving trouble, and we can't go back as the hdfs file format changed.
> >
> > > OK it is HDFS-12574, it has also been ported to 2.8.4. Let's revive
> > HBASE-20244.
> >
> > Ha, thanks! I'll give it a try when 2.0.2 comes out.
> >
> > On Mon, Jul 2, 2018 at 6:10 PM, 张铎(Duo Zhang) <palomino219@gmail.com>
> > wrote:
> >
> > > OK it is HDFS-12574, it has also been ported to 2.8.4. Let's
> > > revive HBASE-20244.
> > >
> > > 2018-07-03 9:07 GMT+08:00 张铎(Duo Zhang) <palomino219@gmail.com>:
> > >
> > > > I think it is fine to just use the original hadoop jars in
> > > > HBase-2.0.1 to communicate with HDFS-2.8.4 or above?
> > > >
> > > > The async wal has hacked into the internals of DFSClient, so it is
> > > > easily broken when HDFS is upgraded.
> > > >
> > > > I can take a look at the 2.8.4 problem, but for 3.x there is no
> > > > production-ready release yet, so there is no plan to fix it yet.
> > > >
> > > > 2018-07-03 8:59 GMT+08:00 Sean Busbey <busbey@apache.org>:
> > > >
> > > >> That's just a warning. Checking on HDFS-11644, it's only present in
> > > >> Hadoop 2.9+ so seeing a lack of it with HDFS in 2.8.4 is expected.
> > > >> (Presuming you are deploying on top of HDFS and not e.g.
> > > >> LocalFileSystem.)
> > > >>
> > > >> Are there any ERROR messages in the regionserver or master logs?
> > > >> Could you post them somewhere and provide a link here?
> > > >>
> > > >> On Mon, Jul 2, 2018 at 5:11 PM, Andrey Elenskiy
> > > >> <andrey.elenskiy@arista.com.invalid> wrote:
> > > >> > It's now stuck at Master Initializing and regionservers are
> > > >> > complaining with:
> > > >> >
> > > >> > 18/07/02 21:12:20 WARN util.CommonFSUtils: Your Hadoop installation
> > > >> > does not include the StreamCapabilities class from HDFS-11644, so we
> > > >> > will skip checking if any FSDataOutputStreams actually support
> > > >> > hflush/hsync. If you are running on top of HDFS this probably just
> > > >> > means you have an older version and this can be ignored. If you are
> > > >> > running on top of an alternate FileSystem implementation you should
> > > >> > manually verify that hflush and hsync are implemented; otherwise you
> > > >> > risk data loss and hard to diagnose errors when our assumptions are
> > > >> > violated.
> > > >> >
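That WARN boils down to a classpath probe: the StreamCapabilities interface arrived with HDFS-11644 in Hadoop 2.9, so it is legitimately absent on 2.8.4. A minimal sketch of the same kind of check (class name is mine, not HBase's):

```java
// Sketch: check whether the Hadoop jars on the classpath include the
// StreamCapabilities interface that HDFS-11644 added in Hadoop 2.9.
public class StreamCapabilitiesProbe {
    public static String probe() {
        try {
            Class.forName("org.apache.hadoop.fs.StreamCapabilities");
            return "StreamCapabilities present (Hadoop 2.9+)";
        } catch (ClassNotFoundException e) {
            // Expected on Hadoop 2.8.x; per the WARN, harmless on real HDFS.
            return "StreamCapabilities missing (expected on Hadoop 2.8.x)";
        }
    }

    public static void main(String[] args) {
        System.out.println(probe());
    }
}
```

As the WARN itself says, on real HDFS the missing class is benign; it only matters on alternate FileSystem implementations where hflush/hsync support is unverified.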
> > > >> > I'm guessing hbase 2.0.1 on top of hadoop 2.8.4 hasn't been ironed
> > > >> > out completely yet (at least not with stock hadoop jars), unless
> > > >> > I'm missing something.
> > > >> >
> > > >> > On Mon, Jul 2, 2018 at 3:02 PM, Mich Talebzadeh
> > > >> > <mich.talebzadeh@gmail.com> wrote:
> > > >> >
> > > >> >> You are lucky that HBase 2.0.1 worked with Hadoop 2.8.
> > > >> >>
> > > >> >> I tried HBase 2.0.1 with Hadoop 3.1 and there were endless problems
> > > >> >> with the region server crashing because of a WAL filesystem issue.
> > > >> >>
> > > >> >> thread - Hbase hbase-2.0.1, region server does not start on Hadoop 3.1
> > > >> >>
> > > >> >> Decided to roll back to HBase 1.2.6, which works with Hadoop 3.1.
> > > >> >>
> > > >> >> HTH
> > > >> >>
> > > >> >> Dr Mich Talebzadeh
> > > >> >>
> > > >> >>
> > > >> >>
> > > >> >> LinkedIn: https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
> > > >> >>
> > > >> >>
> > > >> >>
> > > >> >> http://talebzadehmich.wordpress.com
> > > >> >>
> > > >> >>
> > > >> >>
> > > >> >>
> > > >> >>
> > > >> >>
> > > >> >> On Mon, 2 Jul 2018 at 22:43, Andrey Elenskiy
> > > >> >> <andrey.elenskiy@arista.com.invalid> wrote:
> > > >> >>
> > > >> >> > <property>
> > > >> >> >   <name>hbase.wal.provider</name>
> > > >> >> >   <value>filesystem</value>
> > > >> >> > </property>
> > > >> >> >
> > > >> >> > Seems to fix it, but would be nice to actually try the fanout
> > > >> >> > wal with hadoop 2.8.4.
> > > >> >> >
> > > >> >> > On Mon, Jul 2, 2018 at 1:03 PM, Andrey Elenskiy
> > > >> >> > <andrey.elenskiy@arista.com> wrote:
> > > >> >> >
> > > >> >> > > Hello, we are running HBase 2.0.1 with official Hadoop 2.8.4
> > > >> >> > > jars and the hadoop 2.8.4 client
> > > >> >> > > (http://central.maven.org/maven2/org/apache/hadoop/hadoop-client/2.8.4/).
> > > >> >> > > Got the following exception on a regionserver, which brings it
> > > >> >> > > down:
> > > >> >> > >
> > > >> >> > > 18/07/02 18:51:06 WARN concurrent.DefaultPromise: An exception was thrown by org.apache.hadoop.hbase.io.asyncfs.FanOutOneBlockAsyncDFSOutputHelper$13.operationComplete()
> > > >> >> > > java.lang.Error: Couldn't properly initialize access to HDFS internals. Please update your WAL Provider to not make use of the 'asyncfs' provider. See HBASE-16110 for more information.
> > > >> >> > >   at org.apache.hadoop.hbase.io.asyncfs.FanOutOneBlockAsyncDFSOutputSaslHelper.<clinit>(FanOutOneBlockAsyncDFSOutputSaslHelper.java:268)
> > > >> >> > >   at org.apache.hadoop.hbase.io.asyncfs.FanOutOneBlockAsyncDFSOutputHelper.initialize(FanOutOneBlockAsyncDFSOutputHelper.java:661)
> > > >> >> > >   at org.apache.hadoop.hbase.io.asyncfs.FanOutOneBlockAsyncDFSOutputHelper.access$300(FanOutOneBlockAsyncDFSOutputHelper.java:118)
> > > >> >> > >   at org.apache.hadoop.hbase.io.asyncfs.FanOutOneBlockAsyncDFSOutputHelper$13.operationComplete(FanOutOneBlockAsyncDFSOutputHelper.java:720)
> > > >> >> > >   at org.apache.hadoop.hbase.io.asyncfs.FanOutOneBlockAsyncDFSOutputHelper$13.operationComplete(FanOutOneBlockAsyncDFSOutputHelper.java:715)
> > > >> >> > >   at org.apache.hbase.thirdparty.io.netty.util.concurrent.DefaultPromise.notifyListener0(DefaultPromise.java:507)
> > > >> >> > >   at org.apache.hbase.thirdparty.io.netty.util.concurrent.DefaultPromise.notifyListeners0(DefaultPromise.java:500)
> > > >> >> > >   at org.apache.hbase.thirdparty.io.netty.util.concurrent.DefaultPromise.notifyListenersNow(DefaultPromise.java:479)
> > > >> >> > >   at org.apache.hbase.thirdparty.io.netty.util.concurrent.DefaultPromise.notifyListeners(DefaultPromise.java:420)
> > > >> >> > >   at org.apache.hbase.thirdparty.io.netty.util.concurrent.DefaultPromise.trySuccess(DefaultPromise.java:104)
> > > >> >> > >   at org.apache.hbase.thirdparty.io.netty.channel.DefaultChannelPromise.trySuccess(DefaultChannelPromise.java:82)
> > > >> >> > >   at org.apache.hbase.thirdparty.io.netty.channel.epoll.AbstractEpollChannel$AbstractEpollUnsafe.fulfillConnectPromise(AbstractEpollChannel.java:638)
> > > >> >> > >   at org.apache.hbase.thirdparty.io.netty.channel.epoll.AbstractEpollChannel$AbstractEpollUnsafe.finishConnect(AbstractEpollChannel.java:676)
> > > >> >> > >   at org.apache.hbase.thirdparty.io.netty.channel.epoll.AbstractEpollChannel$AbstractEpollUnsafe.epollOutReady(AbstractEpollChannel.java:552)
> > > >> >> > >   at org.apache.hbase.thirdparty.io.netty.channel.epoll.EpollEventLoop.processReady(EpollEventLoop.java:394)
> > > >> >> > >   at org.apache.hbase.thirdparty.io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:304)
> > > >> >> > >   at org.apache.hbase.thirdparty.io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:858)
> > > >> >> > >   at org.apache.hbase.thirdparty.io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:138)
> > > >> >> > >   at java.lang.Thread.run(Thread.java:748)
> > > >> >> > > Caused by: java.lang.NoSuchMethodException: org.apache.hadoop.hdfs.DFSClient.decryptEncryptedDataEncryptionKey(org.apache.hadoop.fs.FileEncryptionInfo)
> > > >> >> > >   at java.lang.Class.getDeclaredMethod(Class.java:2130)
> > > >> >> > >   at org.apache.hadoop.hbase.io.asyncfs.FanOutOneBlockAsyncDFSOutputSaslHelper.createTransparentCryptoHelper(FanOutOneBlockAsyncDFSOutputSaslHelper.java:232)
> > > >> >> > >   at org.apache.hadoop.hbase.io.asyncfs.FanOutOneBlockAsyncDFSOutputSaslHelper.<clinit>(FanOutOneBlockAsyncDFSOutputSaslHelper.java:262)
> > > >> >> > >   ... 18 more
> > > >> >> > >
> > > >> >> > > FYI, we don't have encryption enabled. Let me know if you need
> > > >> >> > > more info about our setup.
> > > >> >> > >
> > > >> >> >
> > > >> >>
> > > >>
> > > >
> > > >
> > >
> >
>
