hbase-user mailing list archives

From Kristoffer Sjögren <sto...@gmail.com>
Subject Re: Stuck closing region / region is flushing
Date Sat, 14 Mar 2015 21:58:03 GMT
I think I found the thread that is stuck. Is restarting the server harmless
in this state?

"RS_CLOSE_REGION-hdfs-ix03.se-ix.delta.prod,60020,1424687995350-1" prio=10
tid=0x00007f75a0008000 nid=0x23ee in Object.wait() [0x00007f757d30b000]
   java.lang.Thread.State: WAITING (on object monitor)
    at java.lang.Object.wait(Native Method)
    at java.lang.Object.wait(Object.java:503)
    at org.apache.hadoop.hdfs.DFSOutputStream.waitAndQueueCurrentPacket(DFSOutputStream.java:1411)
    - locked <0x00000007544573e8> (a java.util.LinkedList)
    at org.apache.hadoop.hdfs.DFSOutputStream.writeChunk(DFSOutputStream.java:1479)
    - locked <0x0000000756780218> (a org.apache.hadoop.hdfs.DFSOutputStream)
    at org.apache.hadoop.fs.FSOutputSummer.writeChecksumChunk(FSOutputSummer.java:173)
    at org.apache.hadoop.fs.FSOutputSummer.write1(FSOutputSummer.java:116)
    at org.apache.hadoop.fs.FSOutputSummer.write(FSOutputSummer.java:102)
    - locked <0x0000000756780218> (a org.apache.hadoop.hdfs.DFSOutputStream)
    at org.apache.hadoop.fs.FSDataOutputStream$PositionCache.write(FSDataOutputStream.java:54)
    at java.io.DataOutputStream.write(DataOutputStream.java:107)
    - locked <0x00000007543ef268> (a org.apache.hadoop.hdfs.client.HdfsDataOutputStream)
    at java.io.FilterOutputStream.write(FilterOutputStream.java:97)
    at org.apache.hadoop.hbase.io.hfile.HFileBlock$Writer.writeHeaderAndData(HFileBlock.java:1061)
    at org.apache.hadoop.hbase.io.hfile.HFileBlock$Writer.writeHeaderAndData(HFileBlock.java:1047)
    at org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$BlockIndexWriter.writeIntermediateBlock(HFileBlockIndex.java:952)
    at org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$BlockIndexWriter.writeIntermediateLevel(HFileBlockIndex.java:935)
    at org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$BlockIndexWriter.writeIndexBlocks(HFileBlockIndex.java:844)
    at org.apache.hadoop.hbase.io.hfile.HFileWriterV2.close(HFileWriterV2.java:403)
    at org.apache.hadoop.hbase.regionserver.StoreFile$Writer.close(StoreFile.java:1272)
    at org.apache.hadoop.hbase.regionserver.Store.internalFlushCache(Store.java:835)
    - locked <0x000000075d8b2110> (a java.lang.Object)
    at org.apache.hadoop.hbase.regionserver.Store.flushCache(Store.java:746)
    at org.apache.hadoop.hbase.regionserver.Store$StoreFlusherImpl.flushCache(Store.java:2348)
    at org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:1580)
    at org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:1479)
    at org.apache.hadoop.hbase.regionserver.HRegion.doClose(HRegion.java:992)
    at org.apache.hadoop.hbase.regionserver.HRegion.close(HRegion.java:956)
    - locked <0x000000075d97b628> (a java.lang.Object)
    at org.apache.hadoop.hbase.regionserver.handler.CloseRegionHandler.process(CloseRegionHandler.java:119)
    at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:175)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
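
For anyone searching a full jstack dump for a thread like the one above, here is a minimal sketch, assuming the dump has been saved as plain text. The function names (parse_threads, stuck_close_threads) and the "RS_CLOSE_REGION" name prefix filter are illustrative choices and not part of HBase. It also tolerates the mail-client wrapping where "at" ends up on its own line.

```python
import re

# A thread block in a HotSpot thread dump starts with the quoted thread name.
HEADER = re.compile(r'^"(?P<name>[^"]+)"')
STATE = re.compile(r'java\.lang\.Thread\.State: (?P<state>\w+)')


def parse_threads(dump):
    """Split thread-dump text into dicts with name, state and frames."""
    threads = []
    current = None
    pending_at = False  # True when "at" was wrapped onto its own line
    for raw in dump.splitlines():
        line = raw.strip()
        header = HEADER.match(line)
        if header:
            current = {"name": header.group("name"), "state": None, "frames": []}
            threads.append(current)
            pending_at = False
            continue
        if current is None or not line:
            continue
        state = STATE.search(line)
        if state:
            current["state"] = state.group("state")
        elif line.startswith("- "):
            continue  # monitor lines such as "- locked <0x...>" are skipped
        elif line == "at":
            pending_at = True
        elif line.startswith("at "):
            current["frames"].append(line[3:])
            pending_at = False
        elif pending_at:
            current["frames"].append(line)
            pending_at = False
    return threads


def stuck_close_threads(dump, prefix="RS_CLOSE_REGION"):
    """Return threads whose name starts with prefix and that are not RUNNABLE."""
    return [t for t in parse_threads(dump)
            if t["name"].startswith(prefix) and t["state"] != "RUNNABLE"]
```

Calling stuck_close_threads on the text of a dump returns the matching threads; the first entry of each thread's "frames" list is the innermost frame, which shows where the thread is waiting.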


On Sat, Mar 14, 2015 at 9:43 PM, Ted Yu <yuzhihong@gmail.com> wrote:

> bq. flush the region manually using shell?
>
> I doubt that would work - you can give it a try.
> Please take jstack of region server in case you need to restart the server.
>
> BTW HBASE-10499 didn't go into 0.94 (maybe it should have). Please consider
> upgrading.
>
> Cheers
>
> On Sat, Mar 14, 2015 at 1:30 PM, Kristoffer Sjögren <stoffe@gmail.com>
> wrote:
>
> > Hi Ted
> >
> > Sorry I forgot to mention, hbase-0.94.6 cdh 4.4.
> >
> > Yeah, it was a pretty write intensive scenario that I think triggered it
> > (importing a lot of datapoints into opentsdb).
> >
> > Do I flush the region manually using shell?
> >
> > Cheers,
> > -Kristoffer
> >
> > On Sat, Mar 14, 2015 at 9:22 PM, Ted Yu <yuzhihong@gmail.com> wrote:
> >
> > > Which release of HBase are you using ?
> > >
> > > I wonder if your cluster was hit with HBASE-10499.
> > >
> > > Cheers
> > >
> > > On Sat, Mar 14, 2015 at 1:13 PM, Kristoffer Sjögren <stoffe@gmail.com>
> > > wrote:
> > >
> > > > Hi
> > > >
> > > > It seems one of our region servers has been stuck closing a region for
> > > > almost 22 hours. Puts or gets eventually fail with an exception [1].
> > > >
> > > > Is there any safe way to release the region like restarting the region
> > > > server?
> > > >
> > > > Cheers,
> > > > -Kristoffer
> > > >
> > > >
> > > > [1]
> > > >
> > > > 2015-03-14 21:02:24,316 INFO org.apache.hadoop.hbase.regionserver.HRegion: Failed to unblock updates for region tsdb,\x00\x00\x9ETU\xAC@\x00\x00\x01\x00\x00\xAD\x00\x00\x05\x00\x00\xA7,1426282871862.4512f92b3d81e9142542d3b458223b63. 'IPC Server handler 9 on 60020' in 60000ms. The region is still busy.
> > > > 2015-03-14 21:02:24,316 ERROR org.apache.hadoop.hbase.regionserver.HRegionServer:
> > > > org.apache.hadoop.hbase.RegionTooBusyException: region is flushing
> > > >     at org.apache.hadoop.hbase.regionserver.HRegion.checkResources(HRegion.java:2731)
> > > >     at org.apache.hadoop.hbase.regionserver.HRegion.put(HRegion.java:2002)
> > > >     at org.apache.hadoop.hbase.regionserver.HRegionServer.put(HRegionServer.java:2114)
> > > >     at sun.reflect.GeneratedMethodAccessor109.invoke(Unknown Source)
> > > >     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> > > >     at java.lang.reflect.Method.invoke(Method.java:606)
> > > >     at org.apache.hadoop.hbase.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:320)
> > > >     at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1428)
