hbase-user mailing list archives

From Stack <st...@duboce.net>
Subject Re: Deadlocked Regionserver process
Date Thu, 14 Jul 2011 21:59:05 GMT
What Lohit says, but also: what JVM are you running, and what options
are you feeding it?  The stack trace is a little crazy (especially the
mix-in of resource bundle loading).  We saw something similar over in
HBASE-3830 when someone was running a profiler.  Is that what is going
on here?

Thanks,
St.Ack

On Thu, Jul 14, 2011 at 11:36 AM, Matt Davies <matt.davies@tynt.com> wrote:
> Hey everyone,
>
> We periodically see a situation where the regionserver process exists in the
> process list and its ZooKeeper thread keeps sending the keepalive, so the
> master won't remove it from the active list, yet the regionserver will not
> serve data.
>
> We are running Hadoop (cdh3u0) and HBase 0.90.3 (Apache version), under load
> from an internal testing tool.
>
>
> I've taken a jstack of the process and found this:
>
> Found one Java-level deadlock:
> =============================
> "IPC Server handler 99 on 60020":
>  waiting to lock monitor 0x0000000047f97000 (object 0x00002aaab8ef07e8, a org.apache.hadoop.hbase.regionserver.MemStoreFlusher),
>  which is held by "IPC Server handler 64 on 60020"
> "IPC Server handler 64 on 60020":
>  waiting for ownable synchronizer 0x00002aaab8eea130, (a java.util.concurrent.locks.ReentrantLock$NonfairSync),
>  which is held by "regionserver60020.cacheFlusher"
> "regionserver60020.cacheFlusher":
>  waiting to lock monitor 0x0000000047f97000 (object 0x00002aaab8ef07e8, a org.apache.hadoop.hbase.regionserver.MemStoreFlusher),
>  which is held by "IPC Server handler 64 on 60020"
>
> Java stack information for the threads listed above:
> ===================================================
> "IPC Server handler 99 on 60020":
>        at org.apache.hadoop.hbase.regionserver.MemStoreFlusher.reclaimMemStoreMemory(MemStoreFlusher.java:434)
>        - waiting to lock <0x00002aaab8ef07e8> (a org.apache.hadoop.hbase.regionserver.MemStoreFlusher)
>        at org.apache.hadoop.hbase.regionserver.HRegionServer.multi(HRegionServer.java:2529)
>        at sun.reflect.GeneratedMethodAccessor27.invoke(Unknown Source)
>        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>        at java.lang.reflect.Method.invoke(Method.java:597)
>        at org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:570)
>        at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1039)
> "IPC Server handler 64 on 60020":
>        at sun.misc.Unsafe.park(Native Method)
>        - parking to wait for  <0x00002aaab8eea130> (a java.util.concurrent.locks.ReentrantLock$NonfairSync)
>        at java.util.concurrent.locks.LockSupport.park(LockSupport.java:158)
>        at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:747)
>        at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:778)
>        at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1114)
>        at java.util.concurrent.locks.ReentrantLock$NonfairSync.lock(ReentrantLock.java:186)
>        at java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:262)
>        at org.apache.hadoop.hbase.regionserver.MemStoreFlusher.reclaimMemStoreMemory(MemStoreFlusher.java:435)
>        - locked <0x00002aaab8ef07e8> (a org.apache.hadoop.hbase.regionserver.MemStoreFlusher)
>        at org.apache.hadoop.hbase.regionserver.HRegionServer.multi(HRegionServer.java:2529)
>        at sun.reflect.GeneratedMethodAccessor27.invoke(Unknown Source)
>        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>        at java.lang.reflect.Method.invoke(Method.java:597)
>        at org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:570)
>        at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1039)
> "regionserver60020.cacheFlusher":
>        at java.util.ResourceBundle.endLoading(ResourceBundle.java:1506)
>        - waiting to lock <0x00002aaab8ef07e8> (a org.apache.hadoop.hbase.regionserver.MemStoreFlusher)
>        at java.util.ResourceBundle.findBundle(ResourceBundle.java:1379)
>        at java.util.ResourceBundle.findBundle(ResourceBundle.java:1292)
>        at java.util.ResourceBundle.getBundleImpl(ResourceBundle.java:1234)
>        at java.util.ResourceBundle.getBundle(ResourceBundle.java:832)
>        at sun.util.resources.LocaleData$1.run(LocaleData.java:127)
>        at java.security.AccessController.doPrivileged(Native Method)
>        at sun.util.resources.LocaleData.getBundle(LocaleData.java:125)
>        at sun.util.resources.LocaleData.getTimeZoneNames(LocaleData.java:97)
>        at sun.util.TimeZoneNameUtility.getBundle(TimeZoneNameUtility.java:115)
>        at sun.util.TimeZoneNameUtility.retrieveDisplayNames(TimeZoneNameUtility.java:80)
>        at java.util.TimeZone.getDisplayNames(TimeZone.java:399)
>        at java.util.TimeZone.getDisplayName(TimeZone.java:350)
>        at java.util.Date.toString(Date.java:1025)
>        at java.lang.String.valueOf(String.java:2826)
>        at java.lang.StringBuilder.append(StringBuilder.java:115)
>        at org.apache.hadoop.hbase.regionserver.PriorityCompactionQueue$CompactionRequest.toString(PriorityCompactionQueue.java:114)
>        at java.lang.String.valueOf(String.java:2826)
>        at java.lang.StringBuilder.append(StringBuilder.java:115)
>        at org.apache.hadoop.hbase.regionserver.PriorityCompactionQueue.addToRegionsInQueue(PriorityCompactionQueue.java:145)
>        - locked <0x00002aaab8f2dc58> (a java.util.HashMap)
>        at org.apache.hadoop.hbase.regionserver.PriorityCompactionQueue.add(PriorityCompactionQueue.java:188)
>        at org.apache.hadoop.hbase.regionserver.CompactSplitThread.requestCompaction(CompactSplitThread.java:140)
>        - locked <0x00002aaab8894048> (a org.apache.hadoop.hbase.regionserver.CompactSplitThread)
>        at org.apache.hadoop.hbase.regionserver.CompactSplitThread.requestCompaction(CompactSplitThread.java:118)
>        - locked <0x00002aaab8894048> (a org.apache.hadoop.hbase.regionserver.CompactSplitThread)
>        at org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushRegion(MemStoreFlusher.java:393)
>        at org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushRegion(MemStoreFlusher.java:366)
>        at org.apache.hadoop.hbase.regionserver.MemStoreFlusher.run(MemStoreFlusher.java:240)
>
>
> Any ideas on how I could prevent this or let the master know about it? I've
> written an app that will check all regionservers periodically for such a
> lockup, but I can't run it constantly.
>
> I can provide more of the jstack if that is helpful.
>
> -Matt
>
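
On detecting such a lockup automatically: here is a minimal sketch of an
in-process check, assuming a Java 6 or later JVM. The class name
DeadlockProbe and its method are hypothetical and not part of HBase. It
relies on the standard ThreadMXBean.findDeadlockedThreads() call, which
looks for cycles through both object monitors and ownable synchronizers
such as ReentrantLock. The older findMonitorDeadlockedThreads() considers
monitors only, so it would likely miss a cycle like the one in the jstack
above, where one edge is a ReentrantLock.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration; not an HBase class.
public class DeadlockProbe {

    /** Returns the names of threads currently in a deadlock cycle (empty if none). */
    public static List<String> deadlockedThreadNames() {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();

        // Prefer the check that also covers java.util.concurrent locks;
        // fall back to the monitor-only check if the JVM cannot track them.
        long[] ids = mx.isSynchronizerUsageSupported()
                ? mx.findDeadlockedThreads()
                : mx.findMonitorDeadlockedThreads();

        List<String> names = new ArrayList<String>();
        if (ids == null) {
            return names; // both calls return null when no deadlock is found
        }
        for (ThreadInfo info : mx.getThreadInfo(ids)) {
            if (info != null) { // null if the thread has since exited
                names.add(info.getThreadName());
            }
        }
        return names;
    }

    public static void main(String[] args) {
        List<String> names = deadlockedThreadNames();
        if (names.isEmpty()) {
            System.out.println("no deadlock detected");
        } else {
            System.out.println("deadlocked threads: " + names);
        }
    }
}
```

This only inspects the JVM it runs in. Checking a regionserver from another
process would need a JMX connection to that JVM, which this sketch does not
cover.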
