hbase-user mailing list archives

From Ian Brooks <i.bro...@sensewhere.com>
Subject major compaction race condition
Date Fri, 06 Mar 2015 10:03:20 GMT
Hi,

I'm currently seeing an issue with the interaction between HBase and one of our applications
that seems to occur when a request is made against a region while it's undergoing a major compaction.

The application gets a list of rowkeys from an index table, then for each block of 1000 rowkeys
fetches the data from the main table by calling get with an array of those keys. This is normally
fine; the process that fetches this data runs many times a second. Occasionally, however, it causes
one of the region servers (a different one each time) to start using 300% CPU permanently until I
restart the client application.
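For reference, the access pattern looks roughly like the sketch below. This is a minimal,
simplified version against the 0.98 client API; the table name, column names, and index lookup
are placeholders rather than our real code:

    import java.io.IOException;
    import java.util.ArrayList;
    import java.util.List;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.client.Get;
    import org.apache.hadoop.hbase.client.HConnection;
    import org.apache.hadoop.hbase.client.HConnectionManager;
    import org.apache.hadoop.hbase.client.HTableInterface;
    import org.apache.hadoop.hbase.client.Result;

    public class BatchGetExample {
        public static void main(String[] args) throws IOException {
            Configuration conf = HBaseConfiguration.create();
            HConnection connection = HConnectionManager.createConnection(conf);
            // "mainTable" is a placeholder for the real data table
            HTableInterface table = connection.getTable("mainTable");
            try {
                // rowkeys previously read from the index table (placeholder)
                List<byte[]> rowkeys = fetchRowkeysFromIndex();

                // batch the keys in blocks of 1000 and issue one multi-get per block
                final int batchSize = 1000;
                for (int i = 0; i < rowkeys.size(); i += batchSize) {
                    List<Get> gets = new ArrayList<Get>();
                    int end = Math.min(i + batchSize, rowkeys.size());
                    for (byte[] key : rowkeys.subList(i, end)) {
                        gets.add(new Get(key));
                    }
                    Result[] results = table.get(gets);
                    // ... process results ...
                }
            } finally {
                table.close();
                connection.close();
            }
        }

        // placeholder for the index-table scan that produces the rowkeys
        private static List<byte[]> fetchRowkeysFromIndex() {
            return new ArrayList<byte[]>();
        }
    }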

What I see in the regionserver logs is:

2015-03-06 07:01:49,726 INFO  [regionserver16020.leaseChecker] regionserver.HRegionServer:
Scanner 14676901 lease expired on region requestData,230000000000,1407498276182.1b8c522e55b5f6bd5b60e007fa069237.
2015-03-06 07:01:49,794 WARN  [RpcServer.reader=4,port=16020] ipc.RpcServer: RpcServer.listener,port=16020:
count of bytes read: 0
java.io.IOException: Connection reset by peer
        at sun.nio.ch.FileDispatcherImpl.read0(Native Method)
        at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39)
        at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223)
        at sun.nio.ch.IOUtil.read(IOUtil.java:197)
        at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:379)
        at org.apache.hadoop.hbase.ipc.RpcServer.channelRead(RpcServer.java:2229)
        at org.apache.hadoop.hbase.ipc.RpcServer$Connection.readAndProcess(RpcServer.java:1415)
        at org.apache.hadoop.hbase.ipc.RpcServer$Listener.doRead(RpcServer.java:790)
        at org.apache.hadoop.hbase.ipc.RpcServer$Listener$Reader.doRunLoop(RpcServer.java:581)
        at org.apache.hadoop.hbase.ipc.RpcServer$Listener$Reader.run(RpcServer.java:556)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)
2015-03-06 07:01:49,804 INFO  [rs(##########,16020,1417531485263)-snapshot-pool11246-thread-1]
regionserver.HStore: Added hdfs://swcluster1/user/hbase/data/default/requestData/e11543982df2ce616c9efad5ce5c3784/data/04eff77d329b4f30965e94765072e5b6,
entries=48, sequenceid=352934, filesize=15.2 K
2015-03-06 07:01:49,804 INFO  [rs(##########,16020,1417531485263)-snapshot-pool11246-thread-1]
regionserver.HRegion: Finished memstore flush of ~42.2 K/43200, currentsize=0/0 for region
requestData,370000000000,1407498276182.e11543982df2ce616c9efad5ce5c3784. in 315ms, sequenceid=352934,
compaction requested=false
2015-03-06 07:01:49,805 WARN  [RpcServer.handler=86,port=16020] ipc.RpcServer: RpcServer.respondercallId:
9816700 service: ClientService methodName: Scan size: 27 connection: a.b.c.d:36465: output
error
2015-03-06 07:01:49,805 DEBUG [rs(##########,16020,1417531485263)-snapshot-pool11246-thread-1]
regionserver.HRegion: Storing region-info for snapshot.
2015-03-06 07:01:49,822 WARN  [RpcServer.handler=86,port=16020] ipc.RpcServer: RpcServer.handler=86,port=16020:
caught a ClosedChannelException, this means that the server was processing a request but the
client went away. The error message was: null
2015-03-06 07:01:49,823 WARN  [RpcServer.handler=7,port=16020] ipc.RpcServer: RpcServer.respondercallId:
9816459 service: ClientService methodName: Scan size: 27 connection: a.b.c.d:36465: output
error
2015-03-06 07:01:49,823 WARN  [RpcServer.handler=7,port=16020] ipc.RpcServer: RpcServer.handler=7,port=16020:
caught a ClosedChannelException, this means that the server was processing a request but the
client went away. The error message was: null

and then thousands of lines like:
2015-03-06 07:04:32,337 WARN  [RpcServer.handler=94,port=16020] ipc.RpcServer: RpcServer.respondercallId:
9818168 service: ClientService methodName: Scan size: 27 connection: a.b.c.d:37431: output
error
2015-03-06 07:04:32,337 WARN  [RpcServer.handler=94,port=16020] ipc.RpcServer: RpcServer.handler=94,port=16020:
caught a ClosedChannelException, this means that the server was processing a request but the
client went away. The error message was: null
2015-03-06 07:04:32,338 WARN  [RpcServer.handler=32,port=16020] ipc.RpcServer: RpcServer.respondercallId:
9818236 service: ClientService methodName: Scan size: 27 connection: a.b.c.d:37431: output
error
2015-03-06 07:04:32,338 WARN  [RpcServer.handler=32,port=16020] ipc.RpcServer: RpcServer.handler=32,port=16020:
caught a ClosedChannelException, this means that the server was processing a request but the
client went away. The error message was: null
2015-03-06 07:04:33,424 INFO  [regionserver16020.leaseChecker] regionserver.HRegionServer:
Scanner 14677148 lease expired on region requestData,230000000000,1407498276182.1b8c522e55b5f6bd5b60e007fa069237.
2015-03-06 07:04:33,531 INFO  [regionserver16020.leaseChecker] regionserver.HRegionServer:
Scanner 14677194 lease expired on region requestData,230000000000,1407498276182.1b8c522e55b5f6bd5b60e007fa069237.
2015-03-06 07:04:33,531 INFO  [regionserver16020.leaseChecker] regionserver.HRegionServer:
Scanner 14677218 lease expired on region requestData,230000000000,1407498276182.1b8c522e55b5f6bd5b60e007fa069237.
2015-03-06 07:04:33,531 INFO  [regionserver16020.leaseChecker] regionserver.HRegionServer:
Scanner 14677188 lease expired on region requestData,230000000000,1407498276182.1b8c522e55b5f6bd5b60e007fa069237.
2015-03-06 07:04:33,531 INFO  [regionserver16020.leaseChecker] regionserver.HRegionServer:
Scanner 14677182 lease expired on region requestData,230000000000,1407498276182.1b8c522e55b5f6bd5b60e007fa069237.

until I restart the client.

I'm using HBase 0.98.3 on Hadoop 2.4.0.

Is there a specific way to handle this in our code to prevent the regionserver from permanently
trying to process and return the data?

-Ian

