trafodion-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Sean Broeder <sean.broe...@esgyn.com>
Subject RE: Trafodion master Daily Test Result - 332 - Still Failing
Date Thu, 15 Sep 2016 16:46:32 GMT
RE: Trafodion master Daily Test Result - 332 - Still Failing

Sure.  I’ll have a look



*From:* Narain Arvind [mailto:narain.arvind@gmail.com]
*Sent:* Thursday, September 15, 2016 9:22 AM
*To:* dev@trafodion.incubator.apache.org
*Cc:* Sean Broeder <sean.broeder@esgyn.com>
*Subject:* RE: Trafodion master Daily Test Result - 332 - Still Failing



Timeout in core-regress-executor-hdp is during drop of table T106A.

https://jenkins.esgyn.com/job/core-regress-executor-hdp/369/console

Messages related to transactional.SplitBalancerHelper repeated in Region
Server logs.

Sean could you please take a look ? Thanks.

http://traf-testlogs.esgyn.com/Daily-master/332/regress-executor-hdp/hbase-logs/hbase-hbase-regionserver-slave-ahw23.log

2016-09-15 10:25:01,806 INFO  [RS_CLOSE_REGION-slave-ahw23:16020-2]
regionserver.HRegion: Closed
TRAFODION.SCH.T106A,\x00\x00\x00\x01\x00\x00\x00\x00,1473934644524.f32b6c8d8f3932729b9824f51a95a63e.

2016-09-15 10:25:02,059 INFO
[PriorityRpcServer.handler=8,queue=0,port=16020]
transactional.SplitBalanceHelper: scannersListClear Active Scanner found,
ScannerId: 0 Txid: 1054 Region:
TRAFODION.SCH.T106A,\x00\x00\x00\x02\x00\x00\x00\x00,1473934644524.39d9ea92d6f2109a682f66f38a795b79.

…

2016-09-15 10:31:37,204 INFO  [regionserver/
slave-ahw23.trafodion.org/172.31.3.234:16020.leaseChecker]
regionserver.RSRpcServices: Scanner 12971 lease expired on region
TRAFODION.SCH.T106A,\x00\x00\x00\x01\x00\x00\x00\x00,1473934644524.f32b6c8d8f3932729b9824f51a95a63e.

2016-09-15 10:31:37,205 ERROR [regionserver/
slave-ahw23.trafodion.org/172.31.3.234:16020.leaseChecker]
regionserver.RSRpcServices: Closing scanner for
TRAFODION.SCH.T106A,\x00\x00\x00\x01\x00\x00\x00\x00,1473934644524.f32b6c8d8f3932729b9824f51a95a63e.

org.apache.hadoop.hbase.NotServingRegionException: Region
TRAFODION.SCH.T106A,\x00\x00\x00\x01\x00\x00\x00\x00,1473934644524.f32b6c8d8f3932729b9824f51a95a63e.
is not online on slave-ahw23.trafodion.org,16020,1473931541282

        at
org.apache.hadoop.hbase.regionserver.HRegionServer.getRegionByEncodedName(HRegionServer.java:2898)

        at
org.apache.hadoop.hbase.regionserver.HRegionServer.getRegion(HRegionServer.java:2875)

        at
org.apache.hadoop.hbase.regionserver.RSRpcServices$ScannerListener.leaseExpired(RSRpcServices.java:285)

        at org.apache.hadoop.hbase.regionserver.Leases.run(Leases.java:121)

        at java.lang.Thread.run(Thread.java:745)

…

2016-09-15 11:29:49,857 INFO
[PriorityRpcServer.handler=1,queue=1,port=16020]
transactional.SplitBalanceHelper: scannersListClear Active Scanner found,
ScannerId: 0 Txid: 1054 Region:
TRAFODION.SCH.T106A,\x00\x00\x00\x02\x00\x00\x00\x00,1473934644524.39d9ea92d6f2109a682f66f38a795b79.

2016-09-15 11:29:49,864 INFO
[PriorityRpcServer.handler=19,queue=1,port=16020]
transactional.SplitBalanceHelper: scannersListClear Active Scanner found,
ScannerId: 0 Txid: 1054 Region:
TRAFODION.SCH.T106A,\x00\x00\x00\x02\x00\x00\x00\x00,1473934644524.39d9ea92d6f2109a682f66f38a795b79.

>From the master logs:

http://traf-testlogs.esgyn.com/Daily-master/332/regress-executor-hdp/hbase-logs/hbase-hbase-master-slave-ahw23.log

2016-09-15 10:25:01,810 INFO  [AM.ZK.Worker-pool2-t265]
master.RegionStates: Transition {f32b6c8d8f3932729b9824f51a95a63e
state=PENDING_CLOSE, ts=1473935101056,
server=slave-ahw23.trafodion.org,16020,1473931541282}
to {f32b6c8d8f3932729b9824f51a95a63e state=OFFLINE, ts=1473935101810,
server=slave-ahw23.trafodion.org,16020,1473931541282}

2016-09-15 10:25:01,810 INFO  [AM.ZK.Worker-pool2-t265]
master.RegionStates: Offlined f32b6c8d8f3932729b9824f51a95a63e from
slave-ahw23.trafodion.org,16020,1473931541282

2016-09-15 10:26:31,059 INFO
[slave-ahw23.trafodion.org,16000,1473931529194-org.apache.hadoop.hbase.master.procedure.DisableTableProcedure$BulkDisabler-2]
master.AssignmentManager: Server slave-ahw23.trafodion.org,16020,1473931541282
returned java.io.IOException: Call to
slave-ahw23.trafodion.org/172.31.3.234:16020 failed on local exception:
org.apache.hadoop.hbase.ipc.CallTimeoutException: Call id=496,
waitTime=90001, operationTimeout=90000 expired. for
TRAFODION.SCH.T106A,\x00\x00\x00\x02\x00\x00\x00\x00,1473934644524.39d9ea92d6f2109a682f66f38a795b79.,
try=1 of 10

java.io.IOException: Call to slave-ahw23.trafodion.org/172.31.3.234:16020
failed on local exception:
org.apache.hadoop.hbase.ipc.CallTimeoutException: Call id=496,
waitTime=90001, operationTimeout=90000 expired.

        at
org.apache.hadoop.hbase.ipc.RpcClientImpl.wrapException(RpcClientImpl.java:1262)

        at
org.apache.hadoop.hbase.ipc.RpcClientImpl.call(RpcClientImpl.java:1230)

        at
org.apache.hadoop.hbase.ipc.AbstractRpcClient.callBlockingMethod(AbstractRpcClient.java:213)

        at
org.apache.hadoop.hbase.ipc.AbstractRpcClient$BlockingRpcChannelImplementation.callBlockingMethod(AbstractRpcClient.java:287)

        at
org.apache.hadoop.hbase.protobuf.generated.AdminProtos$AdminService$BlockingStub.closeRegion(AdminProtos.java:23149)

        at
org.apache.hadoop.hbase.protobuf.ProtobufUtil.closeRegion(ProtobufUtil.java:1737)

        at
org.apache.hadoop.hbase.master.ServerManager.sendRegionClose(ServerManager.java:809)

        at
org.apache.hadoop.hbase.master.AssignmentManager.unassign(AssignmentManager.java:1853)

        at
org.apache.hadoop.hbase.master.AssignmentManager.unassign(AssignmentManager.java:2571)

        at
org.apache.hadoop.hbase.master.AssignmentManager.unassign(AssignmentManager.java:2583)

        at
org.apache.hadoop.hbase.master.AssignmentManager.unassign(AssignmentManager.java:2458)

        at
org.apache.hadoop.hbase.master.procedure.DisableTableProcedure$BulkDisabler$1.run(DisableTableProcedure.java:534)

        at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)

        at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)

        at java.lang.Thread.run(Thread.java:745)

Caused by: org.apache.hadoop.hbase.ipc.CallTimeoutException: Call id=496,
waitTime=90001, operationTimeout=90000 expired.

        at org.apache.hadoop.hbase.ipc.Call.checkAndSetTimeout(Call.java:70)

        at
org.apache.hadoop.hbase.ipc.RpcClientImpl.call(RpcClientImpl.java:1204)

        ... 13 more

>From the dtm logs:

http://traf-testlogs.esgyn.com/Daily-master/332/regress-executor-hdp/traf_run/logs/trafodion.dtm.log

2016-09-15 10:35:08,544 ERROR dtm.HBaseTxClient: Returning from
HBaseTxClient:prepareCommit, txid: 1064 retval: RET_IOEXCEPTION IOException

org.apache.hadoop.hbase.exceptions.TimeoutIOException:
java.util.concurrent.TimeoutException: The procedure 672 is still running

        at
org.apache.hadoop.hbase.client.HBaseAdmin.disableTable(HBaseAdmin.java:1205)

        at
org.apache.hadoop.hbase.client.transactional.TransactionManager.disableTable(TransactionManager.java:2959)

        at
org.apache.hadoop.hbase.client.transactional.TransactionManager.prepareCommit(TransactionManager.java:1893)

        at
org.trafodion.dtm.HBaseTxClient.prepareCommit(HBaseTxClient.java:489)

Caused by: java.util.concurrent.TimeoutException: The procedure 672 is
still running

        at
org.apache.hadoop.hbase.client.HBaseAdmin$ProcedureFuture.waitProcedureResult(HBaseAdmin.java:4177)

        at
org.apache.hadoop.hbase.client.HBaseAdmin$ProcedureFuture.get(HBaseAdmin.java:4098)

        at
org.apache.hadoop.hbase.client.HBaseAdmin.disableTable(HBaseAdmin.java:1201)

        ... 3 more

Regards

Arvind

-----Original Message-----
From: steve.varnau@esgyn.com [mailto:steve.varnau@esgyn.com
<steve.varnau@esgyn.com>]
Sent: Thursday, September 15, 2016 4:54 AM
To: dev@trafodion.incubator.apache.org
Subject: Trafodion master Daily Test Result - 332 - Still Failing

Daily Automated Testing master

Jenkins Job:   https://jenkins.esgyn.com/job/Check-Daily-master/332/

Archived Logs: http://traf-testlogs.esgyn.com/Daily-master/332

Bld Downloads: http://traf-builds.esgyn.com

Changes since previous daily build:

[dbirdsall] [TRAFODION-2187] Fix DROP SCHEMA CASCADE when sample tables are
present

[anoop.sharma] jira TRAFODION-2184 groupby/orderby construct extensions and
enablement

[hzeller] [TRAFODION-2222] Trafodion on HDP 2.3 needs yarn client in class
path

[anoop.sharma] jira TRAFODION-2184 additional changes



Test Job Results:

FAILURE core-regress-executor-hdp (2 hr 20 min) SUCCESS
build-rh6-master-debug (30 min) SUCCESS build-rh6-master-release (35 min)
SUCCESS core-regress-charsets-cdh (44 min) SUCCESS
core-regress-charsets-hdp (43 min) SUCCESS core-regress-compGeneral-cdh (41
min) SUCCESS core-regress-compGeneral-hdp (1 hr 1 min) SUCCESS
core-regress-core-cdh (1 hr 1 min) SUCCESS core-regress-core-hdp (1 hr 25
min) SUCCESS core-regress-executor-cdh (1 hr 14 min) SUCCESS
core-regress-fullstack2-cdh (10 min) SUCCESS core-regress-fullstack2-hdp
(21 min) SUCCESS core-regress-hive-cdh (49 min) SUCCESS
core-regress-hive-hdp (48 min) SUCCESS core-regress-privs1-cdh (50 min)
SUCCESS core-regress-privs1-hdp (53 min) SUCCESS core-regress-privs2-cdh
(51 min) SUCCESS core-regress-privs2-hdp (1 hr 20 min) SUCCESS
core-regress-qat-cdh (29 min) SUCCESS core-regress-qat-hdp (38 min) SUCCESS
core-regress-seabase-cdh (1 hr 27 min) SUCCESS core-regress-seabase-hdp (1
hr 54 min) SUCCESS core-regress-udr-cdh (25 min) SUCCESS
core-regress-udr-hdp (40 min) SUCCESS jdbc_test-cdh (30 min) SUCCESS
jdbc_test-hdp (48 min) SUCCESS phoenix_part1_T2-cdh (1 hr 7 min) SUCCESS
phoenix_part1_T2-hdp (1 hr 19 min) SUCCESS phoenix_part1_T4-cdh (1 hr 2
min) SUCCESS phoenix_part1_T4-hdp (1 hr 23 min) SUCCESS
phoenix_part2_T2-cdh (1 hr 2 min) SUCCESS phoenix_part2_T2-hdp (1 hr 19
min) SUCCESS phoenix_part2_T4-cdh (1 hr 0 min) SUCCESS phoenix_part2_T4-hdp
(1 hr 19 min) SUCCESS pyodbc_test-cdh (11 min) SUCCESS pyodbc_test-hdp (27
min)

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message