trafodion-dev mailing list archives

From Eric Owhadi <eric.owh...@esgyn.com>
Subject I am puzzled with a jenkins test failure on CORE/TESTRTS
Date Thu, 21 Jan 2016 21:02:23 GMT
Dear Trafodioneers,



For the last 2 days I have been hunting, with the great help of Steve Arnaud,
a weird failure blocking the merge of PR 255 (predicate pushdown V2)
[TRAFODION-1662].

13 days ago, PR 255 was passing Jenkins.

6 days ago, after applying changes related to code review on PR 255 and
syncing to latest master, Jenkins started failing core/TESTRTS with a
core dump.



So I created a fake PR on a new branch, with the initial code from 13 days
ago, and merged it with latest master -> Jenkins fails with the same error.
This demonstrates that the PR rework is not the root cause of this sudden
wrong behavior.



So the issue is a combination of my new code (initial or after rework) with
some changes in master that happened between 13 days ago and 6 days ago.



This failure does not happen on a dev environment (tested in both debug and
release mode).



Steve was able to reproduce it on a Jenkins server, and narrowed down the
condition for its appearance to the sequence of core/TEST005 followed by
core/TESTRTS.

Without TEST005 as a catalyst, the issue does not manifest on the Jenkins
server either.



The stack trace at the time of the crash is not very helpful. It shows
ComCondition::setSQLCODE aborting with newSQLCODE=-8926 (8926 is the const
value of EXE_STAT_NOT_FOUND).

EXE_STAT_NOT_FOUND can come from 34 different code paths and, unfortunately,
the structure of the code does not make it possible to tell from the stack
trace which one of the 34 was taken at the time of death. -> I hate Murphy…

Thread 1 (Thread 0x7f26080e23c0 (LWP 1505)):

#0  0x00007f2605260625 in raise () from /lib64/libc.so.6

#1  0x00007f2605261d8d in abort () from /lib64/libc.so.6

#2  0x00007f2604d39494 in ComCondition::setSQLCODE (this=<value optimized
out>, newSQLCODE=-8926) at ../export/ComDiags.cpp:1428

#3  0x00007f2603911c56 in ExHandleErrors (qparent=..., down_entry=<value
optimized out>, matchNo=<value optimized out>, globals=<value optimized
out>, diags_in=<value optimized out>, err=4294958370, intParam1=0x0,
stringParam1=0x0, nskErr=0x0, stringParam2=0x0) at
../executor/ex_error.cpp:170

#4  0x00007f2603a01e26 in ExExeUtilGetRTSStatisticsTcb::work
(this=0x7f25f39eec58) at ../executor/ExExeUtilGetStats.cpp:4222

#5  0x00007f2603a5ee33 in ExScheduler::work (this=0x7f25f39ee7c0,
prevWaitTime=<value optimized out>) at ../executor/ExScheduler.cpp:331

#6  0x00007f2603973752 in ex_root_tcb::execute (this=0x7f25f39f4c50,
cliGlobals=0x2b70120, glob=0x7f25f39b2ca8, input_desc=0x7f25f39aa030,
diagsArea=@0x7ffd2e100750, reExecute=0) at ../executor/ex_root.cpp:1058

#7  0x00007f2604fe7654 in CliStatement::execute (this=0x7f25f39c0ea0,
cliGlobals=0x2b70120, input_desc=0x7f25f39aa030, diagsArea=<value optimized
out>, execute_state=<value optimized out>, fixupOnly=0, cliflags=0) at
../cli/Statement.cpp:4525

#8  0x00007f2604f892ac in SQLCLI_PerformTasks(CliGlobals *, ULng32,
SQLSTMT_ID *, SQLDESC_ID *, SQLDESC_ID *, Lng32, Lng32, typedef
__va_list_tag __va_list_tag *, SQLCLI_PTR_PAIRS *, SQLCLI_PTR_PAIRS *)
(cliGlobals=0x2b70120, tasks=4882, statement_id=0x3398010,
input_descriptor=0x34c0eb0, output_descriptor=0x0, num_input_ptr_pairs=0,
num_output_ptr_pairs=0, ap=0x7ffd2e1008f0, input_ptr_pairs=0x0,
output_ptr_pairs=0x0) at ../cli/Cli.cpp:3297

#9  0x00007f2604f89fe2 in SQLCLI_Exec(CliGlobals *, SQLSTMT_ID *,
SQLDESC_ID *, Lng32, typedef __va_list_tag __va_list_tag *,
SQLCLI_PTR_PAIRS *) (cliGlobals=<value optimized out>, statement_id=<value
optimized out>, input_descriptor=<value optimized out>,
num_ptr_pairs=<value optimized out>, ap=<value optimized out>,
ptr_pairs=<value optimized out>) at ../cli/Cli.cpp:3544

#10 0x00007f2604ff588b in SQL_EXEC_Exec (statement_id=0x3398010,
input_descriptor=0x34c0eb0, num_ptr_pairs=0) at ../cli/CliExtern.cpp:2074

#11 0x00007f26078bb99b in SqlCmd::doExec (sqlci_env=0x2b58c50,
stmt=0x3398010, prep_stmt=<value optimized out>, numUnnamedParams=<value
optimized out>, unnamedParamArray=<value optimized out>,
unnamedParamCharSetArray=<value optimized out>, handleError=1) at
../sqlci/SqlCmd.cpp:1786

#12 0x00007f26078bc392 in SqlCmd::do_execute (sqlci_env=0x2b58c50,
prep_stmt=0x2cefd50, numUnnamedParams=0, unnamedParamArray=0x0,
unnamedParamCharSetArray=0x0, prepcode=0) at ../sqlci/SqlCmd.cpp:2122

#13 0x00007f26078bcabd in DML::process (this=0x2cf0080,
sqlci_env=0x2b58c50) at ../sqlci/SqlCmd.cpp:2897

#14 0x00007f26078a2844 in Obey::process (this=0x2ceff60, sqlci_env=<value
optimized out>) at ../sqlci/Obey.cpp:267

#15 0x00007f26078a2844 in Obey::process (this=0x385d960, sqlci_env=<value
optimized out>) at ../sqlci/Obey.cpp:267

#16 0x00007f26078a2844 in Obey::process (this=0x4662980, sqlci_env=<value
optimized out>) at ../sqlci/Obey.cpp:267

#17 0x00007f26078ab074 in SqlciEnv::run (this=0x2b58c50, in_filename=<value
optimized out>, input_string=<value optimized out>) at
../sqlci/SqlciEnv.cpp:729

#18 0x00000000004019d2 in main (argc=2, argv=0x7ffd2e102578) at
../bin/SqlciMain.cpp:329
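
A side note on reading frame #3: the err=4294958370 passed to ExHandleErrors
looks like the same error code as newSQLCODE=-8926 in frame #2, just printed
as an unsigned 32-bit value (2^32 - 8926 = 4294958370). The small sketch below
checks that arithmetic. The helper name asSigned32 is hypothetical and is not
part of the Trafodion code; it only assumes that gdb displayed the parameter
as an unsigned 32-bit integer.

```cpp
#include <cstdint>

// Hypothetical helper (not Trafodion code): reinterpret a value that was
// printed as an unsigned 32-bit integer as the signed value it would
// represent in two's complement.
constexpr std::int64_t asSigned32(std::uint32_t v) {
    // Values with the top bit set map to negative numbers: subtract 2^32.
    return v >= 0x80000000u
               ? static_cast<std::int64_t>(v) - 4294967296LL
               : static_cast<std::int64_t>(v);
}

// err=4294958370 from frame #3 corresponds to -8926, i.e. -EXE_STAT_NOT_FOUND
// according to the value quoted in this message.
static_assert(asSigned32(4294958370u) == -8926,
              "err in frame #3 matches newSQLCODE in frame #2");
```

So frames #2 and #3 appear to carry the same code, which agrees with the
trace but does not by itself say which of the 34 paths raised it.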



So if anyone has a smart idea on what could be causing this, I am very
interested. I am now going to see if debugging from the Jenkins box can
give any better insight.



BTW, I was also going to run a fake PR on just latest master without any of
the PR255 code, but Steve correctly highlighted that other PRs are ongoing
and not failing that test… so this test case is useless…



Regards,
Eric
