trafodion-dev mailing list archives

From Selva Govindarajan <selva.govindara...@esgyn.com>
Subject RE: odbc and/or hammerdb logs
Date Wed, 16 Sep 2015 13:39:57 GMT
Thanks for creating the JIRA TRAFODION-1492. The error is similar to
Scenario 2: the tdm_udrserv process dumped core. We will look into the core
file. In the meantime, can you please do the following:

Bring the Trafodion instance down
echo $MY_SQROOT -- shows Trafodion installation directory
Remove $MY_SQROOT/etc/ms.env from all nodes


Start a New Terminal Session so that new Java settings are in place
Login as a Trafodion user
cd <trafodion_installation_directory>
. ./sqenv.sh  (skip this if it is done automatically upon logon)
sqgen

Exit and Start a New Terminal Session
Restart the Trafodion instance and check whether you still see the issue with
tdm_udrserv. We want to ensure that the Trafodion processes are free
of the Java installation mix-up described in your earlier message. We suspect
that it can cause the tdm_udrserv process to dump core.
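
As a quick sanity check after opening the new session, one could compare the java binary under JAVA_HOME with the one found on PATH. The following is only a minimal sketch, not part of Trafodion; the function names are hypothetical, and it assumes the mix-up would show up as the two binaries resolving to different files.

```python
import os
import shutil


def resolve_java(java_home, path):
    """Return (java_home_binary, path_binary) with symlinks resolved.

    Either element is None when the corresponding binary is not found.
    """
    home_bin = None
    if java_home:
        candidate = os.path.join(java_home, "bin", "java")
        if os.path.isfile(candidate):
            home_bin = os.path.realpath(candidate)
    # shutil.which searches the given PATH string for an executable "java".
    found = shutil.which("java", path=path)
    path_bin = os.path.realpath(found) if found else None
    return home_bin, path_bin


def java_is_consistent(java_home, path):
    """True only when both binaries exist and resolve to the same file."""
    home_bin, path_bin = resolve_java(java_home, path)
    return home_bin is not None and home_bin == path_bin


if __name__ == "__main__":
    home = os.environ.get("JAVA_HOME")
    search_path = os.environ.get("PATH", "")
    print(resolve_java(home, search_path))
    print("consistent:", java_is_consistent(home, search_path))
```

Running it on each node as the Trafodion user would show whether any node still picks up a different JDK than the one JAVA_HOME points to.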


Selva

-----Original Message-----
From: Radu Marias [mailto:radumarias@gmail.com]
Sent: Wednesday, September 16, 2015 5:40 AM
To: dev <dev@trafodion.incubator.apache.org>
Subject: Re: odbc and/or hammerdb logs

I'm seeing this in the HammerDB logs; I assume it is due to the crash and
that some processes are stopped:

Error in Virtual User 1: [Trafodion ODBC Driver][Trafodion Database] SQL
ERROR:*** ERROR[2034] $Z0106BZ:16: Operating system error 201 while
communicating with server process $Z010LPE:23. [2015-09-16 12:35:33]
[Trafodion ODBC Driver][Trafodion Database] SQL ERROR:*** ERROR[8904] SQL
did not receive a reply from MXUDR, possibly caused by internal errors when
executing user-defined routines. [2015-09-16 12:35:33]

$ sqcheck
Checking if processes are up.
Checking attempt: 1; user specified max: 2. Execution time in seconds: 0.

The SQ environment is up!


Process         Configured      Actual      Down
-------         ----------      ------      ----
DTM             5               5
RMS             10              10
MXOSRVR         20              20

On Wed, Sep 16, 2015 at 3:28 PM, Radu Marias <radumarias@gmail.com> wrote:

> I've restarted HDP and Trafodion, and now I managed to create the
> schema and stored procedures from HammerDB. But I'm getting failures,
> and Trafodion dumps core again, while running virtual users. For some of
> the users I sometimes see in the HammerDB logs:
> Vuser 5:Failed to execute payment
> Vuser 5:Failed to execute stock level
> Vuser 5:Failed to execute new order
>
> Core files are on our last node, feel free to examine them; the files
> were dumped while I was getting the HammerDB errors:
>
> *core.49256*
>
> *core.48633*
>
> *core.49290*
>
>
> On Wed, Sep 16, 2015 at 3:24 PM, Radu Marias <radumarias@gmail.com> wrote:
>
>> *Scenario 1:*
>>
>> I've created this issue
>> https://issues.apache.org/jira/browse/TRAFODION-1492
>> I think another fix was made related to *Committed_AS* in
>> *sql/cli/memmonitor.cpp*.
>>
>> This is a response from Narendra in a previous thread, where the issue
>> was fixed so that Trafodion could start:
>>
>>
>>>
>>>
>>>
>>> *I updated the code: sql/cli/memmonitor.cpp, so that if
>>> /proc/meminfo does not have the ‘Committed_AS’ entry, it will ignore
>>> it. Built it and put the binary: libcli.so on the veracity box (in
>>> the $MY_SQROOT/export/lib64 directory – on all the nodes). Restarted the
>>> env and ‘sqlci’ worked fine.
>>> Was able to ‘initialize trafodion’ and create a table.*
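
The behaviour described in the quoted fix (ignore Committed_AS when /proc/meminfo does not report it) can be illustrated with a short sketch. This is not the actual change made in sql/cli/memmonitor.cpp, which is C++; it is a hypothetical Python illustration with made-up function names, showing the missing entry treated as "no value" rather than as an error.

```python
def parse_meminfo(text):
    """Map each /proc/meminfo key to its integer value (kB for most keys)."""
    values = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        fields = rest.split()
        # Skip anything that is not of the form "Key:   <number> [unit]".
        if sep and fields and fields[0].isdigit():
            values[key.strip()] = int(fields[0])
    return values


def committed_as_kb(text):
    """Return Committed_AS in kB, or None when the kernel does not report it."""
    return parse_meminfo(text).get("Committed_AS")


if __name__ == "__main__":
    with open("/proc/meminfo") as fh:
        value = committed_as_kb(fh.read())
    if value is None:
        print("Committed_AS not reported; skipping it")
    else:
        print("Committed_AS:", value, "kB")
```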
>>
>>
>> *Scenario 2:*
>>
>> The *java -version* problem, as I recall, occurred only on the other
>> cluster with CentOS 7; I haven't seen it on this one with CentOS 6.7. But
>> a change I made recently on the latter is installing Oracle *jdk
>> 1.7.0_79* as the default, and that is where *JAVA_HOME* points. Before
>> that, some nodes had *open-jdk* as the default and others had no default,
>> only the one installed by *ambari* in
>> */usr/jdk64/jdk1.7.0_67*, which was not linked to JAVA_HOME or the *java*
>> command by *alternatives*.
>>
>> *Failures in HammerDB:*
>>
>> Attached is the *trafodion.dtm.log* from a node on which I see a
>> lot of lines like these. I assume this is the *transaction conflict*
>> that you mentioned; I see these lines on 4 out of 5 nodes:
>>
>> 2015-09-14 12:21:49,413 INFO dtm.HBaseTxClient: useForgotten is true
>> 2015-09-14 12:21:49,414 INFO dtm.HBaseTxClient: forceForgotten is false
>> 2015-09-14 12:21:49,446 INFO dtm.TmAuditTlog: forceControlPoint is false
>> 2015-09-14 12:21:49,446 INFO dtm.TmAuditTlog: useAutoFlush is false
>> 2015-09-14 12:21:49,447 INFO dtm.TmAuditTlog: ageCommitted is false
>> 2015-09-14 12:21:49,447 INFO dtm.TmAuditTlog: disableBlockCache is false
>> 2015-09-14 12:21:52,229 INFO dtm.HBaseAuditControlPoint: disableBlockCache is false
>> 2015-09-14 12:21:52,233 INFO dtm.HBaseAuditControlPoint: useAutoFlush is false
>> 2015-09-14 12:42:57,346 INFO dtm.HBaseTxClient: Exit RET_HASCONFLICT prepareCommit, txid: 17179989222
>> 2015-09-14 12:43:46,102 INFO dtm.HBaseTxClient: Exit RET_HASCONFLICT prepareCommit, txid: 17179989277
>> 2015-09-14 12:44:11,598 INFO dtm.HBaseTxClient: Exit RET_HASCONFLICT prepareCommit, txid: 17179989309
>>
>> What does *transaction conflict* mean in this case?
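
To get a rough idea of how often these entries occur, the log could be scanned for RET_HASCONFLICT lines. The sketch below is only an illustration with hypothetical function names; it assumes each entry sits on a single line in the format shown above, and it counts entries without interpreting what the conflict means.

```python
import re
from collections import Counter

# Matches entries such as:
# 2015-09-14 12:42:57,346 INFO dtm.HBaseTxClient: Exit RET_HASCONFLICT prepareCommit, txid: 17179989222
CONFLICT_RE = re.compile(
    r"^(\d{4}-\d{2}-\d{2}) (\d{2}:\d{2}:\d{2}),\d+ INFO dtm\.HBaseTxClient: "
    r"Exit RET_HASCONFLICT\s+prepareCommit, txid: (\d+)"
)


def find_conflicts(lines):
    """Return (date, time, txid) for every RET_HASCONFLICT entry."""
    hits = []
    for line in lines:
        match = CONFLICT_RE.match(line.strip())
        if match:
            hits.append((match.group(1), match.group(2), int(match.group(3))))
    return hits


def conflicts_per_hour(hits):
    """Count conflict entries per calendar hour."""
    return Counter(f"{date} {time[:2]}:00" for date, time, _ in hits)
```

For example, `find_conflicts(open(path))` with `path` pointing at a copy of trafodion.dtm.log returns the matching entries, and `conflicts_per_hour` shows whether they cluster around the times of the HammerDB failures.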
>>
>> On Wed, Sep 16, 2015 at 2:43 AM, Selva Govindarajan <
>> selva.govindarajan@esgyn.com> wrote:
>>
>>> Hi Radu,
>>>
>>> Thanks for using Trafodion. With help from Suresh, we looked at
>>> the core files in your cluster. We believe that there are two
>>> scenarios that are causing the Trafodion processes to dump core.
>>>
>>> Scenario 1:
>>> Core dumped by tdm_arkesp processes. The Trafodion engine assumed that
>>> the Committed_AS entry in /proc/meminfo is available on all flavors of
>>> Linux. The tdm_arkesp process does not handle the absence of this entry
>>> correctly, and hence it dumped core. Please file a
>>> JIRA using this link
>>> https://issues.apache.org/jira/secure/CreateIssue!default.jspa and
>>> choose "Apache Trafodion" as the project to report a bug against.
>>>
>>> Scenario 2:
>>> Core dumped by tdm_udrserv processes. From our analysis, this
>>> problem happened when the process attempted to create the JVM
>>> instance programmatically. A few days earlier, we observed a
>>> similar issue in your cluster when the java -version command was
>>> run. But java -version or $JAVA_HOME/bin/java -version works
>>> fine now.
>>> Was any change made to the cluster recently to fix the
>>> problem with the java -version command?
>>>
>>> Please delete all the core files in the sql/scripts directory,
>>> issue the command that invokes the SPJ, and check whether it still
>>> dumps core. We can look at the core file if it happens again. Knowing
>>> how you solved the java -version problem would be helpful.
>>>
>>> For the failures with HammerDB, can you please send us the exact
>>> error message returned by the Trafodion engine to the application?
>>> This might help us narrow down the cause. You can also look at
>>> $MY_SQROOT/logs/trafodion.dtm.log to check whether a transaction
>>> conflict is causing this error.
>>>
>>> Selva
>>> -----Original Message-----
>>> From: Radu Marias [mailto:radumarias@gmail.com]
>>> Sent: Tuesday, September 15, 2015 9:09 AM
>>> To: dev <dev@trafodion.incubator.apache.org>
>>> Subject: Re: odbc and/or hammerdb logs
>>>
>>> Also noticed there are several core. files from today in
>>> */home/trafodion/trafodion-20150828_0830/sql/scripts*. If needed
>>> please provide a gmail address so I can share them via gdrive.
>>>
>>> On Tue, Sep 15, 2015 at 6:29 PM, Radu Marias <radumarias@gmail.com>
>>> wrote:
>>>
>>> > Hi,
>>> >
>>> > I'm running HammerDB over Trafodion, and when running virtual users
>>> > I sometimes get errors like this in the HammerDB logs:
>>> > *Vuser 1:Failed to execute payment*
>>> >
>>> > *Vuser 1:Failed to execute new order*
>>> >
>>> > I'm using unixODBC and I tried to add these lines to
>>> > */etc/odbc.ini*, but the trace file is not created.
>>> > *[ODBC]*
>>> > *Trace = 1*
>>> > *TraceFile = /var/log/odbc_tracefile.log*
>>> >
>>> > Also tried with *Trace = yes* and *Trace = on*, I've found
>>> > multiple references for both.
>>> >
>>> > How can I see more logs to debug the issue? Can I enable logs for
>>> > all queries in trafodion?
>>> >
>>> > --
>>> > And in the end, it's not the years in your life that count. It's
>>> > the life in your years.
>>> >
