atlas-dev mailing list archives

From "Madhan Neethiraj (Jira)" <j...@apache.org>
Subject [jira] [Resolved] (ATLAS-1720) Add titan storage.lock.wait-time for Berkley DB to fix intermittent IT failures
Date Mon, 18 May 2020 05:42:00 GMT

     [ https://issues.apache.org/jira/browse/ATLAS-1720?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Madhan Neethiraj resolved ATLAS-1720.
-------------------------------------
    Resolution: Abandoned

Atlas doesn't use Titan any more.

> Add titan storage.lock.wait-time for Berkley DB to fix intermittent IT failures 
> --------------------------------------------------------------------------------
>
>                 Key: ATLAS-1720
>                 URL: https://issues.apache.org/jira/browse/ATLAS-1720
>             Project: Atlas
>          Issue Type: Bug
>          Components:  atlas-core
>    Affects Versions: 1.0.0, trunk
>            Reporter: Sarath Subramanian
>            Assignee: Sarath Subramanian
>            Priority: Major
>
> Some of the ITs in Atlas fail intermittently with the exception "Could not execute operation due to backend exception".
> Investigation shows this is caused by a Berkeley DB LockTimeoutException (https://github.com/thinkaurelius/titan/issues/1113).
> The default lock timeout for Berkeley DB is 500 ms. If a thread (some IT) is waiting on a Titan storage resource that is locked by another thread, and that thread doesn't release the lock within 500 ms, the waiting thread fails with the above exception (see the error log below).
> The fix is to increase storage.lock.wait-time for Berkeley DB or to increase the lock retry property: atlas.graph.storage.lock.retries=10.
> {code}
> Caused by: com.sleepycat.je.LockTimeoutException: (JE 5.0.73) Lock expired. Locker 1516581475 7535_NotificationHookConsumer thread-0_Txn: waited for lock on database=edgestore LockAddr:284896285 LSN=0x0/0x21d55f type=WRITE grant=WAIT_PROMOTION timeoutMillis=500 startTime=1491261268442 endTime=1491261268942
> Owners: [<LockInfo locker="1445928922 7537_qtp184901207-1038 - e015a355-d6c5-4424-b7a7-833a289aea9d_Txn" type="READ"/>, <LockInfo locker="1516581475 7535_NotificationHookConsumer thread-0_Txn" type="READ"/>]
> Waiters: []
> Transaction 1445928922 7537_qtp184901207-1038 - e015a355-d6c5-4424-b7a7-833a289aea9d_Txn waits for LockAddr:471572402 Owners:<LockInfo locker="1516581475 7535_NotificationHookConsumer thread-0_Txn" type="WRITE"/> Waiters:[<LockInfo locker="1445928922 7537_qtp184901207-1038 - e015a355-d6c5-4424-b7a7-833a289aea9d_Txn" type="READ"/>]
> Transaction 1516581475 7535_NotificationHookConsumer thread-0_Txn owns LockAddr:471572402 <LockInfo locker="1516581475 7535_NotificationHookConsumer thread-0_Txn" type="WRITE"/>
> Transaction 1516581475 7535_NotificationHookConsumer thread-0_Txn waits for LockAddr:284896285
> {code}
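
For illustration only, the two options named in the quoted description might look roughly like this in atlas-application.properties. This is a sketch under assumptions: the retries line is taken from the issue text, while the "atlas.graph." prefix on storage.lock.wait-time and the value shown for it are placeholders that the issue does not state. Since the issue was resolved as Abandoned, none of this applies to Atlas versions that no longer use Titan.

```properties
# Option 1 (assumed key prefix and placeholder value, not from the issue):
# give a waiting transaction more time before the lock attempt is abandoned.
atlas.graph.storage.lock.wait-time=10000

# Option 2 (as written in the issue description):
# retry lock acquisition more times before failing.
atlas.graph.storage.lock.retries=10
```

Either setting only makes the timeout less likely to be hit. It does not remove the underlying contention between the two transactions shown in the log.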



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
