spark-issues mailing list archives

From "Wang Shuo (Jira)" <j...@apache.org>
Subject [jira] [Updated] (SPARK-30285) Fix deadlock between LiveListenerBus#stop and AsyncEventQueue#removeListenerOnError
Date Wed, 18 Dec 2019 02:17:00 GMT

     [ https://issues.apache.org/jira/browse/SPARK-30285?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Wang Shuo updated SPARK-30285:
------------------------------
    Description: 
There is a deadlock between LiveListenerBus#stop and AsyncEventQueue#removeListenerOnError.

We can reproduce it as follows:
 # Post some events to LiveListenerBus.
 # Call LiveListenerBus#stop, which holds the synchronized lock of the bus while it waits until all the
events have been processed by the listeners, and then removes all the queues.
 # The event queue drains its events by posting them to its listeners. If a listener is interrupted,
the queue calls AsyncEventQueue#removeListenerOnError, which in turn calls bus.removeListener and
tries to acquire the synchronized lock of the bus, resulting in a deadlock.

  was:
There is a race condition between LiveListenerBus#stop and AsyncEventQueue#removeListenerOnError.

we can reproduce as follows:
 # Post some events to LiveListenerBus
 # Call LiveListenerBus#stop and hold the synchronized lock of bus, waiting until all the
events are processed by listeners, then remove all the queues
 # Event queue would drain out events by posting to its listeners. If a listener is interrupted,
it will call AsyncEventQueue#removeListenerOnError,  inside it will call bus.removeListener,
trying to acquire synchronized lock of bus, resulting in deadlock


> Fix deadlock between LiveListenerBus#stop and AsyncEventQueue#removeListenerOnError
> -----------------------------------------------------------------------------------
>
>                 Key: SPARK-30285
>                 URL: https://issues.apache.org/jira/browse/SPARK-30285
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core
>    Affects Versions: 2.3.0, 2.4.0
>            Reporter: Wang Shuo
>            Priority: Major
>
> There is a deadlock between LiveListenerBus#stop and AsyncEventQueue#removeListenerOnError.
> We can reproduce it as follows:
>  # Post some events to LiveListenerBus.
>  # Call LiveListenerBus#stop, which holds the synchronized lock of the bus while it waits until all
> the events have been processed by the listeners, and then removes all the queues.
>  # The event queue drains its events by posting them to its listeners. If a listener is interrupted,
> the queue calls AsyncEventQueue#removeListenerOnError, which in turn calls bus.removeListener and
> tries to acquire the synchronized lock of the bus, resulting in a deadlock.
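
The locking pattern in the reproduction steps can be sketched in plain Java. This is a minimal illustration, not Spark's code: the class ListenerBusDeadlockSketch, its methods, and the two latches are hypothetical stand-ins, and the timeout in stop() is an addition so that the sketch terminates instead of hanging the way the real deadlock would.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Hypothetical stand-in for the bus; models only the locking pattern from the report.
final class ListenerBusDeadlockSketch {
    private final CountDownLatch stopHoldsLock = new CountDownLatch(1);
    private final CountDownLatch queueDrained = new CountDownLatch(1);

    // Like LiveListenerBus#stop in the report: waits for the queue while holding the bus lock.
    // The timeout is an assumption added here so the sketch can finish.
    synchronized boolean stop(long timeoutMs) throws InterruptedException {
        stopHoldsLock.countDown();
        return queueDrained.await(timeoutMs, TimeUnit.MILLISECONDS);
    }

    // Like bus.removeListener in the report: needs the same bus lock.
    synchronized void removeListener() { }

    // What the queue's dispatch thread does when a listener fails while draining.
    void drainWithFailingListener() throws InterruptedException {
        stopHoldsLock.await();     // make sure stop() already owns the lock
        removeListener();          // blocks until stop() releases the lock
        queueDrained.countDown();  // draining can only finish after that
    }

    // Returns true if stop() saw the queue drain before the timeout expired.
    static boolean stopCompletesInTime(long timeoutMs) {
        ListenerBusDeadlockSketch bus = new ListenerBusDeadlockSketch();
        Thread dispatch = new Thread(() -> {
            try {
                bus.drainWithFailingListener();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, "dispatch");
        dispatch.start();
        try {
            boolean drained = bus.stop(timeoutMs);
            dispatch.join();       // the dispatch thread proceeds once the lock is free
            return drained;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        // Prints "stop() completed in time: false"
        System.out.println("stop() completed in time: " + stopCompletesInTime(300));
    }
}
```

In this sketch stop() always times out, because the dispatch thread cannot enter removeListener() until stop() returns and releases the lock. Without the timeout, both threads would wait on each other indefinitely, which matches the cycle described above.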



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


