mina-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Florin Sultan (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (DIRMINA-898) A wakeup() is issued in a thread before the target select() is executed.
Date Wed, 11 Jul 2012 19:21:34 GMT

    [ https://issues.apache.org/jira/browse/DIRMINA-898?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13411881#comment-13411881

Florin Sultan commented on DIRMINA-898:

Yes, I've run it and the interleaving looks OK now, the threads no longer hang.

I also looked a at the logic and it looks fine. I assume the semaphore can be safely released
by the child (since it has no ownership).
I assume the while / try code following the release in the child is also safe as before -
you know better.

I've also tried a run without the wait(10 ms) in the parent, and it worked ok too. But it
is probably safer to have the wait to give  the child a safe chance to execute select()..

I hope this settles it. I'll let you know if I run into problems.

On another line :

I checked the interleaving with TCP (with the OLD 2.0.4 code) and it was the same : the wakeup
did happen before the select. But it looks like the wakeup was memorized - this explains why
the old code has always worked with TCP. UDT exposed the race because it does not memorize
a wakeup when no select has been issued.

This is interesting to know in case there are places in MINA that may unknowingly rely on
similar behaviors.

For reference, here's the TCP interleaving (again, with 2.0.4) :

--- AbstractIoAcceptor::bind(Iterable): calling bindInternal
--- bindInternal: calling startupAcceptor
startupAcceptor: Starting NEW Acceptor
--- bindInternal: calling wakeup()
--- bindInternal: before await
Accepptor.run: calling select()
Accepptor.run: after select()
NioSocketAcceptor::open ServerSocketChannel().open called
--- bindInternal: after await


>  A wakeup() is issued in a thread before the target select() is executed.
> -------------------------------------------------------------------------
>                 Key: DIRMINA-898
>                 URL: https://issues.apache.org/jira/browse/DIRMINA-898
>             Project: MINA
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 2.0.4
>         Environment: Linux 2.6.18-164.11.1.el5xen #1 SMP Wed Jan 20 08:06:04 EST 2010
x86_64 x86_64 x86_64 GNU/Linux
> java version "1.6.0_31"
> Java(TM) SE Runtime Environment (build 1.6.0_31-b04)
> Java HotSpot(TM) 64-Bit Server VM (build 20.6-b01, mixed mode)
>            Reporter: Florin Sultan
>             Fix For: 2.0.5
>         Attachments: build-test-failure-2.0.5.txt
> Context:
> -------
> I am using BarchartUDT (BC for short in the calls traced below) as transport.
> I have integrated it with MINA release 2.0.4.
> BarchartUDT is a java wrapper to the native UDT library (C++). UDT is a reliable,
> connection-oriented, streaming protocol.
> The example below is the MINA sample timeserver that hangs in the first bind() call.
> Description: 
> -----------
> Trigger : A wakeup() is issued in a thread before the target select() is executed
> in another thread, and it is lost.
> It is not clear if this is a bug. The MINA code in AbstractPollingIoAcceptor seems to
> on the fact that the wakeup issued before a select in not lost. But the fact is that
> two threads are racing, and if the wakeup IS lost the select will remain blocked forever,
> as in the example below.
> Details:
> ------- 
> IN AbstractPollingIoAcceptor.bindInternal() the expected program order is :
> 1. startupAcceptor() -> Acceptor.run() -> select() -> BC::performSelect...
> 2. wakeup -> NioAcceptor.wakeup() -> selector.wakeup() -> BC:: wakeup
> At runtime, 2. and 1. above are reversed, so the wakeup is lost and 1. hangs as a result.
> ---------
> Thread interleaving starting with AbstractPollingIoAcceptor.bind(Iterable):
> --- AbstractIoAcceptor::bind(Iterable): calling bindInternal
> --- bindInternal: calling startupAcceptor
> startupAcceptor: Starting NEW Acceptor
> --- bindInternal: calling wakeup()
> :::BC : wakeup() sets wakeupStepCount to 1
> --- bindInternal: before await
> Accepptor.run: calling select()
> :::BC : performSelect calling doSelectSocketUDT with wakeupBaseCount 0 wakeupStepCount
> -----------
> The 2 threads waiting (one in select, the other in await) :
> Full thread dump Java HotSpot(TM) 64-Bit Server VM (20.6-b01 mixed mode):
> "NioUDTSocketAcceptor-1" prio=10 tid=0x00002aaab400d000 nid=0x5f8f runnable [0x0000000040f04000]
>    java.lang.Thread.State: RUNNABLE
>         at com.barchart.udt.SocketUDT.select1(Native Method)
>         at com.barchart.udt.SocketUDT.select(SocketUDT.java:644)
>         at com.barchart.udt.nio.SelectorUDT.doSelectSocketUDT(SelectorUDT.java:406)
>         at com.barchart.udt.nio.SelectorUDT.performSelect(SelectorUDT.java:528)
>         at com.barchart.udt.nio.SelectorUDT.doSelectReally(SelectorUDT.java:575)
>         at com.barchart.udt.nio.SelectorUDT.doSelectInsideLock(SelectorUDT.java:333)
>         - locked <0x00000000eb2d3ee0> (a com.barchart.udt.nio.HelperNIOUDT$1)
>         - locked <0x00000000eb2d3ef0> (a java.util.Collections$UnmodifiableSet)
>         - locked <0x00000000eb2d3e78> (a com.barchart.udt.nio.SelectorUDT)
>         at com.barchart.udt.nio.SelectorUDT.select(SelectorUDT.java:191)
>         at com.barchart.udt.nio.SelectorUDT.select(SelectorUDT.java:181)
>         at org.apache.mina.transport.socket.nio.NioUDTSocketAcceptor.select(NioUDTSocketAcceptor.java:279)
>         at org.apache.mina.core.polling.AbstractPollingIoAcceptor$Acceptor.run(AbstractPollingIoAcceptor.jav
> a:413)
>         at org.apache.mina.util.NamePreservingRunnable.run(NamePreservingRunnable.java:64)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>         at java.lang.Thread.run(Thread.java:662)
> "Low Memory Detector" daemon prio=10 tid=0x00000000454ca000 nid=0x5f8d runnable [0x0000000000000000]
>    java.lang.Thread.State: RUNNABLE
> "C2 CompilerThread1" daemon prio=10 tid=0x00000000454c7000 nid=0x5f8c waiting on condition
> 00]
>    java.lang.Thread.State: RUNNABLE
> "C2 CompilerThread0" daemon prio=10 tid=0x00000000454c2800 nid=0x5f8b waiting on condition
> 00]
>    java.lang.Thread.State: RUNNABLE
> "Signal Dispatcher" daemon prio=10 tid=0x00000000454c0000 nid=0x5f8a waiting on condition
> 0]
>    java.lang.Thread.State: RUNNABLE
> "Finalizer" daemon prio=10 tid=0x000000004549e000 nid=0x5f89 in Object.wait() [0x0000000041cf1000]
>    java.lang.Thread.State: WAITING (on object monitor)
>         at java.lang.Object.wait(Native Method)
>         - waiting on <0x00000000eb2d4220> (a java.lang.ref.ReferenceQueue$Lock)
>         at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:118)
>         - locked <0x00000000eb2d4220> (a java.lang.ref.ReferenceQueue$Lock)
>         at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:134)
>         at java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:159)
> "Reference Handler" daemon prio=10 tid=0x000000004549c000 nid=0x5f88 in Object.wait()
>    java.lang.Thread.State: WAITING (on object monitor)
>         at java.lang.Object.wait(Native Method)
>         - waiting on <0x00000000eb2d00b0> (a java.lang.ref.Reference$Lock)
>         at java.lang.Object.wait(Object.java:485)
>         at java.lang.ref.Reference$ReferenceHandler.run(Reference.java:116)
>         - locked <0x00000000eb2d00b0> (a java.lang.ref.Reference$Lock)
> "main" prio=10 tid=0x0000000045440000 nid=0x5f86 in Object.wait() [0x0000000040644000]
>    java.lang.Thread.State: TIMED_WAITING (on object monitor)
>         at java.lang.Object.wait(Native Method)
>         - waiting on <0x00000000eb2d42c0> (a org.apache.mina.core.service.AbstractIoAcceptor$AcceptorOperati
> onFuture)
>         at org.apache.mina.core.future.DefaultIoFuture.await0(DefaultIoFuture.java:189)
>         - locked <0x00000000eb2d42c0> (a org.apache.mina.core.service.AbstractIoAcceptor$AcceptorOperationFu
> ture)
>         at org.apache.mina.core.future.DefaultIoFuture.awaitUninterruptibly(DefaultIoFuture.java:130)
>         at org.apache.mina.core.polling.AbstractPollingIoAcceptor.bindInternal(AbstractPollingIoAcceptor.jav
> a:330)
>         at org.apache.mina.core.service.AbstractIoAcceptor.bind(AbstractIoAcceptor.java:291)
>         - locked <0x00000000eb2d4308> (a java.lang.Object)
>         at org.apache.mina.core.service.AbstractIoAcceptor.bind(AbstractIoAcceptor.java:226)
>         at MinaTimeServer.main(MinaTimeServer.java:86)
> "VM Thread" prio=10 tid=0x0000000045495000 nid=0x5f87 runnable
> "VM Periodic Task Thread" prio=10 tid=0x00000000454d5000 nid=0x5f8e waiting on condition
> JNI global references: 1193
> ------------------------------------------------------------------------------------------
> CODE - this is the code that causes it :
>     protected final Set<SocketAddress> bindInternal(
>             List<? extends SocketAddress> localAddresses) throws Exception {
>         // Create a bind request as a Future operation. When the selector
>         // have handled the registration, it will signal this future.
>         AcceptorOperationFuture request = new AcceptorOperationFuture(
>                 localAddresses);
>         // adds the Registration request to the queue for the Workers
>         // to handle 
>         registerQueue.add(request);
>         // creates the Acceptor instance and has the local
>         // executor kick it off.
> System.out.println("--- bindInternal: calling startupAcceptor");
>         startupAcceptor(); 
>         // As we just started the acceptor, we have to unblock the select()
>         // in order to process the bind request we just have added to the
>         // registerQueue.
> System.out.println("--- bindInternal: calling wakeup()");
>         wakeup();
> System.out.println("--- bindInternal: before await");
>         // Now, we wait until this request is completed.
>         request.awaitUninterruptibly();
> The implicit assumption here seems to be that either :
> 1. The acceptor thread will run and call select() before this thread (the parent thread)
> calls wakep();
> OR
> 2. The wakeup() is memorized and, even if the acceptor thread runs after the wakeup()
> was issued, its select() will be interrupted.
> Please let me know if this is a race bug, or if the code implicitly relies on #2 above.
> If not, then there should be some sort of synchronization between these 2 threads,
> to prevent the select in the acceptor being called before the wakeup in the parent thread.
> Related question: Are there other places in MINA that may exhibit this behavior ?
> Thanks.
> Florin

This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira


View raw message