mina-dev mailing list archives

From "Laxman (Commented) (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (DIRMINA-678) NioProcessor 100% CPU usage on Linux (epoll selector bug)
Date Tue, 21 Feb 2012 12:27:35 GMT

    [ https://issues.apache.org/jira/browse/DIRMINA-678?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13212541#comment-13212541 ]

Laxman commented on DIRMINA-678:
--------------------------------

My apologies for reopening discussion on a very old issue.

We encountered this issue in Hadoop (HDFS and MapReduce) as well.
Eventually I was pointed to this patch. After going through it, I think there is a
problem.

{code}
+                if ((((channel instanceof DatagramChannel) && ((DatagramChannel) channel)
+                        .isConnected()))
+                        || ((channel instanceof SocketChannel) && ((SocketChannel) channel)
+                                .isConnected())) {
+                    // The channel is not connected anymore. Cancel
+                    // the associated key then.
+                    key.cancel();
+
+                    // Set the flag to true to avoid a selector switch
+                    brokenSession = true;
+                }
{code}

IMO, the comment and the code conflict here.

The comment says "channel is not connected", but the code checks whether the channel is connected.

Am I reading something wrong here?
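
To make the mismatch concrete, here is a minimal sketch of the two readings side by side. It assumes the comment ("not connected anymore") states the intent. The class and method names are hypothetical and are not taken from the MINA patch.

```java
import java.io.IOException;
import java.nio.channels.DatagramChannel;
import java.nio.channels.SelectableChannel;
import java.nio.channels.SocketChannel;

// Hypothetical illustration class; not part of MINA.
final class ConnectedCheckSketch {

    // The condition exactly as it appears in the quoted patch:
    // true when the channel IS connected.
    static boolean asPatched(SelectableChannel channel) {
        return ((channel instanceof DatagramChannel) && ((DatagramChannel) channel).isConnected())
                || ((channel instanceof SocketChannel) && ((SocketChannel) channel).isConnected());
    }

    // The condition as the comment describes it (an assumption about the intent):
    // true when the channel is NOT connected.
    static boolean asCommented(SelectableChannel channel) {
        return ((channel instanceof DatagramChannel) && !((DatagramChannel) channel).isConnected())
                || ((channel instanceof SocketChannel) && !((SocketChannel) channel).isConnected());
    }

    public static void main(String[] args) throws IOException {
        // A freshly opened SocketChannel has not been connected to anything yet.
        try (SocketChannel channel = SocketChannel.open()) {
            System.out.println("asPatched:   " + asPatched(channel));
            System.out.println("asCommented: " + asCommented(channel));
        }
    }
}
```

For an unconnected SocketChannel, only the negated form flags the channel as broken, which is what the comment seems to expect.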
                
> NioProcessor 100% CPU usage on Linux (epoll selector bug)
> ---------------------------------------------------------
>
>                 Key: DIRMINA-678
>                 URL: https://issues.apache.org/jira/browse/DIRMINA-678
>             Project: MINA
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 2.0.0-M4
>         Environment: CentOS 5.x, 32/64-bit, 32/64-bit Sun JDK 1.6.0_12, also _11/_10/_09
>                      and Sun JDK 1.7.0 b50, Kernel 2.6.18-92.1.22.el5 and also older versions,
>            Reporter: Serge Baranov
>             Fix For: 2.0.3
>
>         Attachments: mina-2.0.3.diff, snap973.png, snap974.png
>
>
> It's the same bug as described at http://jira.codehaus.org/browse/JETTY-937 , but affecting
> MINA in a very similar way.
> NioProcessor threads start to eat 100% resources per CPU. After 10-30 minutes of running,
> depending on the load (sometimes after several hours), one of the NioProcessor threads starts
> to consume all the available CPU resources, probably spinning in the epoll select loop. Later,
> more threads can be affected by the same issue, thus loading all the available CPU cores to 100%.
> Sample trace:
> NioProcessor-10 [RUNNABLE] CPU time: 5:15
> sun.nio.ch.EPollArrayWrapper.epollWait(long, int, long, int)
> sun.nio.ch.EPollArrayWrapper.poll(long)
> sun.nio.ch.EPollSelectorImpl.doSelect(long)
> sun.nio.ch.SelectorImpl.lockAndDoSelect(long)
> sun.nio.ch.SelectorImpl.select(long)
> org.apache.mina.transport.socket.nio.NioProcessor.select(long)
> org.apache.mina.core.polling.AbstractPollingIoProcessor$Processor.run()
> org.apache.mina.util.NamePreservingRunnable.run()
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor$Worker)
> java.util.concurrent.ThreadPoolExecutor$Worker.run()
> java.lang.Thread.run()
> It seems to affect any NIO based Java server applications running in the specified environment.
> Some projects provide workarounds for similar JDK bugs; MINA could probably consider
> a workaround as well.
> As far as I know, there are at least 3 users who experience this issue with Jetty, and
> all of them are running CentOS (is some distribution default setting a trigger?). As for MINA,
> I'm not aware of similar reports yet.
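
For context on what such a workaround can look like: a commonly used approach for this class of bug is to count consecutive select() calls that return no keys before the timeout has elapsed and, past a threshold, move all registrations to a fresh Selector. The sketch below illustrates that idea only. The class name, the threshold value, and the method names are hypothetical, and this is not the code from the attached MINA patch.

```java
import java.io.IOException;
import java.nio.channels.SelectableChannel;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;

// Hypothetical helper; a sketch of the "rebuild the selector" idea, not MINA code.
final class SelectorSpinGuard {

    // Assumed threshold: consecutive premature returns tolerated before rebuilding.
    static final int SPIN_THRESHOLD = 256;

    private int prematureReturns;

    // Records one select(timeout) result. Returns true once the selector looks stuck.
    // The caller should skip returns it caused itself, e.g. via Selector.wakeup().
    boolean record(int selectedKeys, long elapsedMillis, long timeoutMillis) {
        boolean premature = selectedKeys == 0 && elapsedMillis < timeoutMillis;
        prematureReturns = premature ? prematureReturns + 1 : 0;
        return prematureReturns >= SPIN_THRESHOLD;
    }

    // Clears the counter, e.g. right after a rebuild.
    void reset() {
        prematureReturns = 0;
    }

    // Re-registers every valid key of the old selector with a new one,
    // preserving interest ops and attachments, then closes the old selector.
    static Selector rebuild(Selector old) throws IOException {
        Selector fresh = Selector.open();
        for (SelectionKey key : old.keys()) {
            if (!key.isValid()) {
                continue; // already cancelled, or its channel was closed
            }
            SelectableChannel channel = key.channel();
            int ops = key.interestOps();
            Object attachment = key.attachment();
            key.cancel();
            channel.register(fresh, ops, attachment);
        }
        old.close();
        return fresh;
    }
}
```

In a processor loop this would be used roughly as: time each select(timeout) call, pass the result to record(...), and when it returns true, replace the selector with rebuild(selector) and call reset().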

