maven-issues mailing list archives

From "Arnoud Glimmerveen (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (SUREFIRE-1574) Communication between surefire plugin and its forks corrupted by test code
Date Fri, 02 Nov 2018 12:10:00 GMT

    [ https://issues.apache.org/jira/browse/SUREFIRE-1574?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16673018#comment-16673018
] 

Arnoud Glimmerveen commented on SUREFIRE-1574:
----------------------------------------------

[~tibor17]: In my case, stdin in the forked process is consumed by the OSGi test
library Pax Exam. Each test class forks a new JVM, which runs an OSGi container and a probe that
executes the actual test. I am only a user of this library, so I do not know the exact rationale
for it reading from stdin. My guess is that it was done to make the forking of the process
transparent to an external actor: _if_ someone sends something
to the stdin of the test (in my case, the process forked by the maven-surefire-plugin),
the Pax Exam library running in that process forwards it to the process
it forked to run the actual OSGi-based test.

I guess it is debatable who is actually at fault here. My proposal would be to ensure that
the maven-surefire-plugin is not susceptible to these kinds of (mis)uses of shared resources
such as stdin/stdout, and to use different means to communicate between the maven-surefire-plugin
and the processes it forks.
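
To illustrate what "different means" could look like, here is a minimal sketch that sends a
command over a dedicated loopback socket instead of stdin. This is only an assumption about one
possible approach, not how the maven-surefire-plugin works. The class and method names are
hypothetical, and the fork is simulated by a thread; in a real setup the port would have to be
passed to the forked JVM, for example as an argument or system property.

```java
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.util.concurrent.CompletableFuture;

public class DedicatedChannelSketch {

    // Stand-in for the forked JVM (hypothetical): connects to the parent's port
    // and reads one command from the socket, never touching System.in.
    static String forkSide(int port) {
        try (Socket socket = new Socket(InetAddress.getLoopbackAddress(), port);
             DataInputStream in = new DataInputStream(socket.getInputStream())) {
            return in.readUTF();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Stand-in for the plugin side (hypothetical): listens on an ephemeral
    // loopback port, accepts the fork's connection and sends one command.
    static String sendCommand(String command) {
        try (ServerSocket server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress())) {
            int port = server.getLocalPort();
            // The "fork" is simulated by an asynchronous task in the same JVM.
            CompletableFuture<String> fork = CompletableFuture.supplyAsync(() -> forkSide(port));
            try (Socket channel = server.accept();
                 DataOutputStream out = new DataOutputStream(channel.getOutputStream())) {
                out.writeUTF(command);
                out.flush();
            }
            // Returns whatever the simulated fork read from its private channel.
            return fork.join();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("fork received: " + sendCommand("NOOP"));
    }
}
```

Because the socket is only known to the two endpoints, test code that reads from `System.in`
in the fork cannot consume bytes that belong to this channel.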

> Communication between surefire plugin and its forks corrupted by test code
> --------------------------------------------------------------------------
>
>                 Key: SUREFIRE-1574
>                 URL: https://issues.apache.org/jira/browse/SUREFIRE-1574
>             Project: Maven Surefire
>          Issue Type: Bug
>          Components: Maven Surefire Plugin
>    Affects Versions: 2.22.0
>         Environment: Maven 3.5.4
> Surefire 2.22.0
> Pax Exam 4.12.0
>            Reporter: Arnoud Glimmerveen
>            Assignee: Tibor Digana
>            Priority: Blocker
>             Fix For: 3.0.0-M3
>
>
> In a setup with maven-surefire-plugin 2.22.0 with PAX Exam 4.12.0 I occasionally see
that the Maven build takes about 30 seconds longer than expected. The source of this additional
30 seconds is a timeout used to eventually terminate a fork if it does not terminate by itself.
> I have traced the cause of this failure to terminate to the Thread inside the fork that receives
commands from the maven plugin; it terminates (silently) with an IOException. This is triggered
by the validation logic inside MasterProcessCommand, which checks that commands without payload
(NOOP, BYE_ACK) actually carry no data. If some other code reads data from the
same `System.in` InputStream, the MasterProcessCommand.decode method may read data belonging
to subsequent commands.
> Relying on the assumption that the Surefire logic in the forked process is the only one
reading from the 'shared resource' `System.in` makes it vulnerable to this corruption. I see
that the catch clause of IOException in CommandReader also reports this, though I did not
see the '[SUREFIRE] std/in stream corrupted' error in my runs.
> Some of the solutions I thought of:
>  * Replacing the communication protocol (also mentioned here)
>  * Have the CommandRunnable hold an exclusive lock on `System.in` for the entire run()
(wrapping the entire run with synchronized(System.in) {}) to ensure no other thread can read
from the InputStream (as the individual read methods of the BufferedInputStream used for System.in
are synchronized as well). This is a risky move though, as I guess there is no guarantee that
System.in is always implemented by BufferedInputStream.
> I have prepared a project that demonstrates this behaviour: [https://github.com/glimmerveen/fork-test].
The timeout behaviour is visible as the test results are reported on the stdout of Maven,
but the build process does not continue for another 30 seconds or so. Note that the corruption
does not always happen.
> Note that I reported this as blocking not necessarily because of the additional time
it takes for the build to execute, but due to the fact that when the fork is terminated by
the timeout, frameworks like JaCoCo don't get the opportunity to output their results.
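
The decode failure described in the issue can be reproduced in isolation with a small sketch.
It does not use the Surefire classes: the frame layout (4-byte command id, 4-byte payload
length, payload) and all names below are assumptions made for illustration, with command id 0
standing in for a payload-free command such as NOOP. A competing reader that consumes a few
bytes from the shared stream shifts the decoder onto bytes of the next command, and the
"no payload" validation then fails with an IOException.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

public class SharedStreamCorruption {

    // Hypothetical frame layout: 4-byte command id, 4-byte payload length, payload bytes.
    static void putFrame(ByteBuffer wire, int commandId, byte[] payload) {
        wire.putInt(commandId).putInt(payload.length).put(payload);
    }

    // Decodes one frame. Command id 0 (a stand-in for NOOP) must not carry a payload.
    static int decode(DataInputStream in) throws IOException {
        int commandId = in.readInt();
        int length = in.readInt();
        if (length < 0 || (commandId == 0 && length != 0)) {
            throw new IOException("command " + commandId + " has unexpected length " + length);
        }
        in.readFully(new byte[length]); // consume the payload, if any
        return commandId;
    }

    // Writes two frames to a shared stream, lets a competing reader consume
    // stolenBytes bytes first, then tries to decode both frames.
    static String run(int stolenBytes) {
        byte[] payload = "test".getBytes(StandardCharsets.US_ASCII);
        ByteBuffer wire = ByteBuffer.allocate(8 + 8 + payload.length);
        putFrame(wire, 0, new byte[0]); // payload-free command
        putFrame(wire, 1, payload);     // command with a 4-byte payload
        ByteArrayInputStream shared = new ByteArrayInputStream(wire.array());

        // The competing reader (for example test code reading stdin).
        for (int i = 0; i < stolenBytes; i++) {
            shared.read();
        }

        DataInputStream in = new DataInputStream(shared);
        try {
            int first = decode(in);
            int second = decode(in);
            return "decoded " + first + "," + second;
        } catch (IOException e) {
            return "corrupted: " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(run(0)); // no competing reader
        System.out.println(run(4)); // competing reader consumed four bytes
    }
}
```

With no competing reader both frames decode. After four bytes are consumed elsewhere, the
decoder reads the first command's length field as a command id and the second command's id as
a length, so validation rejects the frame.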



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
