mina-dev mailing list archives

From "Goldstein Lyor (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (SSHD-633) Race condition in command execution and SSH_MSG_CHANNEL_SUCCESS
Date Thu, 28 Jan 2016 11:05:39 GMT

    [ https://issues.apache.org/jira/browse/SSHD-633?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15121233#comment-15121233 ]

Goldstein Lyor commented on SSHD-633:

I don't understand the exact sequence of messages that you claim can lead to the race condition
you describe. I do have a few thoughts on the issue though - see below.

{quote}What I did is to start sending a huge data block without creating a thread in command's
start method. Of course this is a bit incorrect, but it easily make it fail.{quote}

This is not "a bit" but rather *entirely* incorrect. The _Command#start_ method implementation
*must* use a *separate* thread to consume the input/output streams or they will get stuck
eventually - so no surprise that they do in your code.
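
To illustrate the threading rule, here is a minimal sketch that uses only JDK classes. The class _ThreadedCommandSketch_ and its methods are hypothetical stand-ins, not the SSHD _Command_ interface: the point is only that _start_ spawns a worker and returns, while all blocking stream I/O happens on that worker.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

// Hypothetical stand-in for a command implementation; NOT the SSHD Command interface.
class ThreadedCommandSketch {
    private final InputStream in;
    private final OutputStream out;
    private final int payloadSize;
    private volatile int exitCode = -1; // -1 means "still running"
    private Thread worker;

    ThreadedCommandSketch(InputStream in, OutputStream out, int payloadSize) {
        this.in = in;
        this.out = out;
        this.payloadSize = payloadSize;
    }

    // Same role as Command#start: only spawn the worker, then return at once.
    void start() {
        worker = new Thread(this::pump, "command-pump");
        worker.setDaemon(true);
        worker.start();
    }

    // All blocking stream I/O happens here, off the caller's thread.
    private void pump() {
        try {
            out.write(new byte[payloadSize]);
            out.flush();
            in.read(); // wait for one byte (or EOF) from the peer
            exitCode = 0;
        } catch (IOException e) {
            exitCode = 1;
        }
    }

    // Waits up to timeoutMillis for the worker and returns the exit code so far.
    int awaitExit(long timeoutMillis) throws InterruptedException {
        worker.join(timeoutMillis);
        return exitCode;
    }
}
```

If the output stream is a pipe whose buffer is smaller than the payload, _start_ still returns right away, because the write that blocks on a full buffer runs on the worker thread rather than on the caller.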

There are several things you should try - especially for applications that have large amounts
of data to transfer:

* Increase the number of worker threads for the SSH server
* Increase the default packet size used for channel data transfer

    SshServer sshd = ...set up the server...
    // try playing with the value and see how it affects you
    PropertyResolverUtils.updateProperty(sshd, FactoryManager.NIO_WORKERS, 8);
    // 48 KB - do not go beyond 65535 until version 1.2 is released...
    PropertyResolverUtils.updateProperty(sshd, FactoryManager.MAX_PACKET_SIZE, 48 * 1024);

In any case, I also recommend trying version 1.1 (to be released very soon) since many bug
fixes and optimizations have been added to it.

> Race condition in command execution and SSH_MSG_CHANNEL_SUCCESS
> ---------------------------------------------------------------
>                 Key: SSHD-633
>                 URL: https://issues.apache.org/jira/browse/SSHD-633
>             Project: MINA SSHD
>          Issue Type: Bug
>    Affects Versions: 1.0.0
>            Reporter: Eugene Petrenko
>            Priority: Critical
> We use the library in production. From time to time we see timeout exceptions from clients
> that call our SSH server. It was not clear what caused them. Most of the time I saw the command
> stuck reading STDIN. We use JSCH as the client here. The problem reproduced rarely.
> I connect JSCH 1.51 to the SSHD server to execute a command.
> The JSCH client expects to read {{SSH_MSG_CHANNEL_SUCCESS}} or {{SSH_MSG_CHANNEL_FAILURE}}
> as the response to the {{SSH_MSG_CHANNEL_REQUEST}} call to execute the command.
> The SSHD implementation calls the command's {{start}} method and then posts the success reply
> to the channel. It can easily happen that the command thread fills the send window with DATA
> messages. Thus the {{SSH_MSG_CHANNEL_SUCCESS}} reply is not delivered.
> This makes JSCH wait for the message and fail on timeout. The SSHD server command
> is simply stuck reading stdin.
> Here is the code with which I managed to reproduce the issue. What I did is to start sending a
> huge data block without creating a thread in the command's start method. Of course this is a bit
> incorrect, but it easily makes it fail.
> The server contains the following command implementation:
> {code}
>     final OutputStream out = getOut();
>     out.write(new byte[32 * 1024 * 1024]);
>     out.flush();
>     new Thread(new Runnable() {
>       @Override
>       public void run() {
>         try {
>           getIn().read();
>           onExit(0);
>         } catch (Throwable e) {
>           onExit(1, e.getMessage());
>         }
>       }
>     }).start();
> {code} 
> The JSch client code is as follows:
> {code}
>         final JSch j = new JSch();
>         final Session session = j.getSession("jonnyzzz", myResource.getHostname(), myResource.getSSHPort());
>         session.setTimeout(60_000);
>         session.connect();
>         final ChannelExec e = (ChannelExec) session.openChannel("exec");
>         e.setCommand("test-buffer-underrun");
>         final InputStream inputStream = e.getInputStream();
>         e.connect(10_000); //meaningful timeout to reproduce the bug
>         ByteStreams.copy(inputStream, FileUtil.nullOutputStream());
> {code}

This message was sent by Atlassian JIRA