hadoop-common-issues mailing list archives

From "Steve Loughran (Commented) (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HADOOP-7397) Allow configurable timeouts when connecting to HDFS via java FileSystem API
Date Sat, 18 Feb 2012 21:39:59 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-7397?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13211109#comment-13211109 ]

Steve Loughran commented on HADOOP-7397:

I'd like to get this into 0.23+, but the patch will need some reworking to fit the (changed)
source file before it will apply.

Some other recommendations:
* define a key for the maximum number of connection retries too, rather than the hard-coded
value of 45 which is there now (I think that may be a new feature of 0.23+)
* move both keys into CommonConfigurationKeys
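
To illustrate the two recommendations, here is a minimal sketch of keeping both keys and their defaults in one place. It is not the patch and not Hadoop code: the class `ConnectKeysSketch`, its `getInt` helper and both key names are hypothetical, and `java.util.Properties` stands in for the real configuration object. The defaults are the 20 second timeout and 45 retries quoted in the issue.

```java
import java.util.Properties;

/**
 * Hypothetical key holder: keeps each configuration name next to its default.
 * The key names below are illustrative and not taken from any Hadoop release.
 */
final class ConnectKeysSketch {
    // Connection timeout in milliseconds; default mirrors the 20 s in the issue.
    static final String CONNECT_TIMEOUT_KEY = "example.ipc.connect.timeout.ms";
    static final int CONNECT_TIMEOUT_DEFAULT = 20_000;

    // Maximum connection retries; default mirrors the hard-coded 45.
    static final String CONNECT_MAX_RETRIES_KEY = "example.ipc.connect.max.retries";
    static final int CONNECT_MAX_RETRIES_DEFAULT = 45;

    private ConnectKeysSketch() {}

    /** Returns the integer stored under key, or defaultValue if the key is unset. */
    static int getInt(Properties conf, String key, int defaultValue) {
        String raw = conf.getProperty(key);
        return raw == null ? defaultValue : Integer.parseInt(raw.trim());
    }
}
```

With both constants in one class, client code would read the retry limit through something like `getInt(conf, CONNECT_MAX_RETRIES_KEY, CONNECT_MAX_RETRIES_DEFAULT)` instead of embedding the literal 45.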

As I noted in HADOOP-3456, this is going to be hard to write a test for.
> Allow configurable timeouts when connecting to HDFS via java FileSystem API
> ---------------------------------------------------------------------------
>                 Key: HADOOP-7397
>                 URL: https://issues.apache.org/jira/browse/HADOOP-7397
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: ipc
>    Affects Versions: 0.20.2, 0.23.0
>         Environment: Any
>            Reporter: Scott Fines
>            Priority: Minor
>              Labels: hadoop
>             Fix For: 0.24.0
>         Attachments: timeout.patch
> If the NameNode is not available (in, for example, a network partition event separating
> the client from the NameNode), and an attempt is made to connect, then the FileSystem API
> will *eventually* time out and throw an error. However, that timeout is currently hardcoded
> to be 20 seconds to connect, with 45 retries, for a total of a 15-minute wait before failure.
> While in many circumstances this is fine, there are also many circumstances (such as booting
> a service) where both the connection timeout and the number of retries should be significantly
> lower, so as not to harm availability of other services.
> Investigating Client.java, I see that there are two fields in Connection: maxRetries
> and rpcTimeout. I propose either re-using those fields for initiating the connection as well
> or, alternatively, using the already existing dfs.socket.timeout parameter to set the connection
> timeout on initialization, and potentially adding a new field such as dfs.connection.retries
> with a default of 45 to replicate current behavior.
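
The behaviour described in the issue can be sketched in plain Java as follows. This is an illustration under stated assumptions, not the contents of timeout.patch or of Client.java: the class `ConnectRetrySketch` and its methods are hypothetical, `java.util.Properties` stands in for the Hadoop configuration, and the two key names are the ones proposed in the description. The 20000 ms default simply mirrors the 20 seconds quoted above, and the retry value is treated as the total number of attempts so that the arithmetic matches the stated 15 minutes.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.net.SocketTimeoutException;
import java.util.Properties;

/** Hypothetical sketch of a connect loop with configurable timeout and attempts. */
final class ConnectRetrySketch {
    private ConnectRetrySketch() {}

    /** Worst-case wait when every attempt runs into the connect timeout. */
    static long worstCaseWaitMillis(int timeoutMillis, int attempts) {
        return (long) timeoutMillis * attempts;
    }

    /** Tries to connect, retrying only when an attempt times out. */
    static Socket connect(InetSocketAddress addr, Properties conf) throws IOException {
        // Key names come from the issue text; defaults replicate the hard-coded values.
        int timeoutMillis = Integer.parseInt(conf.getProperty("dfs.socket.timeout", "20000"));
        int attempts = Math.max(1,
                Integer.parseInt(conf.getProperty("dfs.connection.retries", "45")));

        SocketTimeoutException last = null;
        for (int attempt = 1; attempt <= attempts; attempt++) {
            Socket socket = new Socket();
            try {
                socket.connect(addr, timeoutMillis); // blocks for at most timeoutMillis
                return socket;
            } catch (SocketTimeoutException e) {
                socket.close(); // timed out: release the socket and try again
                last = e;
            } catch (IOException e) {
                socket.close(); // any other failure is reported immediately
                throw e;
            }
        }
        throw last; // every attempt timed out
    }
}
```

With the defaults, `worstCaseWaitMillis(20000, 45)` is 900000 ms, the 15 minutes mentioned in the description. A service that must fail fast at boot could set, for example, a 2000 ms timeout and 3 attempts, bounding the wait at 6 seconds.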


