hadoop-common-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Todd Lipcon (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-6889) Make RPC to have an option to timeout
Date Thu, 29 Jul 2010 23:55:16 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-6889?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12893863#action_12893863
] 

Todd Lipcon commented on HADOOP-6889:
-------------------------------------

Hi Hairong. I agree that it should be per-proxy, and also that this is useful. We have a customer
who has occasionally run into this with problematic "stuck" datanodes blocking pipeline recovery
indefinitely (at least until ops notices the hung machine and powers it down, causing TCP
connection to drop).

> Make RPC to have an option to timeout
> -------------------------------------
>
>                 Key: HADOOP-6889
>                 URL: https://issues.apache.org/jira/browse/HADOOP-6889
>             Project: Hadoop Common
>          Issue Type: New Feature
>          Components: ipc
>    Affects Versions: 0.22.0
>            Reporter: Hairong Kuang
>            Assignee: Hairong Kuang
>             Fix For: 0.22.0, 0.20-append
>
>
> Currently Hadoop RPC does not timeout when the RPC server is alive. What it currently
does is that a RPC client sends a ping to the server whenever a socket timeout happens. If
the server is still alive, it continues to wait instead of throwing a SocketTimeoutException.
This is to avoid a client to retry when a server is busy and thus making the server even busier.
This works great if the RPC server is NameNode.
> But Hadoop RPC is also used for some of client to DataNode communications, for example,
for getting a replica's length. When a client comes across a problematic DataNode, it gets
stuck and can not switch to a different DataNode. In this case, it would be better that the
client receives a timeout exception.
> I plan to add a new configuration ipc.client.max.pings that specifies the max number
of pings that a client could try. If a response can not be received after the specified max
number of pings, a SocketTimeoutException is thrown. If this configuration property is not
set, a client maintains the current semantics, waiting forever.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


Mime
View raw message