hadoop-common-issues mailing list archives

From "Arpit Agarwal (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HADOOP-15486) Make NetworkTopology#netLock fair
Date Tue, 22 May 2018 21:43:00 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-15486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16484630#comment-16484630 ]

Arpit Agarwal commented on HADOOP-15486:

Build failure is unrelated - re-triggered Jenkins.

> Make NetworkTopology#netLock fair
> ---------------------------------
>                 Key: HADOOP-15486
>                 URL: https://issues.apache.org/jira/browse/HADOOP-15486
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: net
>            Reporter: Nanda kumar
>            Assignee: Nanda kumar
>            Priority: Major
>         Attachments: HADOOP-15486.000.patch, HADOOP-15486.001.patch
> Whenever a datanode is restarted, the registration call that the NameNode receives
> after the restart lands in {{NetworkTopology#add}} via {{DatanodeManager#registerDatanode}},
> which requires the write lock on {{NetworkTopology#netLock}}. This registration thread is
> starved by a flood of {{FSNamesystem.getAdditionalDatanode}} calls, which are triggered by
> the clients that were writing to the restarted datanode.
> While the registration call waits for the write lock on {{NetworkTopology#netLock}}, it
> holds the write lock on {{FSNamesystem#fsLock}}, so every other RPC call that requires
> {{FSNamesystem#fsLock}} has to wait as well.
> We can make {{NetworkTopology#netLock}} a fair lock so that the registration thread does
> not starve.
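
For illustration only, here is a minimal sketch of what a fair read-write lock looks like in Java, assuming netLock is a java.util.concurrent.locks.ReentrantReadWriteLock. The class FairTopologySketch and its methods are hypothetical stand-ins and are not taken from the attached patches or from the Hadoop source. Passing true to the constructor selects the fair ordering policy, under which a thread requesting the read lock (non-reentrantly) blocks when a writer is already waiting, so a queued writer is not bypassed by later readers.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.locks.ReentrantReadWriteLock;

/** Hypothetical stand-in for a topology guarded by a read-write lock. */
public class FairTopologySketch {
    // true = fair ordering policy: contended acquisitions are granted
    // approximately in arrival order, so a waiting writer is not passed
    // over indefinitely by newly arriving readers.
    private final ReentrantReadWriteLock netLock;
    private final Set<String> nodes = new HashSet<>();

    public FairTopologySketch(boolean fair) {
        this.netLock = new ReentrantReadWriteLock(fair);
    }

    /** Write path, analogous to a node registration. */
    public boolean add(String node) {
        netLock.writeLock().lock();
        try {
            return nodes.add(node);
        } finally {
            netLock.writeLock().unlock();
        }
    }

    /** Read path, analogous to the frequent lookup calls. */
    public boolean contains(String node) {
        netLock.readLock().lock();
        try {
            return nodes.contains(node);
        } finally {
            netLock.readLock().unlock();
        }
    }

    public boolean isFair() {
        return netLock.isFair();
    }

    public static void main(String[] args) {
        FairTopologySketch topology = new FairTopologySketch(true);
        topology.add("/rack1/dn1");
        // prints "true true"
        System.out.println(topology.isFair() + " " + topology.contains("/rack1/dn1"));
    }
}
```

The trade-off is throughput: the ReentrantReadWriteLock documentation notes that a non-fair lock normally has higher throughput than a fair one, but may postpone individual threads indefinitely under continuous contention.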

This message was sent by Atlassian JIRA

