hadoop-common-issues mailing list archives

From "Hadoop QA (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HADOOP-10251) Both NameNodes could be in STANDBY State if SNN network is unstable
Date Tue, 04 Feb 2014 17:56:13 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-10251?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13890924#comment-13890924 ]

Hadoop QA commented on HADOOP-10251:

{color:red}-1 overall{color}.  Here are the results of testing the latest attachment against trunk revision .

    {color:green}+1 @author{color}.  The patch does not contain any @author tags.

    {color:green}+1 tests included{color}.  The patch appears to include 1 new or modified test files.

    {color:green}+1 javac{color}.  The applied patch does not increase the total number of javac compiler warnings.

    {color:red}-1 javadoc{color}.  The javadoc tool appears to have generated 2 warning messages.

    {color:green}+1 eclipse:eclipse{color}.  The patch built with eclipse:eclipse.

    {color:green}+1 findbugs{color}.  The patch does not introduce any new Findbugs (version 1.3.9) warnings.

    {color:green}+1 release audit{color}.  The applied patch does not increase the total number of release audit warnings.

    {color:green}+1 core tests{color}.  The patch passed unit tests in hadoop-common-project/hadoop-common.

    {color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-HADOOP-Build/3528//testReport/
Console output: https://builds.apache.org/job/PreCommit-HADOOP-Build/3528//console

This message is automatically generated.

> Both NameNodes could be in STANDBY State if SNN network is unstable
> -------------------------------------------------------------------
>                 Key: HADOOP-10251
>                 URL: https://issues.apache.org/jira/browse/HADOOP-10251
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: ha
>    Affects Versions: 2.2.0
>            Reporter: Vinay
>            Assignee: Vinay
>            Priority: Critical
>         Attachments: HADOOP-10251.patch, HADOOP-10251.patch, HADOOP-10251.patch, HADOOP-10251.patch
> The following corner-case scenario occurred in one of our clusters.
> 1. NN1 was Active and NN2 was Standby.
> 2. The NN2 machine's network was slow.
> 3. NN1 was shut down.
> 4. NN2's ZKFC got the notification and tried to check for the old active for fencing. (This took a little more time, again due to the slow network.)
> 5. In between, NN1 was restarted by our automatic monitoring, and ZKFC made it Active.
> 6. NN2's ZKFC then found the old active to be NN1 and gracefully fenced NN1 to STANDBY.
> 7. Before writing ActiveBreadCrumb to ZK, NN2's ZKFC got a session timeout and shut down before making NN2 Active.
> *Now the cluster has both NameNodes in STANDBY.*
> NN1's ZKFC still thinks that its NameNode is in the Active state.
> NN2's ZKFC is waiting for election.
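
The reported sequence can be replayed as a small state model. The sketch below is illustrative only and is not Hadoop code: the class `FailoverRaceSketch`, the `HAState` enum, and all field names are hypothetical. It assumes the old active that NN2's ZKFC read in step 4 was NN1, which is the node it fenced in step 6. It applies steps 1 to 7 in order and then checks whether any NameNode is left Active.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical replay of the reported event order; names are not Hadoop APIs. */
class FailoverRaceSketch {
    enum HAState { ACTIVE, STANDBY, DOWN }

    static final class Cluster {
        HAState nn1 = HAState.ACTIVE;          // step 1: NN1 Active
        HAState nn2 = HAState.STANDBY;         // step 1: NN2 Standby
        boolean nn1ZkfcBelievesActive = true;  // NN1 ZKFC's own view of its NameNode
        boolean nn2ZkfcSessionLost = false;    // set when NN2 ZKFC times out
        final List<String> log = new ArrayList<>();
    }

    static Cluster replay() {
        Cluster c = new Cluster();

        // Step 3: NN1 goes down.
        c.nn1 = HAState.DOWN;
        c.log.add("NN1 shut down");

        // Step 4: NN2 ZKFC is notified and reads the old active (slowly).
        // Assumption: the value it read was NN1.
        String oldActive = "NN1";
        c.log.add("NN2 ZKFC read old active = " + oldActive);

        // Step 5: meanwhile NN1 is restarted and made Active again.
        c.nn1 = HAState.ACTIVE;
        c.nn1ZkfcBelievesActive = true;
        c.log.add("NN1 restarted and made Active");

        // Step 6: NN2 ZKFC acts on its stale read and gracefully fences NN1.
        // NN1's ZKFC is not informed, so its belief is left unchanged.
        if (oldActive.equals("NN1")) {
            c.nn1 = HAState.STANDBY;
            c.log.add("NN2 ZKFC fenced NN1 to STANDBY");
        }

        // Step 7: NN2 ZKFC loses its session before writing the breadcrumb
        // and before making NN2 Active, so NN2 stays Standby.
        c.nn2ZkfcSessionLost = true;
        c.log.add("NN2 ZKFC session timed out before promoting NN2");

        return c;
    }

    /** True when neither NameNode is Active. */
    static boolean noActiveNameNode(Cluster c) {
        return c.nn1 != HAState.ACTIVE && c.nn2 != HAState.ACTIVE;
    }

    public static void main(String[] args) {
        Cluster c = replay();
        c.log.forEach(System.out::println);
        System.out.println("NN1=" + c.nn1 + " NN2=" + c.nn2
                + " noActive=" + noActiveNameNode(c)
                + " nn1ZkfcBelievesActive=" + c.nn1ZkfcBelievesActive);
    }
}
```

The model keeps the NameNode state and the ZKFC's belief in separate fields because the report turns on them diverging: NN1 is in STANDBY while its ZKFC still treats it as Active, so nothing triggers a new election on NN1's side.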

This message was sent by Atlassian JIRA
