hbase-user mailing list archives

From Steve Loughran <ste...@hortonworks.com>
Subject Re: Question about dead datanode
Date Mon, 24 Feb 2014 14:23:39 GMT
That's a very old version of Cloudera's branch you are working with there;
patching it is not a good way to go, as it puts you on the slippery slope of
maintaining your own private branch, with all the costs that brings.

It looks like the dead node logic has moved to DFSInputStream, where it is still

  /* XXX Use of CocurrentHashMap is temp fix. Need to fix
   * parallel accesses to DFSInputStream (through ptreads) properly */
  private final ConcurrentHashMap<DatanodeInfo, DatanodeInfo> deadNodes =
             new ConcurrentHashMap<DatanodeInfo, DatanodeInfo>();

This implies the problem still exists, and with it the opportunity to fix
it, but you will need to modify your patch to apply to Hadoop trunk, ideally
think of a test, then submit the patch to the HDFS project on JIRA.

On 19 February 2014 04:48, Stack <stack@duboce.net> wrote:

> On Sat, Feb 15, 2014 at 8:01 PM, Jack Levin <magnito@gmail.com> wrote:
> > Looks like I patched it in DFSClient.java, here is the patch:
> > https://gist.github.com/anonymous/9028934
> >
> > ....
> > I moved 'deadNodes' list outside as global field that is accessible by
> > all running threads, so at any point datanode does go down, each
> > thread is basically informed that the datanode _is_ down.
> >
> We need to add something like this to current versions of DFSClient, a
> global status, so each stream does not have to discover bad DNs for itself.
> St.Ack
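
For illustration only, here is a minimal sketch of the client-wide dead-node
status discussed above. The class SharedDeadNodes, its method names, the use
of plain string datanode ids, and the expiry timeout are all assumptions made
for this sketch; none of them come from DFSClient or from the linked patch.

```java
import java.util.concurrent.ConcurrentHashMap;

/**
 * Hypothetical registry shared by every input stream of one client, so that
 * a datanode found dead by one stream is skipped by the others.
 * Not part of Hadoop; names and behaviour are illustrative assumptions.
 */
final class SharedDeadNodes {
    // datanode id -> time in ms after which the node may be tried again
    private final ConcurrentHashMap<String, Long> dead = new ConcurrentHashMap<>();
    private final long ttlMillis;

    SharedDeadNodes(long ttlMillis) {
        this.ttlMillis = ttlMillis;
    }

    /** Record that a stream failed to reach this datanode at time nowMillis. */
    void markDead(String datanodeId, long nowMillis) {
        dead.put(datanodeId, nowMillis + ttlMillis);
    }

    /** True while the datanode is still inside its dead window. */
    boolean isDead(String datanodeId, long nowMillis) {
        Long expiry = dead.get(datanodeId);
        if (expiry == null) {
            return false;
        }
        if (nowMillis >= expiry) {
            // Remove only if no other thread re-marked the node in the meantime.
            dead.remove(datanodeId, expiry);
            return false;
        }
        return true;
    }
}
```

The expiry is an assumption added so that a node which comes back is
eventually retried; the time is passed in explicitly to keep the sketch easy
to test.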
