nutch-dev mailing list archives

From michael_cafare...@comcast.net
Subject Re: [Nutch-dev] Re: NameNode scalability
Date Mon, 07 Mar 2005 22:38:12 GMT

  Hi,

  This is very interesting, thanks, Angel.

  Doug's right about the datanode startup and replication problem.  I believe there's a simple
fix for the problem you describe when starting up all the datanodes.
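
  Roughly, the idea is to throttle how much re-replication work the namenode hands out at a
time, and perhaps to give datanodes a short grace period after namenode startup to report their
blocks before any replication is scheduled.  The sketch below is only meant to illustrate the
shape of such a throttle -- the class and method names are made up and are not the actual NDFS
code:

    import java.util.LinkedList;
    import java.util.Queue;

    /** Illustrative throttle for block re-replication (not the real NDFS code). */
    public class ReplicationThrottle {
        /** Blocks the namenode currently believes are under-replicated. */
        private final Queue<String> neededReplications = new LinkedList<String>();

        /** No replication is scheduled until this long after namenode startup,
            so that slow datanodes get a chance to report their blocks. */
        private static final long STARTUP_GRACE_MS = 5 * 60 * 1000;

        /** Cap on replication requests handed to one datanode per heartbeat. */
        private static final int MAX_REPLICATIONS_PER_HEARTBEAT = 2;

        private final long startTime = System.currentTimeMillis();

        public synchronized void addUnderReplicated(String blockId) {
            neededReplications.add(blockId);
        }

        /** Called when a datanode heartbeats: returns at most a couple of blocks
            for it to copy, instead of everything at once. */
        public synchronized String[] getReplicationWork() {
            if (System.currentTimeMillis() - startTime < STARTUP_GRACE_MS) {
                return new String[0];    // still in the startup grace period
            }
            int n = Math.min(MAX_REPLICATIONS_PER_HEARTBEAT, neededReplications.size());
            String[] work = new String[n];
            for (int i = 0; i < n; i++) {
                work[i] = neededReplications.remove();
            }
            return work;
        }
    }

  With a cap like that, a half-started cluster only trickles out replication work instead of
swamping the namenode and the first few datanodes that come up.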

  He's also probably right about the namenode startup.  A Namenode logs all its activity while
running.  Upon startup, it reads any log files that have been hanging around and creates a
checkpoint.  If you are creating a very large number of files and then restarting, this checkpoint
could take a long time to compute.  We might fix this by computing checkpoints every X operations,
instead of just at namenode startup.  
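
  Concretely, that might look something like the sketch below -- just an illustration of the
bookkeeping, with made-up names that do not match the actual namenode code:

    import java.io.IOException;

    /** Illustrative edit log that checkpoints every N operations (sketch only). */
    public class CheckpointingLog {
        /** How many logged operations to allow before writing a fresh image. */
        private static final int OPS_PER_CHECKPOINT = 50000;

        private int opsSinceCheckpoint = 0;

        public synchronized void logOperation(String op) throws IOException {
            appendToEditLog(op);
            if (++opsSinceCheckpoint >= OPS_PER_CHECKPOINT) {
                saveCheckpoint();
                opsSinceCheckpoint = 0;
            }
        }

        private void appendToEditLog(String op) throws IOException {
            // append the operation record to the on-disk log (details omitted)
        }

        private void saveCheckpoint() throws IOException {
            // dump the full filename->blocks table to a new image file and
            // truncate the log, so a restart only has to read the image plus
            // a short tail of operations (details omitted)
        }
    }

  That way a namenode that has created 400,000 files never has to replay 400,000 log records at
startup; it just reads the latest image plus whatever small tail of the log came after it.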

  However, the bigger problem is just the number of files you're using.  We've performed a
lot of tests with systems that have very large files (>100 gigs long), but the actual number
of files has always been pretty small.  NDFS was designed to handle systems that generate
huge files, like Nutch crawls, but there generally are not many filenames.  There are probably
a number of places in the namenode code where we assume the number of files is not large.

  That said, there's nothing inherent about the design of NDFS that makes it hard to tackle
large numbers of files.  We have just not yet worked much on this case.
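
  One concrete example is the kind of thing Angel mentions below: FSDirectory keeps a
directory's children in a plain Vector, so per-file operations scan the list linearly, which is
fine for dozens of children but hurts with hundreds of thousands of files.  The fragment below
is just an illustration of the difference, not the actual FSDirectory code:

    import java.util.TreeMap;
    import java.util.Vector;

    /** Illustration only: linear scan over a Vector vs. lookup in a sorted map. */
    public class ChildLookup {
        /** O(number of children) per lookup, the way a plain Vector behaves. */
        static boolean containsChild(Vector<String> children, String name) {
            for (String child : children) {
                if (child.equals(name)) {
                    return true;
                }
            }
            return false;
        }

        /** O(log number of children) per lookup with a sorted map keyed by name. */
        static boolean containsChild(TreeMap<String, Object> children, String name) {
            return children.containsKey(name);
        }
    }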

  OK, but what can you do right now to improve things?
  1)  Start the namenode first.  Wait for it to quiet down.  Then start datanodes.
  2)  Try raising the number of threads dedicated to RPC.  Check out NDFS, line 72, where
the Server initializes its parent, the net.nutch.ipc.Server (see the sketch after this list).
  3)  We'll have a fix for the Datanode replication problem shortly.  You might try again
with the new code.
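
  The point of suggestion 2 is simply to give the namenode more threads that can each be stuck
inside an IPC call at the same time.  As a generic illustration of what that knob controls
(this is not the net.nutch.ipc.Server code, just the shape of the idea):

    import java.io.IOException;
    import java.net.ServerSocket;
    import java.net.Socket;
    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;

    /** Toy RPC-style server; handlerCount is the knob suggestion 2 refers to. */
    public class ToyRpcServer {
        private final ExecutorService handlers;
        private final ServerSocket listener;

        public ToyRpcServer(int port, int handlerCount) throws IOException {
            this.handlers = Executors.newFixedThreadPool(handlerCount);
            this.listener = new ServerSocket(port);
        }

        public void serve() throws IOException {
            while (true) {
                final Socket call = listener.accept();
                handlers.submit(new Runnable() {
                    public void run() {
                        handle(call);    // each handler thread serves one call at a time
                    }
                });
            }
        }

        private void handle(Socket call) {
            // read the request, dispatch it, write the response (details omitted)
            try {
                call.close();
            } catch (IOException ignored) {
            }
        }
    }

  With too few handler threads, a burst of datanode heartbeats and block reports can tie all of
them up, and ordinary client calls like --report start failing with "Problem making IPC call".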

  --Mike


> Thanks for the report!
> 
> 400,000 is a larger number of files than I have yet tested NDFS with, 
> and it looks like there are some issues caused by this.  Mike has built 
> the largest NDFS systems that I know of (several terabytes spread over 
> around 20 machines) but these probably had less than a thousand files.
> 
> The namenode logs events and replays the log at startup.  It should also 
> periodically checkpoint the full set of names, so that startup is 
> faster, to avoid replaying the log of events.  Can you tell whether the 
> slow startup is due to replaying of a log, or reloading a checkpoint?
> 
> There is a known problem that, when data is not replicated sufficiently, 
> the system does not throttle replication.  It sounds like this is what 
> is happening when you start your datanodes.  When only a fraction of 
> them have started the system thinks there are lots of unreplicated 
> chunks and frantically tries to replicate them all at once.  I wonder if 
> the system would start more smoothly if you started the datanodes first, 
> then the namenode?
> 
> Mike is the primary author of NDFS.  Mike, do you have more ideas or 
> suggestions about how to help Angel?
> 
> Doug
> 
> Angel Faus wrote:
> > Hi,
> > 
> > I have been doing some tests to find out if NDFS can be used at our
> > company to reliably store many files (both small and big) across a
> > cluster of cheap servers.
> > 
> > The short summary is that right now NDFS doesn't look viable for our needs. 
> > 
> > I am sending the results of the test to the list, in case it is of any
> > interest.
> > 
> > We created about 400,000 files, each consisting of one or a small number
> > of blocks, and placed them in a cluster of 8 DataNodes (and 1 NameNode).
> > Since FSDirectory stores children as a simple Vector, we took care not
> > to create any directory with more than 100 files.
> > 
> > Performance degraded as we added more files, and we eventually ran
> > into some system limits in the NameNode (too many open files,
> > out-of-memory errors) that were solved the usual way (increasing
> > ulimit, adding more memory to the heap).
> > 
> > Afterwards, performance degradation continued until most connections
> > could not be established ("Problem making IPC call").
> > 
> > This happened with a fairly small image/fsimage file (just 65 MB).
> > 
> > Anyway, this allowed us to set up an installation to try NDFS.
> > 
> > These are the results for the measurements of just starting up the
> > NutchFileSystem. Measurements are taken with iostat. All 9 machines
> > are single-CPU Xeon 2.4 GHz with 512 MB of RAM (Linux kernel 2.4.20 with
> > reiserfs). This is nutch-0.7-dev right from CVS.
> > 
> > --------------------
> > 1) Launch of NameNode:
> > 
> > Reading the filenames->blocks table takes about 2 minutes with the
> > CPU at 100% and very low IO activity.
> > 
> > During that period the Namenode is not available for --report queries.
> > 
> > Afterwards CPU activity goes to 0 and --report queries work fine. 
> > 
> > This step is definitely CPU bound.
> > 
> > --------------------
> > 2) Launch of a single DataNode:
> > 
> > After launching a single DataNode (80,000 blocks), the CPU of the
> > NameNode is at 13%, and the CPU of the DataNode is at 100% (76% sys,
> > 24% user).
> > 
> > General operations on NDFS (--report, --ls) keep working.
> > 
> > --------------------
> > 3) Launching additional DataNodes
> > 
> > Launching a second and a third DataNode maintains the pattern (an
> > additional 13% CPU on the NameNode for each DataNode, and roughly 100%
> > [76% sys, 24% user] CPU load at each DataNode).
> > 
> > But launching the whole set (9 DataNodes) changes the situation.
> > 
> > On the NameNode
> > 
> >  * Sustained IO of over 10,000 blk_read/s and 4,000 blk_wrtn/s (no
> > significant IO activity before).
> >
> >  * iowait gets to 90%; idle time sinks to 0%.
> >
> >  * Simple --report or -ls / queries to NDFS fail or require many
> > attempts before success ("Problem making IPC call...").
> > 
> > --------------------
> > 
> > So, it's enough to just start up the whole set of DataNodes to
> > effectively make the NDFS unavailable. No actual activity (from the fs
> > user) is needed.
> > 
> > I understand the 100% CPU usage while loading the filename->blocks
> > table, but... what can be causing such a high amount of IO on the
> > NameNode?
> > 
> > I can try to narrow this further if there is any interest: adding more
> > logging code in nutchfs, more stats, different use cases, etc.
> > 
> > On the other hand, maybe this kind of installation is just not the
> > problem NDFS intends to address.
> > 
> > Best,
> > 
> > 
> > angel
> 
> 
