hbase-user mailing list archives

From "Buttler, David" <buttl...@llnl.gov>
Subject web interface is fragile?
Date Thu, 01 Apr 2010 00:51:24 GMT
Hi,
I have a small cluster (6 nodes: 1 master and 5 region server/data nodes).  Each node has
plenty of memory and disk: 16GB of heap dedicated to the RegionServer, and 4TB of disk per node
for HDFS.
I have a table with about 1 million rows in HBase - that's all.  Currently it is split across
50 regions.
I was monitoring this with the HBase web GUI and I noticed that a lot of the heap was being
used (14GB).  I was running an MR job and I was getting an error on the console that launched
the job:
Error: GC overhead limit exceeded hbase
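
For reference, the job is just an ordinary table-mapper driver along these lines (sketch only;
"mytable", RowMapper, and the caching value are placeholders, not my actual code):

// Sketch of a plain table-mapper driver; "mytable" and RowMapper are
// placeholders, not the actual job from this post.
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil;
import org.apache.hadoop.hbase.mapreduce.TableMapper;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.output.NullOutputFormat;

public class ScanTableJob {

  // Mapper that only counts rows; the real job does more, but the shape is the same.
  static class RowMapper extends TableMapper<ImmutableBytesWritable, LongWritable> {
    @Override
    protected void map(ImmutableBytesWritable row, Result value, Context context) {
      context.getCounter("scan", "rows").increment(1);
    }
  }

  public static void main(String[] args) throws Exception {
    HBaseConfiguration conf = new HBaseConfiguration();
    Job job = new Job(conf, "scan-mytable");
    job.setJarByClass(ScanTableJob.class);

    Scan scan = new Scan();
    scan.setCaching(500);   // rows fetched per RPC; each map task holds this many in memory

    TableMapReduceUtil.initTableMapperJob(
        "mytable", scan, RowMapper.class,
        ImmutableBytesWritable.class, LongWritable.class, job);
    job.setOutputFormatClass(NullOutputFormat.class);
    job.setNumReduceTasks(0);
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}

The scan caching there is the only knob I'm aware of that directly controls how many rows a
scanning task holds in memory at once, so I mention it for completeness.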

First question: is this going to hose the whole system?  I didn't see the error in any of
the HBase logs, so I assume it was purely a client issue.

So, naively thinking that maybe the GC had moved everything to PermGen and just wasn't cleaning
up, I thought I would do a rolling restart of my region servers and see if that cleared everything
up.  The first server I killed happened to be the one that was hosting the .META. table.
Subsequently the web GUI failed.  Looking at the errors, it seems that the web GUI essentially
caches the address of the server hosting .META. and blindly tries connecting to it on every
request.  I suppose I could restart the master, but this does not seem like desirable behavior.
Shouldn't the cache be refreshed on error?  And since there is no real code behind the GUI, just
a JSP page, doesn't this mean that this behavior could be seen in other applications that use
HMaster?
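
What I would have expected is the usual invalidate-and-retry pattern: keep the cached address,
but on a connection error throw it away, look .META. up again, and retry before giving up.
A rough sketch of that idea in plain Java follows; MetaLookup and its methods are made-up
stand-ins, not the actual client or master.jsp code.

import java.io.IOException;

// Sketch of refresh-the-cache-on-error only; MetaLookup and its methods are
// hypothetical placeholders, not actual HBase code.
public class CachedMetaLocation {

  /** Hypothetical hook for finding and contacting the server hosting .META. */
  public interface MetaLookup {
    String findMetaServer() throws IOException;                 // re-resolve via -ROOT-/ZooKeeper
    String scanMeta(String serverAddress) throws IOException;   // talk to that server
  }

  private final MetaLookup lookup;
  private String cachedAddress;   // last known .META. location

  public CachedMetaLocation(MetaLookup lookup) {
    this.lookup = lookup;
  }

  public synchronized String scanMeta() throws IOException {
    if (cachedAddress == null) {
      cachedAddress = lookup.findMetaServer();
    }
    try {
      return lookup.scanMeta(cachedAddress);   // optimistic: use the cached address
    } catch (IOException e) {
      // The server holding .META. may have been restarted or the region moved;
      // drop the stale address, re-resolve it, and retry once instead of failing forever.
      cachedAddress = lookup.findMetaServer();
      return lookup.scanMeta(cachedAddress);
    }
  }
}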

Corrections welcome
Dave

