hbase-user mailing list archives

From Michael Stack <st...@powerset.com>
Subject RE: Regions Offline
Date Fri, 18 Apr 2008 15:23:06 GMT
(Sorry if this is a duplicate -- I sent it from another account last night, but
it doesn't seem to have shown up in my mailbox listing; the copy below has a
few additions.)

>
>> Hi
>>
>>     My system is quite simple:
>>     - two (one quad core, one dual core) servers with 2GB mem and 150 GB
>> allocated to dfs.

What do you have for the replication level in your HDFS?  The default is 3.
I'm not sure what happens with only two servers in the mix.  Check the
namenode and datanode logs.
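
To make the replication concern concrete, here is a minimal sketch. It is not
HDFS code, and the function name is made up for illustration. It assumes HDFS
places at most one replica of a given block on any one datanode, so a
replication level above the number of live datanodes leaves every block
short of its target.

```python
def replica_shortfall(replication: int, live_datanodes: int) -> int:
    """Return how many replicas per block cannot be placed.

    Assumption: at most one replica of a block lives on each datanode.
    """
    if replication < 1:
        raise ValueError("replication must be at least 1")
    if live_datanodes < 0:
        raise ValueError("live_datanodes cannot be negative")
    return max(0, replication - live_datanodes)


# Default replication of 3 on a two-server cluster.
print(replica_shortfall(3, 2))  # prints 1
```

A nonzero result only says the blocks would be under-replicated; whether that
is behind the offline regions is something the namenode and datanode logs
would have to confirm.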

>>     - I use it to crawl multiple supports but mainly filesystems and
>> save the results onto hbase (not too many files < 100.000 but rows can get
>> easily to 30 MB each)
>>
>>     I constantly getting NullPointerExceptions (on the client caused by
>> NotServingRegionExceptions on regionserver) when creating tables or
>> RegionOfflineExceptions when doing puts or sometimes just time outs.

Tell us more about these.  Paste in the stacktraces.  Please enable DEBUG
logging (See FAQ for how).

>>     When started with hbase I developed in 'local' mode, I then migrated
>> to a small dev 2 servers cluster (weaker than production is now) where I
>> tested the functionality, and it worked fine but, my bad, due to pressing
>> scheduling I didn't do any real load tests, so the system is now
>> continuously going under in production. I've only been able to do a full
>> crawl by resetting the cluster to one node and putting it in 'local' mode.
>>
>>     My question is what can cause regions to be offline in
>> regionservers?

As Bryan said, it shouldn't be happening.  (There was a case a while back
where it could happen, but that was fixed in 0.1.0 -- perhaps there is
another path that provokes this condition.)

>>
>>     I ask so that I can investigate the matter further but having a
>> starting point.

Get your table fully online first: if running 'enable table' in HQL doesn't
do it for you, run the little method in MetaUtils that onlines any offline
regions.  Then enable DEBUG and let it run.

To verify your tables are online, run a select that goes over the full
table.  I believe just selecting on a nonexistent column will make the client
look for that column in all regions, so the select will fail if any region is
offline; otherwise run 'select count(SOME_COLUMN) from YOUR_TABLE;'.

If a region goes offline again, send us the DEBUG logs from the regionservers
and the master.

One other thing to consider is file handles.  Are you running with the usual
default limit of 1024 open files?  If so, things will fail in odd ways once
you have upward of tens of regions.
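
One way to check the limit is a small script like the sketch below, which
uses Python's standard `resource` module (Unix only).  The helper names and
the 1024 threshold are illustrative choices, not anything HBase defines.  It
reports the limit for the process that runs it, so it would need to be run as
the same user, and from the same kind of shell, that starts the regionserver
and datanode.

```python
import resource


def open_file_limit():
    """Return the (soft, hard) open-file-descriptor limits for this process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def limit_looks_low(soft, threshold=1024):
    """Return True if the soft limit is at or below the threshold.

    An unlimited soft limit is never treated as low.
    """
    if soft == resource.RLIM_INFINITY:
        return False
    return soft <= threshold


soft, hard = open_file_limit()
print("soft limit:", soft, "hard limit:", hard)
if limit_looks_low(soft):
    print("Soft limit is at or below 1024; consider raising it.")
```

On most Unix systems `ulimit -n` in the same shell reports the same soft
limit.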

St.Ack

