hbase-user mailing list archives

From Stack <st...@duboce.net>
Subject Re: region servers dying - flush request - YCSB
Date Mon, 07 Mar 2011 16:08:26 GMT
On Mon, Mar 7, 2011 at 5:43 AM, M.Deniz OKTAR <deniz.oktar@gmail.com> wrote:
> We have a 5 node cluster, 4 of them being region servers. I am running a
> custom workload with YCSB and when the data is loading (heavy insert) at
> least one of the region servers are dying after about 600000 operations.


Tell us the character of your 'custom workload' please.


> There are no abnormalities in the logs as far as I can see, the only common
> point is that all of them(in different trials, different region servers
> fail) request for a flush as the last logs, given below. .out files are
> empty. I am looking at the /var/log/hbase folder for logs. Running sun java
> 6 latest version. I couldn't find any logs that indicates a problem with
> java. Tried the tests with openjdk and had the same results.
>

It's strange that a flush is the last thing in your log.  Is the
process dead?  Are we exiting without a note in the logs?  That's
unusual.  We usually scream loudly when dying.

> I have set ulimits(50000) and xceivers(20000) for multiple users and certain
> that they are correct.

The first line in an hbase log prints out the ulimit it sees.  You
might check that the hbase process really is picking up your ulimit
setting.
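
Besides the log line, on Linux you can read the limits of the running
process directly.  A minimal sketch, assuming a Linux /proc filesystem
and the usual column layout of /proc/<pid>/limits; the function names
here are made up for illustration and are not part of hbase:

```python
from pathlib import Path

LABEL = "Max open files"


def max_open_files(limits_text):
    """Return (soft, hard) open-file limits from /proc/<pid>/limits text.

    Values are returned as strings because a limit may be 'unlimited'.
    Returns None if the row is not present.
    """
    for line in limits_text.splitlines():
        if line.startswith(LABEL):
            # Row layout assumed: label, soft limit, hard limit, units.
            soft, hard = line[len(LABEL):].split()[:2]
            return soft, hard
    return None


def limits_for_pid(pid):
    """Read the limits of a live process (Linux only)."""
    return max_open_files(Path(f"/proc/{pid}/limits").read_text())
```

Call limits_for_pid() with the region server's pid (taken from ps, for
example) and compare the result with the 50000 you configured.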


> Also in the kernel logs, there are no apparent problems.
>

(The mystery compounds)

> 2011-03-07 15:07:58,301 DEBUG
> org.apache.hadoop.hbase.regionserver.CompactSplitThread: Compaction
> requested for
> usertable,user1030079237,1299502934627.257739740f58da96d5c5ef51a7d3efc3.
> because regionserver60020.cacheFlusher; priority=3, compaction queue size=18
> 2011-03-07 15:07:58,301 DEBUG org.apache.hadoop.hbase.regionserver.HRegion:
> NOT flushing memstore for region
> usertable,user1601881548,1299502135191.f8efb9aa0922fa8a6a53fc49b8155ebc.,
> flushing=false, writesEnabled=false
> 2011-03-07 15:07:58,301 DEBUG org.apache.hadoop.hbase.regionserver.HRegion:
> Started memstore flush for
> usertable,user1662209069,1299502135191.9fa929e6fb439843cffb604dea3f88f6.,
> current region memstore size 68.6m
> 2011-03-07 15:07:58,310 DEBUG org.apache.hadoop.hbase.regionserver.HRegion:
> Flush requested on
> usertable,user1601881548,1299502135191.f8efb9aa0922fa8a6a53fc49b8155ebc.
> -end of log file-
> ---
>

Nothing more?

Thanks,
St.Ack
