hbase-user mailing list archives

From Bill Graham <billgra...@gmail.com>
Subject Re: HBase Not Starting after improper shutdown
Date Mon, 23 May 2011 17:10:35 GMT
Is there anything meaningful in the RS logs? I've seen situations like this
where an RS fails to start because of problems reading the WAL. If that is
the case, the log should name the problematic WAL. In my experience that file
is zero-length, so I delete it from HDFS and things start up.
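
To make the zero-length WAL check concrete, here is a minimal sketch that scans the text of a `hadoop fs -ls` style listing and reports regular files whose size column is 0. It assumes the usual eight-column listing layout (permissions, replication, owner, group, size, date, time, path). The `/hbase/.logs` location, the hostnames, and the file names in the sample are hypothetical and only illustrate the shape of the input. The sketch only reports candidates; whether to remove a file (for example with `hadoop fs -rm`) remains a manual decision after checking the RS log.

```python
def zero_length_files(ls_output):
    """Return paths of zero-byte regular files in `hadoop fs -ls` style output."""
    paths = []
    for line in ls_output.splitlines():
        # Expected columns: perms, replication, owner, group, size, date, time, path.
        # maxsplit=7 keeps a path containing spaces in one piece.
        fields = line.split(None, 7)
        if len(fields) != 8:
            continue  # skips "Found N items" headers and blank lines
        perms, _repl, _owner, _group, size, _date, _time, path = fields
        if perms.startswith("d"):
            continue  # directories also report size 0, so ignore them
        if size == "0":
            paths.append(path)
    return paths


# Hypothetical listing, for illustration only (not taken from this thread).
SAMPLE_LISTING = """\
Found 3 items
drwxr-xr-x   - hbase supergroup          0 2011-05-23 10:40 /hbase/.logs/rs1.example.com,60020,1306150000000
-rw-r--r--   2 hbase supergroup          0 2011-05-23 10:41 /hbase/.logs/rs1.example.com,60020,1306150000000/wal.0001
-rw-r--r--   2 hbase supergroup   33022654 2011-05-23 10:39 /hbase/.logs/rs1.example.com,60020,1306150000000/wal.0002
"""

if __name__ == "__main__":
    for candidate in zero_length_files(SAMPLE_LISTING):
        print("zero-length candidate:", candidate)
```

In practice the listing text would come from a recursive listing of the WAL directory saved to a file or piped in, and each reported path should be cross-checked against the WAL named in the RS log before anything is deleted.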


On Mon, May 23, 2011 at 9:16 AM, Himanish Kushary <himanish@gmail.com> wrote:

> Both the Master and the hbck command print:
>
> org.apache.hadoop.hbase.NotServingRegionException:
> org.apache.hadoop.hbase.NotServingRegionException: Region is not online:
> -ROOT-,,0
>
> After the master thread exits due to the Heap Space error the hbck command
> throws:
>
> org.apache.hadoop.hbase.MasterNotRunningException
>
> Is there any way to fix this kind of issue? We are keeping the datanodes up
> to see whether the under-replicated blocks can be recovered. Does improper
> shutdown of the hadoop/hbase services cause this kind of issue? What
> happens in a disaster recovery situation, and how are those situations
> handled?
>
> Thanks
>
>
> On Mon, May 23, 2011 at 11:36 AM, Stack <stack@duboce.net> wrote:
>
> > What does hbase hbck say?  (http://hbase.apache.org/book.html#hbck).
> >
> > What does the master log have in it?  Anything of interest?
> >
> > St.Ack
> >
> > On Mon, May 23, 2011 at 7:53 AM, Himanish Kushary <himanish@gmail.com>
> > wrote:
> > > Pressed the send button too soon...
> > >
> > > Also here is the output from hadoop fsck
> > >
> > > *Status: HEALTHY*
> > > * Total size: 37678848280 B*
> > > * Total dirs: 941*
> > > * Total files: 902 (Files currently being written: 1)*
> > > * Total blocks (validated): 1141 (avg. block size 33022654 B) (Total open file blocks (not validated): 1)*
> > > * Minimally replicated blocks: 1141 (100.0 %)*
> > > * Over-replicated blocks: 0 (0.0 %)*
> > > * Under-replicated blocks: 906 (79.40403 %)*
> > > * Mis-replicated blocks: 0 (0.0 %)*
> > > * Default replication factor: 2*
> > > * Average block replication: 2.0*
> > > * Corrupt blocks: 0*
> > > * Missing replicas: 1886 (82.646805 %)*
> > > * Number of data-nodes: 2*
> > > * Number of racks: 1*
> > > *FSCK ended at Mon May 23 10:51:13 EDT 2011 in 257 milliseconds*
> > >
> > > *The filesystem under path '/' is HEALTHY*
> > >
> > >
> > > Could anybody please help with how to recover from this scenario?
> > >
> > > Thanks
> > >
> > >
> > > On Mon, May 23, 2011 at 10:50 AM, Himanish Kushary <himanish@gmail.com>
> > > wrote:
> > >
> > >> Hi,
> > >>
> > >> Our hbase/hadoop server machines were shut down without bringing the
> > >> hadoop and hbase services down properly. Now when we try to bring up
> > >> hbase we get the following error in the master log:
> > >>
> > >> org.apache.hadoop.hbase.NotServingRegionException: Region is not online:
> > >> -ROOT-,,0
> > >>
> > >> Hadoop services (namenode, jobtracker, datanode etc.) have come up
> > >> properly and we are able to see the files in HDFS. But the HBase Master
> > >> keeps throwing this exception and then finally throws a Java Heap Space
> > >> error.
> > >>
> > >> Note: We have two datanodes, replication set to 2, and around 900 blocks
> > >> are shown as under-replicated.
> > >>
> > >> ---------------------------------
> > >> Thanks & Regards
> > >> Himanish
> > >>
> > >
> > >
> > >
> > > --
> > > Thanks & Regards
> > > Himanish
> > >
> >
>
>
>
> --
> Thanks & Regards
> Himanish
>
