hbase-user mailing list archives

From "Ananth T. Sarathy" <ananth.t.sara...@gmail.com>
Subject Re: hbase on s3 and safemode
Date Wed, 07 Oct 2009 19:34:28 GMT
OK, so we finally got the regionserver to come up (we killed all the
processes on the box and it finally came back), but when it did, there
was no data in our tables, though the tables themselves are there.
Any ideas where the data went or how I can get it back?

Ananth T Sarathy


On Wed, Oct 7, 2009 at 2:46 PM, Andrew Purtell <apurtell@apache.org> wrote:

> One option is to add SYSV init scripts that on boot take the following
> equivalent actions:
>
>    hbase-daemon.sh start zookeeper
>
>    hbase-daemon.sh start master
>
>    hbase-daemon.sh start regionserver
>
> Set the respective init scripts to run according to host role.
>
> This presumes you have also added init scripts that start up DFS daemons
> wherever they should be, equivalents to the following:
>
>    hadoop-daemon.sh start namenode
>
>    hadoop-daemon.sh start datanode
>
>    hadoop-daemon.sh start secondarynamenode
>
> You can start everything up all at once. The respective daemons will wait
> for each other's services to become available. Ignore ZK noise in the logs
> about connection difficulties unless it persists for minutes.
>
> If you want to try out the Cloudera Hadoop distribution for 0.20, they have
> RPMs that will take care of all of this for you, and we have an RPM for that
> platform that I can provide you.
>
> Do also check your network configuration.
>
>   - Andy
>
>
>
>
> ________________________________
> From: Ananth T. Sarathy <ananth.t.sarathy@gmail.com>
> To: hbase-user@hadoop.apache.org
> Sent: Wed, October 7, 2009 11:36:22 AM
> Subject: Re: hbase on s3 and safemode
>
> is there a way to turn my regionservers on implicitly besides
> start-hbase.sh?
> Ananth T Sarathy
>
>
> On Wed, Oct 7, 2009 at 2:31 PM, Andrew Purtell <apurtell@apache.org>
> wrote:
>
> > HBase won't leave safe mode if the regionservers cannot contact the master.
> > So the question is why your regionservers cannot contact the master. If the
> > regionserver processes are confirmed running, then it's most likely a
> > firewall or AWS Security Groups config problem.
> >
> > status was a shell command added in 0.20 IIRC.
> >
> >    - Andy
> >
> >
> >
> >
> > ________________________________
> > From: Ananth T. Sarathy <ananth.t.sarathy@gmail.com>
> > To: hbase-user@hadoop.apache.org
> > Sent: Wed, October 7, 2009 11:04:03 AM
> > Subject: Re: hbase on s3 and safemode
> >
> > i suppose we need to, but for now it's kind of a pain because we need to
> > coordinate our clients.
> >
> > But the problem is: why was it working and then all of a sudden it's stuck
> > in safemode, and how can we get it back up?
> >
> > Ananth T Sarathy
> >
> >
> > On Wed, Oct 7, 2009 at 1:58 PM, stack <stack@duboce.net> wrote:
> >
> > > Can you update to 0.20.0? (Oodles of improvements).
> > > St.Ack
> > >
> > > On Wed, Oct 7, 2009 at 10:56 AM, Ananth T. Sarathy <
> > > ananth.t.sarathy@gmail.com> wrote:
> > >
> > > > I get an error
> > > >
> > > > hbase(main):001:0> status "detailed"
> > > > NoMethodError: undefined method `status' for #<Object:0x5585c0de>
> > > >        from (hbase):2
> > > > hbase(main):002:0> status "detailed"
> > > > NoMethodError: undefined method `status' for #<Object:0x5585c0de>
> > > >        from (hbase):3
> > > >
> > > >
> > > > we are running 0.19.3
> > > >
> > > > Ananth T Sarathy
> > > >
> > > >
> > > > On Wed, Oct 7, 2009 at 1:51 PM, stack <stack@duboce.net> wrote:
> > > >
> > > > > This state persists even if you shutdown hbase and zk and restart?
> > > > >
> > > > > In shell, do:
> > > > >
> > > > > > status "detailed"
> > > > >
> > > > > > At the top there is a section which says regions in transition.
> > > > > > Anything there?
> > > > >
> > > > > St.Ack
> > > > >
> > > > >
> > > > > On Wed, Oct 7, 2009 at 10:35 AM, Ananth T. Sarathy <
> > > > > ananth.t.sarathy@gmail.com> wrote:
> > > > >
> > > > > > Here is the log  since I started it...
> > > > > >
> > > > > > > Wed Oct  7 13:27:26 EDT 2009 Starting master on ip-10-244-9-171
> > > > > > > ulimit -n 1024
> > > > > > > 2009-10-07 13:27:26,404 INFO org.apache.hadoop.hbase.master.HMaster: vmName=Java HotSpot(TM) 64-Bit Server VM, vmVendor=Sun Microsystems Inc., vmVersion=14.2-b01
> > > > > > > 2009-10-07 13:27:26,405 INFO org.apache.hadoop.hbase.master.HMaster: vmInputArguments=[-Xmx2000m, -XX:+HeapDumpOnOutOfMemoryError, -Djava.io.tmpdir=/mnt/tmp, -Dhbase.log.dir=/mnt/apps/hadoop/hbase/bin/../logs, -Dhbase.log.file=hbase-root-master-ip-10-244-9-171.log, -Dhbase.home.dir=/mnt/apps/hadoop/hbase/bin/.., -Dhbase.id.str=root, -Dhbase.root.logger=INFO,DRFA, -Djava.library.path=/mnt/apps/hadoop/hbase/bin/../lib/native/Linux-amd64-64]
> > > > > > > 2009-10-07 13:27:27,525 INFO org.apache.hadoop.hbase.master.HMaster: Root region dir: s3://hbase2.s3.amazonaws.com:80/hbasedata/-ROOT-/70236052
> > > > > > > 2009-10-07 13:27:27,751 INFO org.apache.hadoop.hbase.ipc.HBaseRpcMetrics: Initializing RPC Metrics with hostName=HMaster, port=60000
> > > > > > > 2009-10-07 13:27:27,827 INFO org.apache.hadoop.hbase.master.HMaster: HMaster initialized on 10.244.9.171:60000
> > > > > > > 2009-10-07 13:27:27,829 INFO org.apache.hadoop.metrics.jvm.JvmMetrics: Initializing JVM Metrics with processName=Master, sessionId=HMaster
> > > > > > > 2009-10-07 13:27:27,830 INFO org.apache.hadoop.hbase.master.metrics.MasterMetrics: Initialized
> > > > > > > 2009-10-07 13:27:27,932 INFO org.mortbay.util.Credential: Checking Resource aliases
> > > > > > > 2009-10-07 13:27:27,936 INFO org.mortbay.http.HttpServer: Version Jetty/5.1.4
> > > > > > > 2009-10-07 13:27:27,936 INFO org.mortbay.util.Container: Started HttpContext[/logs,/logs]
> > > > > > > 2009-10-07 13:27:28,202 INFO org.mortbay.util.Container: Started org.mortbay.jetty.servlet.WebApplicationHandler@3209fa8f
> > > > > > > 2009-10-07 13:27:28,244 INFO org.mortbay.util.Container: Started WebApplicationContext[/static,/static]
> > > > > > > 2009-10-07 13:27:28,361 INFO org.mortbay.util.Container: Started org.mortbay.jetty.servlet.WebApplicationHandler@b0c0f66
> > > > > > > 2009-10-07 13:27:28,364 INFO org.mortbay.util.Container: Started WebApplicationContext[/,/]
> > > > > > > 2009-10-07 13:27:28,636 INFO org.mortbay.util.Container: Started org.mortbay.jetty.servlet.WebApplicationHandler@3c2d7440
> > > > > > > 2009-10-07 13:27:28,638 INFO org.mortbay.util.Container: Started WebApplicationContext[/api,rest]
> > > > > > > 2009-10-07 13:27:28,639 INFO org.mortbay.http.SocketListener: Started SocketListener on 0.0.0.0:60010
> > > > > > > 2009-10-07 13:27:28,639 INFO org.mortbay.util.Container: Started org.mortbay.jetty.Server@28b301f2
> > > > > > > 2009-10-07 13:27:28,640 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server Responder: starting
> > > > > > > 2009-10-07 13:27:28,641 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server listener on 60000: starting
> > > > > > > 2009-10-07 13:27:28,641 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server handler 0 on 60000: starting
> > > > > > > 2009-10-07 13:27:28,641 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server handler 1 on 60000: starting
> > > > > > > 2009-10-07 13:27:28,641 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server handler 2 on 60000: starting
> > > > > > > 2009-10-07 13:27:28,642 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server handler 3 on 60000: starting
> > > > > > > 2009-10-07 13:27:28,642 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server handler 4 on 60000: starting
> > > > > > > 2009-10-07 13:27:28,642 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server handler 5 on 60000: starting
> > > > > > > 2009-10-07 13:27:28,642 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server handler 6 on 60000: starting
> > > > > > > 2009-10-07 13:27:28,642 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server handler 7 on 60000: starting
> > > > > > > 2009-10-07 13:27:28,642 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server handler 8 on 60000: starting
> > > > > > > 2009-10-07 13:27:28,642 DEBUG org.apache.hadoop.hbase.master.HMaster: Started service threads
> > > > > > > 2009-10-07 13:27:28,643 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server handler 9 on 60000: starting
> > > > > > > 2009-10-07 13:28:09,519 INFO org.apache.hadoop.hbase.master.RegionManager: in safe mode
> > > > > > > 2009-10-07 13:28:11,542 INFO org.apache.hadoop.hbase.master.RegionManager: in safe mode
> > > > > > > 2009-10-07 13:28:13,543 INFO org.apache.hadoop.hbase.master.RegionManager: in safe mode
> > > > > > > 2009-10-07 13:28:15,545 INFO org.apache.hadoop.hbase.master.RegionManager: in safe mode
> > > > > > > 2009-10-07 13:28:17,548 INFO org.apache.hadoop.hbase.master.RegionManager: in safe mode
> > > > > > > 2009-10-07 13:28:19,555 INFO org.apache.hadoop.hbase.master.RegionManager: in safe mode
> > > > > > > 2009-10-07 13:28:27,834 INFO org.apache.hadoop.hbase.master.BaseScanner: All 0 .META. region(s) scanned
> > > > > > > 2009-10-07 13:29:27,832 INFO org.apache.hadoop.hbase.master.BaseScanner: All 0 .META. region(s) scanned
> > > > > > > 2009-10-07 13:29:37,593 INFO org.apache.hadoop.hbase.master.RegionManager: in safe mode
> > > > > > > 2009-10-07 13:30:27,834 INFO org.apache.hadoop.hbase.master.BaseScanner: All 0 .META. region(s) scanned
> > > > > > > 2009-10-07 13:31:27,836 INFO org.apache.hadoop.hbase.master.BaseScanner: All 0 .META. region(s) scanned
> > > > > > > 2009-10-07 13:32:27,838 INFO org.apache.hadoop.hbase.master.BaseScanner: All 0 .META. region(s) scanned
> > > > > > > 2009-10-07 13:33:27,840 INFO org.apache.hadoop.hbase.master.BaseScanner: All 0 .META. region(s) scanned
> > > > > >
> > > > > >
> > > > > > Ananth T Sarathy
> > > > > >
> > > > > >
> > > > > > > On Wed, Oct 7, 2009 at 1:20 PM, stack <stack@duboce.net> wrote:
> > > > > > >
> > > > > > > > That's interesting to hear.  Keep us posted.
> > > > > > > >
> > > > > > > > HBase asks the filesystem if it's in safe mode and, if it is, it
> > > > > > > > parks itself.  Here is code from master:
> > > > > > >
> > > > > > >    if (this.fs instanceof DistributedFileSystem) {
> > > > > > >      // Make sure dfs is not in safe mode
> > > > > > >      String message = "Waiting for dfs to exit safe mode...";
> > > > > > >      while (((DistributedFileSystem) fs).setSafeMode(
> > > > > > >          FSConstants.SafeModeAction.SAFEMODE_GET)) {
> > > > > > >        LOG.info(message);
> > > > > > >        try {
> > > > > > >          Thread.sleep(this.threadWakeFrequency);
> > > > > > >        } catch (InterruptedException e) {
> > > > > > >          //continue
> > > > > > >        }
> > > > > > >      }
> > > > > > >    }
> > > > > > >
> > > > > > >
> > > > > > > > Then there is hbase's notion of safemode.  It will be in safe mode
> > > > > > > > until it does the initial scan of the catalog tables.  The master
> > > > > > > > keeps a flag in zookeeper while it's in safemode so regionservers
> > > > > > > > are aware of the state:
> > > > > > >
> > > > > > >  public boolean inSafeMode() {
> > > > > > >    if (safeMode) {
> > > > > > >      if(isInitialMetaScanComplete() &&
> regionsInTransition.size()
> > > ==
> > > > 0
> > > > > &&
> > > > > > >         tellZooKeeperOutOfSafeMode()) {
> > > > > > >        master.connection.unsetRootRegionLocation();
> > > > > > >        safeMode = false;
> > > > > > >        LOG.info("exiting safe mode");
> > > > > > >      } else {
> > > > > > >        LOG.info("in safe mode");
> > > > > > >      }
> > > > > > >    }
> > > > > > >    return safeMode;
> > > > > > >  }
> > > > > > >
> > > > > > > > Have you seen the .META. and -ROOT- deploy to regionservers?  Have
> > > > > > > > you seen these regions being scanned in the master log?  (Enable
> > > > > > > > DEBUG if not already enabled.)
> > > > > > >
> > > > > > > Yours,
> > > > > > > ST.Ack
> > > > > > >
> > > > > > >
> > > > > > > On Wed, Oct 7, 2009 at 10:06 AM, Ananth T. Sarathy <
> > > > > > > ananth.t.sarathy@gmail.com> wrote:
> > > > > > >
> > > > > > > > > We have been running HBase on an S3 filesystem. It's the HBase
> > > > > > > > > regionserver, not HDFS, since we are using S3.  We haven't felt
> > > > > > > > > like it's been too slow, though the amount of data we are pushing
> > > > > > > > > isn't large enough to notice yet.
> > > > > > > > Ananth T Sarathy
> > > > > > > >
> > > > > > > >
> > > > > > > > > On Wed, Oct 7, 2009 at 12:47 PM, stack <stack@duboce.net> wrote:
> > > > > > > > >
> > > > > > > > > > HBase or HDFS is in safe mode.  My guess is that it's the
> > > > > > > > > > latter.  Can you figure out from the HDFS logs why it won't
> > > > > > > > > > leave safe mode?  Usually under-replication or the loss of a
> > > > > > > > > > large swath of the cluster will flip on the safe-mode switch.
> > > > > > > > > >
> > > > > > > > > > Are you trying to run HBase on an S3 filesystem?  An HBasista
> > > > > > > > > > tried it in the past and, FYI, found it insufferably slow.  Let
> > > > > > > > > > us know how it goes for you.
> > > > > > > > >
> > > > > > > > > Thanks,
> > > > > > > > > St.Ack
> > > > > > > > >
> > > > > > > > > > On Wed, Oct 7, 2009 at 9:33 AM, Ananth T. Sarathy <
> > > > > > > > > > ananth.t.sarathy@gmail.com> wrote:
> > > > > > > > > >
> > > > > > > > > > > My regionserver has been stuck in safemode. What can I do
> > > > > > > > > > > to get it out of safemode?
> > > > > > > > > >
> > > > > > > > > > Ananth T Sarathy
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> >
>
>
>
>
>
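A note on the firewall / AWS Security Groups suggestion earlier in the thread: one quick way to test it is to try opening a TCP connection from a regionserver host to the master's RPC port (60000 in the master log quoted above). Below is a minimal sketch in plain Java. It is not part of HBase; the class name MasterReachability, the method canConnect, and the 3-second timeout are made up for illustration. A successful connect only shows that the port is reachable, not that HBase itself is healthy.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Hypothetical helper, not an HBase class: checks plain TCP reachability only.
class MasterReachability {

    /** Returns true if a TCP connection to host:port succeeds within timeoutMillis. */
    static boolean canConnect(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            // connect() throws an IOException on refusal, timeout, or unknown host
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Defaults are assumptions; pass the real master host and port as arguments.
        String masterHost = args.length > 0 ? args[0] : "localhost";
        int masterPort = args.length > 1 ? Integer.parseInt(args[1]) : 60000;
        boolean ok = canConnect(masterHost, masterPort, 3000);
        System.out.println(masterHost + ":" + masterPort
                + (ok ? " is reachable" : " is NOT reachable"));
    }
}
```

Compile it and run it on the regionserver box with the master's address and port as arguments, for example the 10.244.9.171 and 60000 shown in the log. If it reports the port as not reachable while the master process is confirmed running, that points at the network or security group configuration rather than at HBase.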
