hbase-user mailing list archives

From Andrew Purtell <apurt...@apache.org>
Subject Re: Starting HBase in fully distributed mode...
Date Mon, 14 Dec 2009 19:17:53 GMT
Developing cluster admin tools that use ZooKeeper to locate all
slaves (and the master), instead of a static user-generated file, is an
interesting idea, especially for dynamic clusters up on EC2.

   - Andy




________________________________
From: Andrew Purtell <apurtell@apache.org>
To: hbase-user@hadoop.apache.org
Sent: Mon, December 14, 2009 10:32:44 AM
Subject: Re: Starting HBase in fully distributed mode...

Yes, the 'slaves' and 'regionservers' files cannot be generated in advance,
because the identities of the instances are unknown until they start. A user
script run after the cluster start script could handle this task; an admin
would run it once EC2 has started all slave instances, as observed via
elasticfox or similar. I have no plans to write such a script, however.
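
A minimal sketch of what such a user script might look like, assuming the admin has already collected the slave hostnames into a text file, one per line. How the hostnames are gathered is not shown, and the function name and its arguments are hypothetical, not part of the HBase or Hadoop EC2 scripts.

```shell
#!/bin/sh
# write_host_lists HOSTS_FILE HADOOP_CONF_DIR HBASE_CONF_DIR
# (hypothetical helper, not part of the EC2 scripts)
#
# HOSTS_FILE holds one slave hostname per line. The same list is written
# to Hadoop's 'slaves' file and HBase's 'regionservers' file, on the
# assumption that every slave runs both a DataNode and a region server.
write_host_lists() {
  hosts_file="$1"
  hadoop_conf="$2"
  hbase_conf="$3"

  # Drop blank lines, then sort and remove duplicate hostnames.
  sed '/^[[:space:]]*$/d' "$hosts_file" | sort -u > "$hadoop_conf/slaves"

  # HBase gets an identical copy of the list.
  cp "$hadoop_conf/slaves" "$hbase_conf/regionservers"
}
```

It would be run on the master once all slaves are up, for example `write_host_lists slave-hosts.txt "$HADOOP_HOME/conf" "$HBASE_HOME/conf"`, where the file name and both directories are placeholders for your own layout.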

HBase and zookeeper jars do not need to be copied as they are put on the
Hadoop classpath by hbase-ec2-init-remote.sh. 

   - Andy





________________________________
From: Something Something <mailinglists19@gmail.com>
To: hbase-user@hadoop.apache.org
Sent: Mon, December 14, 2009 9:38:11 AM
Subject: Re: Starting HBase in fully distributed mode...

Andrew,

One thing I noticed with the EC2 scripts is that they don't update the 'slaves'
file for Hadoop or the 'regionservers' file for HBase.  As a result,
when I stop HBase & Hadoop, the instances on the slaves don't stop.  Not
sure if others have experienced this.

Also, I believe this has been pointed out before, but it would be nice if the
HBase jar & ZooKeeper jar got copied automatically to the Hadoop lib directory
because they are needed by MapReduce.

Thanks.


On Fri, Dec 11, 2009 at 10:10 AM, Andrew Purtell <apurtell@apache.org> wrote:

> > Problem is - I don't know what HBase configurations to use
> > in my MapReduce program to point to HBase on another EC2 machine.
>
> 1) Copy the hbase-site.xml from the HBase cluster master
>   /usr/local/hbase*/conf/hbase-site.xml
> and put it on the classpath on your Hadoop cluster. Make sure you
> also have an hbase-default.xml on the classpath on the Hadoop
> cluster.
>
> 2) Make sure your Hadoop cluster instances can communicate with
> the HBase zookeeper, master, and slave security groups. Typically
> this means you have to execute a number of ec2-authorize commands
> of the form:
>
>   ec2-authorize <group-1> -o <group-2> -u <account-id>
>   ec2-authorize <group-2> -o <group-1> -u <account-id>
>
> where group-1 ranges over all of your Hadoop cluster's security
> groups, and group-2 ranges over all of your HBase cluster's
> security groups. It's annoying, but you only have to do it once
> and the changes will persist in your security group ACLs.
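
The pairing described above can be sketched as a nested loop. This is only an illustration: the function name is hypothetical, and it prints the ec2-authorize commands in the form shown above rather than running them, so the list can be reviewed first.

```shell
#!/bin/sh
# print_authorize_cmds ACCOUNT_ID "HADOOP GROUPS" "HBASE GROUPS"
# (hypothetical helper; group lists are space-separated)
#
# Prints one pair of ec2-authorize commands for every combination of a
# Hadoop security group and an HBase security group, so that each side
# is allowed to reach the other. Nothing is executed.
print_authorize_cmds() {
  account_id="$1"
  hadoop_groups="$2"
  hbase_groups="$3"

  # Unquoted on purpose: split the lists on whitespace.
  for g1 in $hadoop_groups; do
    for g2 in $hbase_groups; do
      echo "ec2-authorize $g1 -o $g2 -u $account_id"
      echo "ec2-authorize $g2 -o $g1 -u $account_id"
    done
  done
}
```

For an HBase cluster named "testcluster" the scripts create the groups testcluster-master, testcluster and testcluster-zookeeper, so a call might look like `print_authorize_cmds <account-id> "<hadoop-groups>" "testcluster-master testcluster testcluster-zookeeper"`, with the Hadoop group names and account id filled in from your own setup. Once the output looks right, it can be piped to `sh`.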
>
>   - Andy
>
>
>
> ________________________________
> From: Something Something <mailinglists19@gmail.com>
> To: hbase-user@hadoop.apache.org
> Sent: Fri, December 11, 2009 9:50:30 AM
> Subject: Re: Starting HBase in fully distributed mode...
>
> 1)  Yes, I used the same cluster name.  Okay, let me try again tonight, but
> in any case, I was able to ssh to Master and confirm setup.
> 2)  I tried the Hadoop EC2 scripts last night.  I keep getting 'Waiting for
> instance to start' and it seems to get stuck there.  I also keep getting
> several messages like this...
> .Required option '-K, --private-key KEY' missing (-h for usage)
>
> Seems like I haven't set *something* correctly.  Will look into this
> tonight as well.
>
> 3)  Not sure what you mean here.  Yes, my Hadoop machines will be on EC2 as
> well.
>
> Here's my plan for the weekend:
>
> Start Hadoop instances on 10 EC2 machines.
> Start HBase on 5 EC2 machines along with Zookeeper on 5 machines.
> Start a  MapReduce job on Hadoop (master) instance.
>
> Problem is - I don't know what HBase configurations to use in my MapReduce
> program to point to HBase on another EC2 machine.  Makes sense?
>
>
>
> On Fri, Dec 11, 2009 at 12:06 AM, Andrew Purtell <apurtell@apache.org> wrote:
>
> > > ./bin/hbase-ec2 login testcluster
> > >
> > > Use this to login.  I tried running this from my local machine, but nothing
> > > *noteworthy* happened.
> >
> > Did you replace "testcluster" with the name you used when launching your
> > cluster, assuming they are different? The scripts address clusters by the
> > labels you give them when launching them. E.g.
> >
> >   ./bin/hbase-ec2 launch foo 3 3
> >
> > launches a cluster named "foo", and
> >
> >   ./bin/hbase-ec2 login foo
> >
> > opens a SSH shell on the master of cluster "foo".
> >
> > > Did you also create similar scripts for Hadoop?
> >
> > Hadoop has its own set of EC2 scripts. I used those as the basis for ours.
> > You can't use the HBase and Hadoop EC2 scripts together, however.
> >
> > > Later I want to start a MapReduce job on my Hadoop machines that will
> > > access this HBase cluster.  How would I do that?
> >
> > Are your Hadoop machines up on EC2 also?
> >
> > Running mapreduce jobs on the HBase cluster itself is a work in progress.
> >
> >   - Andy
> >
> >
> >
> > ________________________________
> > From: Something Something <mailinglists19@gmail.com>
> > To: hbase-user@hadoop.apache.org
> > Sent: Thu, December 10, 2009 8:21:10 PM
> > Subject: Re: Starting HBase in fully distributed mode...
> >
> > Andy,
> >
> > Thanks for the tips.  It's all working now.  I was using a different KeyPair
> > for EC2_ROOT_SSH_KEY.  Once I changed this to use the root.pem it started
> > working.  I was able to ssh to the 'master' instance and get into hbase
> > shell etc.  This script is VERY helpful!  Thank you so much.
> >
> > A few questions...
> >
> > 1)  The README.txt file says this..
> >
> > ./bin/hbase-ec2 login testcluster
> >
> > Use this to login.  I tried running this from my local machine, but nothing
> > *noteworthy* happened.  I wasn't able to get into the hbase shell from my
> > local machine.  Anyway, this is not a big deal for me.
> >
> > 2)  Did you also create similar scripts for Hadoop?  (I guess I will look
> > into the trunk!).
> >
> > 3)  Say I use your script to start HBase on a few machines, and start Hadoop
> > on some other machines.  Later I want to start a MapReduce job on my Hadoop
> > machines that will access this HBase cluster.  How would I do that?  What
> > HBase configurations can I use?  So far my MapReduce job always accesses
> > HBase on the same machine.
> >
> > Thanks once again for your help.
> >
> >
> >
> > On Thu, Dec 10, 2009 at 5:30 PM, Vaibhav Puranik <vpuranik@gmail.com> wrote:
> >
> > > We have HBase running on EC2 with ZooKeeper started within HBase. We have
> > > had it up since July 2009. No problems so far on the ZooKeeper front.
> > >
> > > Regards,
> > > Vaibhav Puranik
> > > Gumgum
> > >
> > > On Thu, Dec 10, 2009 at 8:12 AM, Something Something <
> > > mailinglists19@gmail.com> wrote:
> > >
> > > > Finally, I was able to get HBase running on EC2 in fully distributed mode.
> > > > I started the ZooKeeper quorum myself and pointed HBase to it.  I was able to
> > > > create tables using the HBase shell, run a MapReduce job that writes to these
> > > > tables, and run queries against these tables.  I used the HBase shell from all
> > > > 3 machines, and they all see the same data, confirming that the instances are
> > > > indeed working together.
> > > >
> > > > It seems like under EC2, starting ZooKeeper within HBase doesn't work, but I
> > > > could be wrong.
> > > >
> > > > In any case, Andrew, I would like to get your scripts working in my
> > > > environment because without your scripts I don't know how I would grow my
> > > > cluster from 3 instances to, say, 30 :)
> > > >
> > > > Thank you so much everyone for your help and for sticking with me.
> > > >
> > > >
> > > > On Wed, Dec 9, 2009 at 8:25 PM, Something Something <
> > > > mailinglists19@gmail.com> wrote:
> > > >
> > > > > When I run:
> > > > >
> > > > > hbase-ec2 launch-cluster testcluster 3 3
> > > > >
> > > > > I keep getting 'lost connection' messages (see below).  Tried this 4
> > > > > times.  Please help.  Thanks.
> > > > >
> > > > >
> > > > > -------------------------------------------------------------
> > > > >
> > > > > Creating/checking security groups
> > > > > Security group testcluster-master exists, ok
> > > > > Security group testcluster exists, ok
> > > > > Security group testcluster-zookeeper exists, ok
> > > > > Starting ZooKeeper quorum ensemble.
> > > > > Starting an AMI with ID ami-b0cb29d9 (arch i386) in group testcluster-zookeeper
> > > > > Waiting for instance i-9db6f4f5 to start: .................. Started ZooKeeper instance i-9db6f4f5 as domU-12-31-38-01-7D-D1.compute-1.internal
> > > > >     Public DNS name is ec2-174-129-148-5.compute-1.amazonaws.com.
> > > > > Starting an AMI with ID ami-b0cb29d9 (arch i386) in group testcluster-zookeeper
> > > > > Waiting for instance i-2db7f545 to start: ................. Started ZooKeeper instance i-2db7f545 as domU-12-31-38-01-7D-43.compute-1.internal
> > > > >     Public DNS name is ec2-174-129-157-122.compute-1.amazonaws.com.
> > > > > Starting an AMI with ID ami-b0cb29d9 (arch i386) in group testcluster-zookeeper
> > > > > Waiting for instance i-afb7f5c7 to start: ...................... Started ZooKeeper instance i-afb7f5c7 as domU-12-31-38-01-78-F3.compute-1.internal
> > > > >     Public DNS name is ec2-174-129-179-14.compute-1.amazonaws.com.
> > > > > ZooKeeper quorum is domU-12-31-38-01-7D-D1.compute-1.internal,domU-12-31-38-01-7D-43.compute-1.internal,domU-12-31-38-01-78-F3.compute-1.internal.
> > > > > Initializing the ZooKeeper quorum ensemble.
> > > > >     ec2-174-129-148-5.compute-1.amazonaws.com
> > > > > lost connection
> > > > >     ec2-174-129-157-122.compute-1.amazonaws.com
> > > > > lost connection
> > > > >     ec2-174-129-179-14.compute-1.amazonaws.com
> > > > > lost connection
> > > > >
> > > > >
> > > > >
> > > > >
> > > > > On Wed, Dec 9, 2009 at 12:46 AM, Seth Ladd <sethladd@gmail.com> wrote:
> > > > >
> > > > > >> > Sounds like others have used Andrew's script successfully.  The only
> > > > > >> > difference seems to be that it starts a *dedicated* ZooKeeper quorum.
> > > > > >> > Should have listened to Mark when he suggested that 4 days ago :)
> > > > > >> >
> > > > > >> > Anyway, I will try Andrew's script tomorrow.
> > > > > >>
> > > > > >> I can vouch that the scripts in svn trunk work.  Thanks to Andrew for
> > > > > >> his help!  I was able to start a 3 node Zookeeper and 5 node HBase
> > > > > >> cluster on EC2 from just the scripts.
> > > > >>
> > > > >> Seth
> > > > >>
> > > > >
> > > > >
> > > >
> > >
> >
> >
> >
> >
> >
>
>
>
>
>


      