hbase-user mailing list archives

From Tatsuya Kawano <tatsuy...@snowcocoa.info>
Subject Re: HBase 0.20.1 Distributed Install Problems
Date Wed, 11 Nov 2009 06:06:12 GMT
Hi Jeff,

So you've got more region servers running than you configured in
conf/regionservers? sha-cs-02 is missing and there are four region
servers named localhost:60020? It's very strange.
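
Region servers showing up as localhost can mean that a machine's own
hostname resolves to a loopback address, which is what J-D suggests
checking below. Here is a rough sketch of a check you could run on each
machine. The function names is_loopback and check_hostname are made up
for illustration, and it assumes getent is available (Linux with glibc).

```shell
#!/usr/bin/env bash
# Sketch only: is_loopback and check_hostname are hypothetical helpers.

# Succeeds (exit 0) if the given address is 127.x.x.x or the IPv6 loopback.
is_loopback() {
  case "$1" in
    127.*|::1) return 0 ;;
    *)         return 1 ;;
  esac
}

# Prints how this machine's hostname resolves and fails on loopback.
# Assumes getent exists; on other systems, inspect /etc/hosts by hand.
check_hostname() {
  local name addr
  name="$(hostname)"
  addr="$(getent hosts "${name}" | awk '{print $1; exit}')"
  if [ -z "${addr}" ]; then
    echo "${name}: does not resolve"
    return 1
  fi
  if is_loopback "${addr}"; then
    echo "${name} -> ${addr} (loopback)"
    return 1
  fi
  echo "${name} -> ${addr}"
}
```

Run check_hostname on every box. Any machine that reports a loopback
address is a candidate for the localhost:60020 entries.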

Before you do anything else, you should clean this up: stop all running
region servers, including the localhost:60020 ones; otherwise no one can
figure out what's really going on on your servers. I don't think
"stop-all.sh" can do this, so you should find the machines that are
running region servers, including the extra ones, and stop them
manually.

So, ssh to one of your servers, then type:
${HBASE_HOME}/bin/hbase-daemon.sh stop regionserver

Do this on sha-cs-01, sha-cs-02, sha-cs-03, sha-cs-05, and sha-cs-06.
Then run "status 'simple'" again from the hbase shell to see how many
servers are left. Maybe those extra region servers are running on your
and your coworkers' PCs (?), so you'd better run the above command on
all the machines where you have installed HBase.
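
If you have passwordless ssh set up, the manual stop could be scripted
roughly like this. It is only a sketch: the HOSTS list is copied from
your regionservers file, DRY_RUN and stop_regionservers are hypothetical
names, and it assumes HBASE_HOME is the same path on every machine (the
/opt/hbase default is just a placeholder).

```shell
#!/usr/bin/env bash
# Sketch only: HOSTS, DRY_RUN and stop_regionservers are illustrative names.

HOSTS="sha-cs-01 sha-cs-02 sha-cs-03 sha-cs-05 sha-cs-06"
HBASE_HOME="${HBASE_HOME:-/opt/hbase}"   # placeholder install path

# Stops the region server daemon on each host in HOSTS.
# With DRY_RUN=1 (the default here) it only prints the ssh commands.
stop_regionservers() {
  local host cmd
  for host in $HOSTS; do
    cmd="${HBASE_HOME}/bin/hbase-daemon.sh stop regionserver"
    if [ "${DRY_RUN:-1}" = "1" ]; then
      echo "ssh ${host} ${cmd}"
    else
      ssh "${host}" "${cmd}" || echo "stop failed on ${host}" >&2
    fi
  done
}
```

Call stop_regionservers first to review the commands, then
"DRY_RUN=0 stop_regionservers" to actually run them. Add any extra
machines (PCs and so on) to HOSTS.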

Once "status 'simple'" shows no region servers, run "stop-hbase.sh" to
stop the HBase master and the ZooKeeper servers. Then try
"start-hbase.sh" again to see if it works.
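
After the restart, you could check whether any region server registered
as localhost again with something like the following. It is a sketch;
count_localhost_servers is a hypothetical helper that reads the
"status 'simple'" output on stdin.

```shell
#!/usr/bin/env bash
# Sketch only: count_localhost_servers is a hypothetical helper.

# Reads "status 'simple'" output on stdin and prints how many server
# lines start with localhost:<port>. Prints 0 when there are none.
count_localhost_servers() {
  grep -c '^[[:space:]]*localhost:[0-9][0-9]*[[:space:]]' || true
}
```

For example, assuming the hbase shell in your version accepts commands
on stdin:
echo "status 'simple'" | ${HBASE_HOME}/bin/hbase shell | count_localhost_servers
A result of 0 means no server registered as localhost.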

-- 
Tatsuya Kawano (Mr.)
Tokyo, Japan




On Wed, Nov 11, 2009 at 2:29 PM, Jean-Daniel Cryans <jdcryans@apache.org> wrote:
> Check your OS networking configuration; make sure your hostnames don't
> resolve to localhost, 127.0.0.1, or 127.0.1.1
>
> Also you said you can't run the list, what does it do then?
>
> J-D
>
> On Tue, Nov 10, 2009 at 9:23 PM, Jeff Zhang <zjffdu@gmail.com> wrote:
>> *I configured the region servers in the file regionservers as follows:*
>>
>> sha-cs-01
>> sha-cs-02
>> sha-cs-03
>> sha-cs-05
>> sha-cs-06
>>
>> *And I also configured ZooKeeper in the file hbase-site.xml as follows:*
>>
>> <configuration>
>>  <property>
>>    <name>hbase.cluster.distributed</name>
>>    <value>true</value>
>>    <description>The mode the cluster will be in. Possible values are
>>      false: standalone and pseudo-distributed setups with managed Zookeeper
>>      true: fully-distributed with unmanaged Zookeeper Quorum (see hbase-env.sh)
>>    </description>
>>  </property>
>>  <property>
>>      <name>hbase.zookeeper.property.clientPort</name>
>>      <value>2222</value>
>>      <description>Property from ZooKeeper's config zoo.cfg.
>>      The port at which the clients will connect.
>>      </description>
>>    </property>
>>  <property>
>>      <name>hbase.zookeeper.quorum</name>
>>      <value>*sha-cs-01,sha-cs-02,sha-cs-03,sha-cs-04,sha-cs-06*</value>
>>      <description>Comma separated list of servers in the ZooKeeper Quorum.
>>      For example, "host1.mydomain.com,host2.mydomain.com,host3.mydomain.com".
>>      By default this is set to localhost for local and pseudo-distributed modes
>>      of operation. For a fully-distributed setup, this should be set to a full
>>      list of ZooKeeper quorum servers. If HBASE_MANAGES_ZK is set in hbase-env.sh
>>      this is the list of servers which we will start/stop ZooKeeper on.
>>      </description>
>>  </property>
>>  <property>
>>    <name>hbase.rootdir</name>
>>    <value>hdfs://sha-cs-04:9000/hbase</value>
>>    <description>The directory shared by region servers.
>>    </description>
>>  </property>
>>
>> </configuration>
>>
>>
>> I still do not understand what's wrong with my configuration ?
>>
>>
>> Jeff Zhang
>>
>>
>>
>> On Wed, Nov 11, 2009 at 12:56 PM, Jean-Daniel Cryans <jdcryans@apache.org> wrote:
>>
>>> Please read my answer to Chris (wrote about 10-15 minutes ago), you
>>> also seem to confuse regionservers and zookeeper quorum members.
>>>
>>> In this case it also seems some region servers registered themselves
>>> as localhost and then with their good address the master probably gave
>>> them. Please check your OS network configurations and make sure the
>>> hostname points at the right place.
>>>
>>> J-D
>>>
>>> On Tue, Nov 10, 2009 at 8:47 PM, Jeff Zhang <zjffdu@gmail.com> wrote:
>>> > Hi Jean,
>>> >
>>> > I tried HBase 0.20.2 and looked at the logs; it seems the master and the
>>> > region servers work.
>>> >
>>> > But I cannot run the list command in the hbase shell. When I invoke the
>>> > command status 'simple' in the hbase shell, it shows the following message:
>>> > 09/11/11 12:42:55 DEBUG client.HConnectionManager$ClientZKWatcher: Got ZooKeeper event, state: SyncConnected, type: None, path: null
>>> > 09/11/11 12:42:55 DEBUG zookeeper.ZooKeeperWrapper: Read ZNode /hbase/master got 10.148.224.13:60000
>>> > 8 servers, 0 dead, 0.1250 average load
>>> > hbase(main):002:0> status 'simple'
>>> > 8 live servers
>>> >    localhost:60020 1257914319445
>>> >        requests=0, regions=0, usedHeap=0, maxHeap=0
>>> >    sha-cs-03:60020 1257914321331
>>> >        requests=0, regions=0, usedHeap=33, maxHeap=991
>>> >    localhost:60020 1257914320265
>>> >        requests=0, regions=0, usedHeap=0, maxHeap=0
>>> >    sha-cs-01:60020 1257914320551
>>> >        requests=0, regions=1, usedHeap=34, maxHeap=991
>>> >    sha-cs-05:60020 1257914322656
>>> >        requests=0, regions=0, usedHeap=33, maxHeap=991
>>> >    sha-cs-06:60020 1257914321467
>>> >        requests=0, regions=0, usedHeap=34, maxHeap=991
>>> >    localhost:60020 1257914320202
>>> >        requests=0, regions=0, usedHeap=0, maxHeap=0
>>> >    localhost:60020 1257914321532
>>> >        requests=0, regions=0, usedHeap=0, maxHeap=0
>>> >
>>> >
>>> > It's weird that I have 3 localhost ZooKeepers here; I actually set 5
>>> > machines in hbase.zookeeper.quorum
>>> >
>>> >
>>> >
>>> > Jeff Zhang
>>> >
>>> >
>>> >
>>> >
>>> > On Wed, Nov 11, 2009 at 9:47 AM, Jean-Daniel Cryans <jdcryans@apache.org> wrote:
>>> >
>>> >> This particular problem is fixed in the current 0.20 branch and we
>>> >> just released a candidate for 0.20.2, you can get it here
>>> >> http://people.apache.org/~jdcryans/hbase-0.20.2-candidate-1/
>>> >>
>>> >> J-D
>>> >>
>>> >> On Tue, Nov 10, 2009 at 5:43 PM, Jeff Zhang <zjffdu@gmail.com> wrote:
>>> >> > The following is the region server's log :
>>> >> >
>>> >> >
>>> >> > 2009-11-10 18:09:08,062 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server handler 3 on 60020: starting
>>> >> > 2009-11-10 18:09:08,063 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server handler 4 on 60020: starting
>>> >> > 2009-11-10 18:09:08,063 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server handler 5 on 60020: starting
>>> >> > 2009-11-10 18:09:08,063 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server handler 6 on 60020: starting
>>> >> > 2009-11-10 18:09:08,063 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server handler 7 on 60020: starting
>>> >> > 2009-11-10 18:09:08,063 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server handler 8 on 60020: starting
>>> >> > 2009-11-10 18:09:08,063 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: HRegionServer started at: 10.148.224.11:60020
>>> >> > 2009-11-10 18:09:08,064 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server handler 9 on 60020: starting
>>> >> > 2009-11-10 18:09:08,070 INFO org.apache.hadoop.hbase.regionserver.StoreFile: Allocating LruBlockCache with maximum size 198.3m
>>> >> > 2009-11-10 18:09:08,095 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: MSG_CALL_SERVER_STARTUP
>>> >> > 2009-11-10 18:09:08,229 INFO org.apache.hadoop.hbase.regionserver.HLog: HLog configuration: blocksize=67108864, rollsize=63753420, enabled=true, flushlogentries=100, optionallogflushinternal=10000ms
>>> >> > 2009-11-10 18:09:08,253 INFO org.apache.hadoop.hbase.regionserver.HLog: New hlog /hbase/.logs/10.148.224.11,60020,1257847748205/hlog.dat.1257847748229
>>> >> > 2009-11-10 18:09:08,255 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Telling master at 10.148.224.13:60000 that we are up
>>> >> > 2009-11-10 18:09:08,302 FATAL org.apache.hadoop.hbase.regionserver.HRegionServer: Unhandled exception. Aborting...
>>> >> > java.lang.NullPointerException
>>> >> >        at org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:459)
>>> >> >        at java.lang.Thread.run(Thread.java:619)
>>> >> > 2009-11-10 18:09:08,304 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Dump of metrics: request=0.0, regions=0, stores=0, storefiles=0, storefileIndexSize=0, memstoreSize=0, usedHeap=31, maxHeap=991, blockCacheSize=1707288, blockCacheFree=206264664, blockCacheCount=0, blockCacheHitRatio=0
>>> >> > 2009-11-10 18:09:08,304 INFO org.apache.hadoop.ipc.HBaseServer: Stopping server on 60020
>>> >> > 2009-11-10 18:09:08,304 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server handler 0 on 60020: exiting
>>> >> > 2009-11-10 18:09:08,304 INFO org.apache.hadoop.ipc.HBaseServer: Stopping IPC Server listener on 60020
>>> >> > 2009-11-10 18:09:08,304 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server handler 1 on 60020: exiting
>>> >> > 2009-11-10 18:09:08,304 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server handler 2 on 60020: exiting
>>> >> > 2009-11-10 18:09:08,305 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server handler 3 on 60020: exiting
>>> >> > 2009-11-10 18:09:08,305 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server handler 4 on 60020: exiting
>>> >> > 2009-11-10 18:09:08,305 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server handler 5 on 60020: exiting
>>> >> > 2009-11-10 18:09:08,305 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server handler 6 on 60020: exiting
>>> >> > 2009-11-10 18:09:08,305 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server handler 7 on 60020: exiting
>>> >> > 2009-11-10 18:09:08,305 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server handler 8 on 60020: exiting
>>> >> > 2009-11-10 18:09:08,305 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server handler 9 on 60020: exiting
>>> >> > 2009-11-10 18:09:08,306 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Stopping infoServer
>>> >> > 2009-11-10 18:09:08,307 INFO org.apache.hadoop.ipc.HBaseServer: Stopping IPC Server Responder
>>> >> > 2009-11-10 18:09:08,412 INFO org.apache.hadoop.hbase.regionserver.MemStoreFlusher: regionserver/127.0.0.1:60020.cacheFlusher exiting
>>> >> > 2009-11-10 18:09:08,412 INFO org.apache.hadoop.hbase.regionserver.LogFlusher: regionserver/127.0.0.1:60020.logFlusher exiting
>>> >> > 2009-11-10 18:09:08,412 INFO org.apache.hadoop.hbase.regionserver.CompactSplitThread: regionserver/127.0.0.1:60020.compactor exiting
>>> >> > 2009-11-10 18:09:08,412 INFO org.apache.hadoop.hbase.regionserver.LogRoller: LogRoller exiting.
>>> >> > 2009-11-10 18:09:08,413 INFO org.apache.hadoop.hbase.regionserver.HRegionServer$MajorCompactionChecker: regionserver/127.0.0.1:60020.majorCompactionChecker exiting
>>> >> > 2009-11-10 18:09:08,427 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: On abort, closed hlog
>>> >> > 2009-11-10 18:09:08,428 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: aborting server at: 10.148.224.11:60020
>>> >> > 2009-11-10 18:09:17,489 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: worker thread exiting
>>> >> > 2009-11-10 18:09:17,489 INFO org.apache.zookeeper.ZooKeeper: Closing session: 0x324dcceb05c0003
>>> >> > 2009-11-10 18:09:17,490 INFO org.apache.zookeeper.ClientCnxn: Closing ClientCnxn for session: 0x324dcceb05c0003
>>> >> > 2009-11-10 18:09:17,495 INFO org.apache.hadoop.hbase.Leases: regionserver/127.0.0.1:60020.leaseChecker closing leases
>>> >> > 2009-11-10 18:09:17,495 INFO org.apache.hadoop.hbase.Leases: regionserver/127.0.0.1:60020.leaseChecker closed leases
>>> >> > 2009-11-10 18:09:17,500 INFO org.apache.zookeeper.ClientCnxn: Exception while closing send thread for session 0x324dcceb05c0003 : Read error rc = -1 java.nio.DirectByteBuffer[pos=0 lim=4 cap=4]
>>> >> > 2009-11-10 18:09:17,604 INFO org.apache.zookeeper.ClientCnxn: Disconnecting ClientCnxn for session: 0x324dcceb05c0003
>>> >> > 2009-11-10 18:09:17,604 INFO org.apache.zookeeper.ZooKeeper: Session: 0x324dcceb05c0003 closed
>>> >> > 2009-11-10 18:09:17,605 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: regionserver/127.0.0.1:60020 exiting
>>> >> > 2009-11-10 18:09:17,605 INFO org.apache.zookeeper.ClientCnxn: EventThread shut down
>>> >> > 2009-11-10 18:09:17,606 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Starting shutdown thread.
>>> >> > 2009-11-10 18:09:17,606 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Shutdown thread complete
>>> >> >
>>> >> > On Tue, Nov 10, 2009 at 10:55 PM, Andrew Purtell <apurtell@apache.org> wrote:
>>> >> >
>>> >> >> When you try to start the region servers, what do you see in the log?
>>> >> >>
>>> >> >> If you don't change the client port (hbase.zookeeper.property.clientPort),
>>> >> >> does it work?
>>> >> >>
>>> >> >>     - Andy
>>> >> >>
>>> >> >>
>>> >> >>
>>> >> >>
>>> >> >>
>>> >> >> ________________________________
>>> >> >> From: Jeff Zhang <zjffdu@gmail.com>
>>> >> >> To: hbase-user@hadoop.apache.org
>>> >> >> Sent: Tue, November 10, 2009 2:40:28 PM
>>> >> >> Subject: Re: HBase 0.20.1 Distributed Install Problems
>>> >> >>
>>> >> >> Hi,
>>> >> >>
>>> >> >> I meet the same problem that I can not start the regionserver.
>>> >> >>
>>> >> >> When I invoke zk_dump
>>> >> >>
>>> >> >> it shows:
>>> >> >>
>>> >> >> HBase tree in ZooKeeper is rooted at /hbase
>>> >> >>  Cluster up? true
>>> >> >>  In safe mode? true
>>> >> >>  Master address: 10.148.224.13:60000
>>> >> >>  Region server holding ROOT: null
>>> >> >>  Region servers:
>>> >> >>
>>> >> >>
>>> >> >> The following is my hbase-site.xml
>>> >> >>
>>> >> >> <configuration>
>>> >> >>  <property>
>>> >> >>    <name>hbase.cluster.distributed</name>
>>> >> >>    <value>true</value>
>>> >> >>    <description>The mode the cluster will be in. Possible values are
>>> >> >>      false: standalone and pseudo-distributed setups with managed Zookeeper
>>> >> >>      true: fully-distributed with unmanaged Zookeeper Quorum (see hbase-env.sh)
>>> >> >>    </description>
>>> >> >>  </property>
>>> >> >>  <property>
>>> >> >>    <name>hbase.rootdir</name>
>>> >> >>    <value>hdfs://sha-cs-04:9000/hbase</value>
>>> >> >>    <description>The directory shared by region servers.
>>> >> >>    </description>
>>> >> >>  </property>
>>> >> >>  <property>
>>> >> >>      <name>hbase.zookeeper.property.clientPort</name>
>>> >> >>      <value>2222</value>
>>> >> >>      <description>Property from ZooKeeper's config zoo.cfg.
>>> >> >>      The port at which the clients will connect.
>>> >> >>      </description>
>>> >> >>   </property>
>>> >> >>   <property>
>>> >> >>      <name>hbase.zookeeper.quorum</name>
>>> >> >>      <value>sha-cs-01,sha-cs-02,sha-cs-03,sha-cs-05,sha-cs-06</value>
>>> >> >>      <description>Comma separated list of servers in the ZooKeeper Quorum.
>>> >> >>      For example, "host1.mydomain.com,host2.mydomain.com,host3.mydomain.com".
>>> >> >>      By default this is set to localhost for local and pseudo-distributed modes
>>> >> >>      of operation. For a fully-distributed setup, this should be set to a full
>>> >> >>      list of ZooKeeper quorum servers. If HBASE_MANAGES_ZK is set in hbase-env.sh
>>> >> >>      this is the list of servers which we will start/stop ZooKeeper on.
>>> >> >>      </description>
>>> >> >>    </property>
>>> >> >>
>>> >> >> </configuration>
>>> >> >>
>>> >> >> What's wrong with my configuration ?
>>> >> >>
>>> >> >>
>>> >> >> Thank you in advance.
>>> >> >>
>>> >> >>
>>> >> >> Jeff Zhang
>>> >> >>
>>> >> >>
>>> >> >>
>>> >> >> On Tue, Nov 10, 2009 at 12:47 PM, Tatsuya Kawano <tatsuyaml@snowcocoa.info> wrote:
>>> >> >>
>>> >> >> > Hello,
>>> >> >> >
>>> >> >> > It looks like the master and the region servers cannot locate each
>>> >> >> > other. HBase 0.20.x uses ZooKeeper (zk) to locate other cluster
>>> >> >> > members, so maybe your zk has wrong information.
>>> >> >> >
>>> >> >> > Can you type zk_dump from hbase shell and let us know the result?
>>> >> >> >
>>> >> >> > If the cluster is properly configured, you'll get something like this:
>>> >> >> > =====================================
>>> >> >> > hbase(main):007:0> zk_dump
>>> >> >> >
>>> >> >> > HBase tree in ZooKeeper is rooted at /hbase
>>> >> >> >  Cluster up? true
>>> >> >> >  In safe mode? false
>>> >> >> >  Master address: 172.16.80.26:60000
>>> >> >> >  Region server holding ROOT: 172.16.80.27:60020
>>> >> >> >  Region servers:
>>> >> >> >   - 172.16.80.27:60020
>>> >> >> >   - 172.16.80.29:60020
>>> >> >> >   - 172.16.80.28:60020
>>> >> >> > =====================================
>>> >> >> >
>>> >> >> >
>>> >> >> > > one of my co-workers apparently can log into his box and submit jobs, but
>>> >> >> > > me or anyone else is still unable to log in.
>>> >> >> >
>>> >> >> > Maybe you're a bit confused; your co-worker seems to be able to use
>>> >> >> > Hadoop Map/Reduce, not HBase.
>>> >> >> >
>>> >> >> >
>>> >> >> > > Does Hbase allow concurrent connections?
>>> >> >> >
>>> >> >> > Yes.
>>> >> >> >
>>> >> >> >
>>> >> >> > >> I think it also says the master is on port 60000
>>> >> >> > >> when the install directions say its supposed to be 60010?
>>> >> >> >
>>> >> >> > Port 60000 is correct. The master uses port 60000 to accept connections
>>> >> >> > from hbase shell and region servers. Port 60010 is for the web-based
>>> >> >> > HBase console.
>>> >> >> >
>>> >> >> >
>>> >> >> > > We tried applying this fix (to explicitly set the master):
>>> >> >> > > http://osdir.com/ml/hbase-user-hadoop-apache/2009-05/msg00321.html
>>> >> >> >
>>> >> >> > No, this is an old way to configure a cluster. You shouldn't use this
>>> >> >> > with HBase 0.20.x
>>> >> >> >
>>> >> >> >
>>> >> >> > Thanks,
>>> >> >> >
>>> >> >> > --
>>> >> >> > Tatsuya Kawano (Mr.)
>>> >> >> > Tokyo, Japan
>>> >> >> >
>>> >> >> >
>>> >> >> >
>>> >> >> > On Tue, Nov 10, 2009 at 1:10 PM, Chris Bates
>>> >> >> > <christopher.andrew.bates@gmail.com> wrote:
>>> >> >> > > Another interesting data point.  We tried applying this fix (to explicitly
>>> >> >> > > set the master):
>>> >> >> > > http://osdir.com/ml/hbase-user-hadoop-apache/2009-05/msg00321.html
>>> >> >> > >
>>> >> >> > > But when I log in to the master node, it takes really long to submit a query
>>> >> >> > > and I get this in response:
>>> >> >> > > hbase(main):001:0> list
>>> >> >> > > NativeException: org.apache.hadoop.hbase.client.RetriesExhaustedException:
>>> >> >> > > Trying to contact region server null for region , row '', but failed after 5
>>> >> >> > > attempts.
>>> >> >> > > Exceptions:
>>> >> >> > > org.apache.hadoop.hbase.client.NoServerForRegionException: Timed out trying to locate root region
>>> >> >> > > org.apache.hadoop.hbase.client.NoServerForRegionException: Timed out trying to locate root region
>>> >> >> > > org.apache.hadoop.hbase.client.NoServerForRegionException: Timed out trying to locate root region
>>> >> >> > > org.apache.hadoop.hbase.client.NoServerForRegionException: Timed out trying to locate root region
>>> >> >> > > org.apache.hadoop.hbase.client.NoServerForRegionException: Timed out trying to locate root region
>>> >> >> > >
>>> >> >> > > from org/apache/hadoop/hbase/client/HConnectionManager.java:1001:in `getRegionServerWithRetries'
>>> >> >> > > from org/apache/hadoop/hbase/client/MetaScanner.java:55:in `metaScan'
>>> >> >> > > from org/apache/hadoop/hbase/client/MetaScanner.java:28:in `metaScan'
>>> >> >> > > from org/apache/hadoop/hbase/client/HConnectionManager.java:432:in `listTables'
>>> >> >> > > from org/apache/hadoop/hbase/client/HBaseAdmin.java:127:in `listTables'
>>> >> >> > > from sun/reflect/NativeMethodAccessorImpl.java:-2:in `invoke0'
>>> >> >> > > from sun/reflect/NativeMethodAccessorImpl.java:39:in `invoke'
>>> >> >> > > from sun/reflect/DelegatingMethodAccessorImpl.java:25:in `invoke'
>>> >> >> > > from java/lang/reflect/Method.java:597:in `invoke'
>>> >> >> > > from org/jruby/javasupport/JavaMethod.java:298:in `invokeWithExceptionHandling'
>>> >> >> > > from org/jruby/javasupport/JavaMethod.java:259:in `invoke'
>>> >> >> > > from org/jruby/java/invokers/InstanceMethodInvoker.java:36:in `call'
>>> >> >> > > from org/jruby/runtime/callsite/CachingCallSite.java:253:in `cacheAndCall'
>>> >> >> > > from org/jruby/runtime/callsite/CachingCallSite.java:72:in `call'
>>> >> >> > > from org/jruby/ast/CallNoArgNode.java:61:in `interpret'
>>> >> >> > > from org/jruby/ast/ForNode.java:104:in `interpret'
>>> >> >> > > ... 116 levels...
>>> >> >> > > from opt/hadoop/hbase_minus_0_dot_20_dot_1/bin/$_dot_dot_/bin/hirb#start:-1:in `call'
>>> >> >> > > from org/jruby/internal/runtime/methods/DynamicMethod.java:226:in `call'
>>> >> >> > > from org/jruby/internal/runtime/methods/CompiledMethod.java:211:in `call'
>>> >> >> > > from org/jruby/internal/runtime/methods/CompiledMethod.java:71:in `call'
>>> >> >> > > from org/jruby/runtime/callsite/CachingCallSite.java:253:in `cacheAndCall'
>>> >> >> > > from org/jruby/runtime/callsite/CachingCallSite.java:72:in `call'
>>> >> >> > > from opt/hadoop/hbase_minus_0_dot_20_dot_1/bin/$_dot_dot_/bin/hirb.rb:497:in `__file__'
>>> >> >> > > from opt/hadoop/hbase_minus_0_dot_20_dot_1/bin/$_dot_dot_/bin/hirb.rb:-1:in `load'
>>> >> >> > > from org/jruby/Ruby.java:577:in `runScript'
>>> >> >> > > from org/jruby/Ruby.java:480:in `runNormally'
>>> >> >> > > from org/jruby/Ruby.java:354:in `runFromMain'
>>> >> >> > > from org/jruby/Main.java:229:in `run'
>>> >> >> > > from org/jruby/Main.java:110:in `run'
>>> >> >> > > from org/jruby/Main.java:94:in `main'
>>> >> >> > > from /opt/hadoop/hbase-0.20.1/bin/../bin/hirb.rb:338:in `list'
>>> >> >> > > from (hbase):2
>>> >> >> > > hbase(main):002:0>
>>> >> >> > >
>>> >> >> > >
>>> >> >> > > On Mon, Nov 9, 2009 at 10:52 PM, Chris Bates <christopher.andrew.bates@gmail.com> wrote:
>>> >> >> > >
>>> >> >> > >> thanks for your response Sujee.  These boxes are all on an internal DNS and
>>> >> >> > >> they all resolve.
>>> >> >> > >>
>>> >> >> > >> one of my co-workers apparently can log into his box and submit jobs, but
>>> >> >> > >> me or anyone else is still unable to log in.  Does Hbase allow concurrent
>>> >> >> > >> connections?  In Hive I remember having to configure the metastore to be in
>>> >> >> > >> server mode if multiple people were using it.
>>> >> >> > >>
>>> >> >> > >>
>>> >> >> > >> On Mon, Nov 9, 2009 at 10:13 PM, Sujee Maniyam <sujee@sujee.net> wrote:
>>> >> >> > >>
>>> >> >> > >>> > [hadoop@crunch hbase-0.20.1]$ bin/start-hbase.sh
>>> >> >> > >>> >
>>> >> >> > >>> > crunch2: Warning: Permanently added 'crunch2' (RSA) to the list of known
>>> >> >> > >>> > hosts.
>>> >> >> > >>>
>>> >> >> > >>>
>>> >> >> > >>> is your SSH setup correctly?  From master, you need to be able to
>>> >> >> > >>> login to all slaves/regionservers without password
>>> >> >> > >>>
>>> >> >> > >>> And I see you are using short hostnames (crunch2, crunch3), do they
>>> >> >> > >>> all resolve correctly?  or you need to update /etc/hosts to resolve
>>> >> >> > >>> these to an IP address on all machines.
>>> >> >> > >>>
>>> >> >> > >>> regards
>>> >> >> > >>> Sujee Maniyam
>>> >> >> > >>> --
>>> >> >> > >>> http://sujee.net
