whirr-user mailing list archives

From Ian Helmke <ihel...@gmail.com>
Subject Problem with Amazon+Cloudera+HBase
Date Tue, 06 Sep 2011 20:36:03 GMT
Hey everyone,

I'm attempting to use the Amazon Cloudera+HBase configuration script
for Whirr, but I'm having trouble getting HBase to work (the Hadoop
parts seem to be working fine, and I can upload files into HDFS). I am
running what is essentially the stock hbase-ec2-cdh.properties file
with my EC2 credentials, so only 2 VMs are involved.

When I attempt to run a Hadoop job that uses HBase, I get an error
that looks like this:

11/09/06 19:44:05 INFO mapred.JobClient: Task Id : attempt_201109061752_0009_m_000000_2, Status : FAILED
KeeperErrorCode = ConnectionLoss for /hbase
	at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getZooKeeperWatcher(HConnectionManager.java:991)
	at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.setupZookeeperTrackers(HConnectionManager.java:302)
	at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.<init>(HConnectionManager.java:293)
	at org.apache.hadoop.hbase.client.HConnectionManager.getConnection(HConnectionManager.java:156)
	at org.apache.hadoop.hbase.client.HBaseAdmin.<init>(HBaseAdmin.java:85)
	at com.lightboxtechnologies.spectrum.HBaseTables.summon(HBaseTables.java:63)
	at com.lightboxtechnologies.spectrum.FsEntryHBaseOutputFormat.getHTable(FsEntryHBaseOutputFormat.java:66)
	at com.lightboxtechnologies.spectrum.FsEntryHBaseOutputFormat.getRecordWriter(FsEntryHBaseOutputFormat.java:72)
	at org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.<init>(MapTask.java:521)
	at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:636)
	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:323)
	at org.apache.hadoop.mapred.Child$4.run(Child.java:270)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:396)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1127)
	at org.apache.hadoop.mapred.Child.main(Child.java:264)
Caused by: org.apache.hadoop.hbase.ZooKeeperConnectionException: KeeperErrorCode = ConnectionLoss for /hbase
	at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.<init>(ZooKeeperWatcher.java:147)
	at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getZooKeeperWatcher(HConnectionManager.java:989)
	... 15 more
Caused by: org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /hbase
	at org.apache.zookeeper.KeeperException.create(KeeperException.java:90)
	at org.apache.zookeeper.KeeperException.create(KeeperException.java:42)
	at org.apache.zookeeper.ZooKeeper.create(ZooKeeper.java:637)
	at org.apache.hadoop.hbase.zookeeper.ZKUtil.createAndFailSilent(ZKUtil.java:902)
	at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.<init>(ZooKeeperWatcher.java:133)
	... 16 more
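
A ConnectionLoss like this usually suggests the task could not reach a
ZooKeeper server at whatever address its configuration resolved to. As a
rough way to test that from a given machine, the sketch below sends
ZooKeeper's "ruok" four-letter command and looks for the "imok" reply.
The function name zookeeper_is_ok and the default port 2181 (ZooKeeper's
usual client port) are assumptions for illustration, not values taken
from this cluster, and newer ZooKeeper releases may need four-letter
commands enabled before this works.

```python
import socket


def zookeeper_is_ok(host, port=2181, timeout=5.0):
    """Return True if a ZooKeeper server at host:port answers 'ruok' with 'imok'.

    Any connection failure or timeout is reported as False rather than raised.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            sock.sendall(b"ruok")
            reply = b""
            # The server closes the connection after replying, so read to EOF.
            while True:
                chunk = sock.recv(16)
                if not chunk:
                    break
                reply += chunk
            return reply.strip() == b"imok"
    except OSError:
        return False


# Hypothetical usage; replace the host with the address the client is configured to use:
# print(zookeeper_is_ok("zookeeper-host.example.internal"))
```

Running the same check from both the master and the region server
machine would show whether each one can actually reach the address it
has been given.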

The HBase master web UI is running, and shows the region server as
it should. The HBase shell doesn't seem to work properly on either
machine. It starts, but throws errors on both machines (there is a
different error on each one, however). Is there something else I need
to set up to get HBase up and running properly with Whirr?

I get this on the machine with the HBase Master:
hbase(main):001:0> list
11/09/06 20:23:58 ERROR zookeeper.ZKConfig: no valid quorum servers found in zoo.cfg
ERROR: org.apache.hadoop.hbase.ZooKeeperConnectionException: An error is preventing HBase from connecting to ZooKeeper

And I get this error on the HBase Regionserver machine:
hbase(main):001:0> list
11/09/06 20:25:49 FATAL zookeeper.ZKConfig: The server in zoo.cfg cannot be set to localhost in a fully-distributed setup because it won't be reachable. See "Getting Started" for more information.
0 row(s) in 0.4390 seconds
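
The FATAL message above is about server entries in zoo.cfg that point at
localhost. A minimal sketch of a check for that condition follows. It
only assumes the standard "server.N=host:port:port" line format; the
function name find_localhost_servers and the sample text in the usage
comment are hypothetical and are not the contents of the file on these
machines.

```python
def find_localhost_servers(zoo_cfg_text):
    """Return (key, host) pairs for server.N entries that use a loopback host."""
    loopback = {"localhost", "127.0.0.1"}
    flagged = []
    for raw in zoo_cfg_text.splitlines():
        line = raw.strip()
        # Skip blank lines, comments, and anything that is not key=value.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = (part.strip() for part in line.split("=", 1))
        if key.startswith("server."):
            # The host is everything before the first ':' in the value.
            host = value.split(":", 1)[0]
            if host in loopback:
                flagged.append((key, host))
    return flagged


# Hypothetical usage with made-up file contents:
# sample = "clientPort=2181\nserver.0=localhost:2888:3888\n"
# print(find_localhost_servers(sample))
```

To run it against a real file, read /etc/zookeeper/zoo.cfg into a string
and pass that in; an empty result means no server entry uses a loopback
name.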

I do see that server.0 in zoo.cfg is set to localhost in the
/etc/zookeeper/zoo.cfg file:
But it's not clear to me what this should really be set to (the
machine's internal Amazon IP? Setting it to that caused the shell to

At any rate, any help or guidance as to what I might be able to do to
get things in working order would be much appreciated.

