hbase-user mailing list archives

From Bradford Stephens <bradfordsteph...@gmail.com>
Subject Re: HBase Failing on Large Loads
Date Wed, 10 Jun 2009 22:32:51 GMT
Thanks so much for all the help, everyone. Things are still broken,
but maybe we're getting close.

All the regionservers were dead by the time the job ended. I've put
the full regionserver logs on pastebin:
http://pastebin.com/m2e6f9283
http://pastebin.com/mf97bd57

I see quite a few error messages like this:

2009-06-10 14:47:54,994 ERROR
org.apache.hadoop.hbase.regionserver.HRegionServer: unable to process
message: MSG_REGION_OPEN:
joinedcontent,1DCC1616F7C7B53B69B5536F407A64DF,1244667570521:
safeMode=false
java.lang.NullPointerException

There's also a scattering of messages like this:
2009-06-10 13:49:02,855 WARN
org.apache.hadoop.hbase.regionserver.HLog: IPC Server handler 1 on
60020 took 3267ms appending an edit to HLog; editcount=21570

And also this, right after an HLog roll:

2009-06-10 14:03:27,270 INFO
org.apache.hadoop.hbase.regionserver.HLog: Closed
hdfs://dttest01:54310/hbase-0.19/log_192.168.18.49_1244659862699_60020/hlog.dat.1244667757560,
entries=100006. New log writer:
/hbase-0.19/log_192.168.18.49_1244659862699_60020/hlog.dat.1244667807249
2009-06-10 14:03:28,160 INFO org.apache.hadoop.hdfs.DFSClient:
Exception in createBlockOutputStream java.io.IOException: Bad connect
ack with firstBadLink 192.168.18.47:50010
2009-06-10 14:03:28,160 INFO org.apache.hadoop.hdfs.DFSClient:
Abandoning block blk_4831127457964871573_140781
2009-06-10 14:03:34,170 INFO org.apache.hadoop.hdfs.DFSClient:
Exception in createBlockOutputStream java.io.IOException: Could not
read from stream
2009-06-10 14:03:34,170 INFO org.apache.hadoop.hdfs.DFSClient:
Abandoning block blk_-6169186743102862627_140796
2009-06-10 14:03:34,485 INFO
org.apache.hadoop.hbase.regionserver.MemcacheFlusher: Forced flushing
of joinedcontent,1F2F64F59088A3B121CFC66F7FCBA2A9,1244667654435
because global memcache limit of 398.7m exceeded; currently 399.0m and
flushing till 249.2m
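
The two figures in that last MemcacheFlusher line can be turned back into a rough heap size. This is only a sketch of the arithmetic: the 40% and 25% heap fractions below are assumptions (the log does not state them), and `implied_heap_mb` is a hypothetical helper name.

```python
# Figures copied from the MemcacheFlusher log line above (in MB).
GLOBAL_LIMIT_MB = 398.7   # "global memcache limit of 398.7m"
FLUSH_TARGET_MB = 249.2   # "flushing till 249.2m"

# ASSUMPTION: the limit and the flush target are 40% and 25% of the
# regionserver heap. These fractions are not stated anywhere in the log.
ASSUMED_UPPER_FRACTION = 0.40
ASSUMED_LOWER_FRACTION = 0.25


def implied_heap_mb(limit_mb: float, fraction: float) -> float:
    """Heap size (MB) that would yield limit_mb at the given heap fraction."""
    return limit_mb / fraction


heap_from_upper = implied_heap_mb(GLOBAL_LIMIT_MB, ASSUMED_UPPER_FRACTION)
heap_from_lower = implied_heap_mb(FLUSH_TARGET_MB, ASSUMED_LOWER_FRACTION)
ratio = FLUSH_TARGET_MB / GLOBAL_LIMIT_MB

print(f"heap implied by upper limit: {heap_from_upper:.1f} MB")
print(f"heap implied by flush target: {heap_from_lower:.1f} MB")
print(f"flush target / limit: {ratio:.3f}")
```

If the assumed fractions hold, both figures point to about the same heap, a little under 1 GB, which would suggest the regionserver is running with a fairly small heap for this load.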

Finally, I saw this when I stopped and re-started my cluster:

2009-06-10 15:29:09,494 ERROR
org.apache.hadoop.hdfs.server.datanode.DataNode:
DatanodeRegistration(192.168.18.16:50010,
storageID=DS-486600617-192.168.18.16-50010-1241838200467,
infoPort=50075, ipcPort=50020):DataXceiver
java.io.IOException: Version Mismatch
        at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:81)
        at java.lang.Thread.run(Thread.java:619)
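
To see which of these messages dominate across the full logs, they can be tallied by level and logger. A minimal sketch, assuming the log files use the same one-line layout as the excerpts above (date, time, level, logger, then the message); `tally_log` is a hypothetical helper, not part of HBase or Hadoop.

```python
import re
from collections import Counter

# ASSUMPTION: each entry starts with
# "YYYY-MM-DD HH:MM:SS,mmm LEVEL fully.qualified.Logger: message".
LINE_RE = re.compile(
    r"^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3} "
    r"(?P<level>[A-Z]+) (?P<logger>[\w.$]+): "
)


def tally_log(lines):
    """Count log entries by (level, short logger name).

    Lines that do not start a new entry (stack traces, bare exception
    names) are skipped.
    """
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line)
        if match:
            short_name = match.group("logger").rsplit(".", 1)[-1]
            counts[(match.group("level"), short_name)] += 1
    return counts


# Sample entries taken from the excerpts above, unwrapped to one line each.
sample = [
    "2009-06-10 13:49:02,855 WARN org.apache.hadoop.hbase.regionserver.HLog: "
    "IPC Server handler 1 on 60020 took 3267ms appending an edit to HLog; "
    "editcount=21570",
    "2009-06-10 14:03:28,160 INFO org.apache.hadoop.hdfs.DFSClient: "
    "Abandoning block blk_4831127457964871573_140781",
    "2009-06-10 14:03:34,170 INFO org.apache.hadoop.hdfs.DFSClient: "
    "Abandoning block blk_-6169186743102862627_140796",
    "java.lang.NullPointerException",
]

counts = tally_log(sample)
for (level, logger), n in counts.most_common():
    print(f"{n:6d}  {level:5s}  {logger}")
```

For a real log, pass an open file instead of the sample list, for example `tally_log(open(path))`, where `path` is wherever the regionserver log lives on that node.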


On Wed, Jun 10, 2009 at 2:55 PM, Ryan Rawson<ryanobjc@gmail.com> wrote:
> That is a client exception that is a sign of problems on the
> regionserver...is it still running? What do the logs look like?
>
> On Jun 10, 2009 2:51 PM, "Bradford Stephens" <bradfordstephens@gmail.com>
> wrote:
>
> OK, I've tried all the optimizations you've suggested (still running
> with an M/R job). Still having problems like this:
>
> org.apache.hadoop.hbase.client.RetriesExhaustedException: Trying to
> contact region server 192.168.18.15:60020 for region
> joinedcontent,242FEB3ED9BE0D8EF3856E9C4251464C,1244666594390, row
> '291DB5C7440B0A5BDB0C12501308C55B', but failed after 10 attempts.
> Exceptions:
> java.io.IOException: Call to /192.168.18.15:60020 failed on local
> exception: java.io.EOFException
> java.net.ConnectException: Call to /192.168.18.15:60020 failed on
> connection exception: java.net.ConnectException: Connection refused
> java.net.ConnectException: Call to /192.168.18.15:60020 failed on
> connection exception: java.net.ConnectException: Connection refused
> java.net.ConnectException: Call to /192.168.18.15:60020 failed on
> connection exception: java.net.ConnectException: Connection refused
> java.net.ConnectException: Call to /192.168.18.15:60020 failed on
> connection exception: java.net.ConnectException: Connection refused
> java.net.ConnectException: Call to /192.168.18.15:60020 failed on
> connection exception: java.net.ConnectException: Connection refused
> java.net.ConnectException: Call to /192.168.18.15:60020 failed on
> connection exception: java.net.ConnectException: Connection refused
> java.net.ConnectException: Call to /192.168.18.15:60020 failed on
> connection exception: java.net.ConnectException: Connection refused
> java.net.ConnectException: Call to /192.168.18.15:60020 failed on
> connection exception: java.net.ConnectException: Connection refused
> java.net.ConnectException: Call to /192.168.18.15:60020 failed on
> connection exception: java.net.ConnectException: Connection refused
>
> On Wed, Jun 10, 2009 at 12:40 AM, stack<stack@duboce.net> wrote:
> > On Tue, Jun 9, 2009 at 11:51 AM,...
>
