hbase-user mailing list archives

From Ted Yu <yuzhih...@gmail.com>
Subject Re: Unable to contact Regionserver in spite of META entry...
Date Thu, 29 Jul 2010 21:20:03 GMT
Have you verified that "Some server" is indeed the same host as 63.250.207.87?
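
One way to run that check from a client machine is sketched below. It is a minimal sketch, not an HBase API: the helper names `split_host_port` and `resolves_to` are hypothetical, and it assumes the region server is known to the client by a hostname that should resolve to the IPv4 address stored in info:server.

```python
import socket


def split_host_port(server):
    """Split a 'host:port' string, e.g. the info:server value from the META row."""
    host, _, port = server.rpartition(":")
    return host, int(port)


def resolves_to(name, expected_ip):
    """Return True if `name` resolves (IPv4) to `expected_ip` on this machine."""
    try:
        # gethostbyname_ex returns (canonical_name, aliases, ip_addresses)
        _, _, addresses = socket.gethostbyname_ex(name)
    except OSError:
        # Resolution failed; the client cannot reach the server by this name
        return False
    return expected_ip in addresses


# Value copied from the info:server column in the quoted META row
host, port = split_host_port("63.250.207.87:60020")
```

Calling `resolves_to(hostname, host)` with the hostname the client actually uses for the region server, on one of the failing task nodes, shows whether that node's DNS or hosts file agrees with META.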

On Thu, Jul 29, 2010 at 12:31 PM, Vidhyashankar Venkataraman <
vidhyash@yahoo-inc.com> wrote:

> I have an MR job that sends streams of updates (puts and deletes) to an
> existing db, and all the tasks are crashing with exceptions similar to the
> following:
>
>
>
>   Exception in thread "main"
> org.apache.hadoop.hbase.client.RetriesExhaustedException: Trying to contact
> region server Some server, retryOnlyOne=true, index=0, islastrow=false,
> tries=9, numtries=10, i=78, listsize=390,
> region=DocData,0000001013071992,1279835733117 for region
> DocData,0000001013071992,1279835733117, row '0000001013115520', but failed
> after 10 attempts.
>
>
>
> I ran this job on 180 nodes with a max of 6 tasks per node. I thought this
> was possibly due to overload, so I ran it with just 2 tasks per node but
> again got similar exceptions.
>
> Then I tried issuing a put from the hbase shell, and it complained of the
> same issue.
>
> I checked the meta table entry and it seems fine. I checked the
> corresponding region server (web UI) and it is indeed hosting the region.
>
>
>
>  DocData,0000001013071992,1279835733117
>    column=info:regioninfo, timestamp=1280305164242,
>    value=REGION => {NAME => 'DocData,0000001013071992,1279835733117',
>      STARTKEY => '0000001013071992', ENDKEY => '0000001013205991',
>      ENCODED => 1962005300, TABLE => {{NAME => 'DocData',
>      MAX_FILESIZE => '4402341480', FAMILIES => [{NAME => 'bigColumn',
>      VERSIONS => '1', COMPRESSION => 'NONE', TTL => '2147483647',
>      BLOCKSIZE => '1048576', IN_MEMORY => 'false', BLOCKCACHE => 'false'}]}}
>  DocData,0000001013071992,1279835733117
>    column=info:server, timestamp=1280317959911, value=63.250.207.87:60020
>  DocData,0000001013071992,1279835733117
>    column=info:serverstartcode, timestamp=1280317959911, value=1279926520261
>
>
> Can you see what is wrong here?
>
> Thank you
> Vidhya
>
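
A further sanity check on the quoted META row is to confirm that the row key from the exception really falls inside the region's key range. The sketch below assumes the usual HBase convention that a region covers [STARTKEY, ENDKEY) under byte-wise lexicographic ordering, with an empty end key meaning "no upper bound". The helper names are hypothetical, and the values are copied from the message above.

```python
def parse_region_name(region_name):
    """Split 'table,startkey,regionid' into (table, start_key, region_id)."""
    table, rest = region_name.split(",", 1)
    # The region id is the last comma-separated field
    start_key, region_id = rest.rsplit(",", 1)
    return table, start_key, region_id


def row_in_region(row, start_key, end_key):
    """Return True if start_key <= row < end_key, comparing raw bytes."""
    row_b, start_b, end_b = (s.encode() for s in (row, start_key, end_key))
    # An empty end key marks the last region of the table (unbounded above)
    return start_b <= row_b and (end_b == b"" or row_b < end_b)


# Region name and row key from the exception, ENDKEY from info:regioninfo
table, start_key, region_id = parse_region_name(
    "DocData,0000001013071992,1279835733117"
)
in_range = row_in_region("0000001013115520", start_key, "0000001013205991")
```

Here `in_range` comes out true, so the client located the right region for the row. That is consistent with the META entry being fine and points the investigation at reaching the server itself.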
