hbase-user mailing list archives

From Vaibhav Puranik <vpura...@gmail.com>
Subject Re: A region full of data is missing
Date Wed, 11 Nov 2009 01:50:02 GMT
This problem is resolved, courtesy of Ryan, JD and Stack. Thank you very much!

The culprit region had two data files instead of one. The first data file
was around 130 MB. The second was just 228 bytes.
Because of a bug, this second file gets created during major compaction, and
it prevents the region from loading properly.
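
For anyone who wants to check other regions for the same symptom, here is a
minimal sketch. It assumes you have already collected (file name, size in
bytes) pairs for one column family directory, for example from the output of
`hadoop fs -ls`. The function name, the file names and the 1e-5 ratio are my
own arbitrary choices, not anything from HBase, and a tiny store file is not
by itself proof of this bug. It is only a hint about where to look.

```python
from typing import List, Tuple


def suspicious_store_files(files: List[Tuple[str, int]],
                           ratio: float = 1e-5) -> List[str]:
    """Return names of files far smaller than the largest file in the listing.

    `files` holds (name, size_in_bytes) pairs for a single directory.
    `ratio` is an arbitrary threshold (an assumption, tune as needed).
    """
    if len(files) < 2:
        # A directory with a single store file is the normal case here.
        return []
    largest = max(size for _, size in files)
    return [name for name, size in files if size < largest * ratio]


# Hypothetical file names; sizes taken from this thread (130 MB and 228 bytes).
listing = [("store_file_a", 130 * 1024 * 1024), ("store_file_b", 228)]
flagged = suspicious_store_files(listing)
print(flagged)
```

With the sizes from our region, only the 228 byte file is flagged.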

As Ryan asked me to do, I deleted the smaller file and closed the region. This
time HBase reopened it properly and the missing data came back. I could
access it again.

I am not sure whether the bug is
https://issues.apache.org/jira/browse/HBASE-1686, as that issue has a fix
version of 0.20 and we already have 0.20.0 deployed in production.
But as per Ryan, the root cause is the same.

I guess we need to upgrade our HBase to 0.20.1!

Thanks again,
Vaibhav Puranik
Gumgum

On Tue, Nov 10, 2009 at 2:48 PM, Vaibhav Puranik <vpuranik@gmail.com> wrote:

> The region name contains the table name, the start key and an id.
> The start key is binary. In our case it is a mixture of a few longs. Whenever
> it is printed, it shows up as Unicode characters that look like junk or
> garbled text. I am not sure whether the shell can interpret it correctly.
>
> I don't know how to enter this name on the shell console, hence I used the
> HBaseAdmin method.
>
> I kept watching logs while I was doing it. The logs said it closed the
> region and reopened it. It reopened it on the same region server.
>
> I tried accessing data after this, but it didn't work.
>
> .META. table seems to have its entry. The entry looks like:
>
>  column=historian:assignment, timestamp=1257889883623, value=Region assigned
>  to server domU-12-32-38-01-24-F2.z-2.compute-1.internal,60020,1253581834090
>
>  column=historian:open, timestamp=1257889886631, value=Region opened on
>  server : domU-12-32-38-01-24-F2.z-2.compute-1.internal
>
>  column=info:regioninfo, timestamp=1250406167893, value=REGION => {NAME =>
>  'Visits,\\x00\\x00\\x01\\x22\\xD2\\x1B\\xDF\\xE7\\x00\\x00\\x00\\x00\\x00\\x02\\xAF\\xFE,1250406166412',
>  STARTKEY => '\\x00\\x00\\x01\\x22\\xD2\\x1B\\xDF\\xE7\\x00\\x00\\x00\\x00\\x00\\x02\\xAF\\xFE',
>  ENDKEY => '\\x00\\x00\\x01\\x22\\xFC\\x27\\x0F8\\x00\\x00\\x00\\x00\\x00\\x05X:',
>  ENCODED => 1887697866, TABLE => {{NAME => 'Visits', FAMILIES => [{NAME =>
>  'data', VERSIONS => '3', COMPRESSION => 'NONE', TTL => '2147483647',
>  BLOCKSIZE => '65536', IN_MEMORY => 'false', BLOCKCACHE => 'true'}]}}
>
>  column=info:server, timestamp=1257889886630, value=10.255.43.0:60020
>
>  column=info:serverstartcode, timestamp=1257889886630, value=1253581834090
>
>
> Regards,
> Vaibhav
>
>
>
>
>
> On Tue, Nov 10, 2009 at 2:28 PM, stack <stack@duboce.net> wrote:
>
>> You couldn't run the shell?
>>
>> So, region closed and opened somewhere else?  Open on another regionserver
>> and you still can't get data out of it?
>>
>> St.Ack
>>
>>
>> On Tue, Nov 10, 2009 at 2:11 PM, Vaibhav Puranik <vpuranik@gmail.com>
>> wrote:
>>
>> > Stack,
>> >
>> > I tried doing HBaseAdmin.closeRegion with the binary region name.
>> >
>> > It closed the region and reopened it. But we still cannot access the data.
>> >
>> > I guess trying to read it back from the data file is the only option left,
>> > right?
>> >
>> > Regards,
>> > Vaibhav
>> >
>> > On Tue, Nov 10, 2009 at 12:56 PM, stack <stack@duboce.net> wrote:
>> >
>> > > On Mon, Nov 9, 2009 at 6:40 PM, Vaibhav Puranik <vpuranik@gmail.com>
>> > > wrote:
>> > >
>> > > >  Does that mean the region is
>> > > > open and needs to be closed?
>> > > >
>> > > It means region should be open... especially if it's the message the
>> > > regionserver is passing back to the Master reporting successful open.
>> > > Maybe check the regionserver log to see if anything happened with the
>> > > region subsequently?
>> > >
>> > >
>> > >
>> > > > All the other regions seem to have one file in their data directory.
>> > > > This region has two files in its data directory.
>> > > > Is that right?
>> > > >
>> > >
>> > > Over time, varies.  These are the files that carry the data.  When the
>> > > number hits a threshold, they are compacted into one file.
>> > >
>> > > So, did close work?
>> > >
>> > > If not, you can find the region in the filesystem?  If so, if any good w/
>> > > ruby, see the add_table.rb script in head of the 0.20 branch.  See how it
>> > > can read a region and add an entry for it to .META.  You might be able to
>> > > hack it up to do the one region if the close doesn't work.
>> > >
>> > > St.Ack
>> > >
>> >
>>
>
>
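
As a footnote to the quoted description of region names (table name, start
key, id): the sketch below shows how a start key made of two longs turns into
the escaped form seen in the .META. entry. It is only an illustration. The
helper names and the set of characters treated as printable are my
assumptions, loosely modeled on how HBase prints binary keys, and the two long
values are simply the 16 start key bytes from this thread read as big-endian
longs. It produces the display form only, not the raw bytes that a client
call would need.

```python
import string
import struct

# Assumption: bytes in this set are shown as-is, everything else as \xNN.
PRINTABLE = set((string.ascii_letters + string.digits + "_-.:").encode("ascii"))


def to_display(data: bytes) -> str:
    """Render binary data with \\xNN escapes for non-printable bytes."""
    return "".join(chr(b) if b in PRINTABLE else "\\x%02X" % b for b in data)


def region_display_name(table: str, start_key: bytes, region_id: int) -> str:
    """Join table name, escaped start key and region id with commas."""
    return "%s,%s,%d" % (table, to_display(start_key), region_id)


# Start key of the region in this thread, packed as two big-endian longs.
start_key = struct.pack(">qq", 0x00000122D21BDFE7, 0x000000000002AFFE)
name = region_display_name("Visits", start_key, 1250406166412)
print(name)
```

This is why the key looks garbled when printed raw: most of its bytes are not
printable characters at all.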
