hbase-user mailing list archives

From Tom Brown <tombrow...@gmail.com>
Subject Re: hbase corruption - missing region files in HDFS
Date Mon, 10 Dec 2012 18:07:51 GMT
Chris,

I really appreciate your detailed fix description!  I've run into
similar problems (due to old hardware and bad sectors) and could never
figure out how to fix a broken table. Hbck always seemed to just make
things worse until I would give up and recreate the table.

Can you publish the utility you used to create valid, empty HFiles?

--Tom

On Sun, Dec 9, 2012 at 6:08 PM, Kevin O'dell <kevin.odell@cloudera.com> wrote:
> Chris,
>
> Thank you for the very descriptive update.
>
> On Sun, Dec 9, 2012 at 6:29 PM, Chris Waterson <waterson@maubi.net> wrote:
>
>> Well, I upgraded to 0.92.2, since the version I was running on (0.92.1)
>> didn't have those options for "hbck".
>>
>> That helped.
>>
>> It took me a while to realize that I had to make the root filesystem
>> writable so that "hbck -repair" could create a directory for itself.  Once
>> that was done, it at least ran through to completion.
>>
>> But the problem persisted: META still referenced region files that didn't
>> exist on the filesystem.  One poor region server was assigned the sad task
>> of attempting to open the non-existent directory, which it slavishly
>> reattempted again and again, filling its log with FileNotFoundException
>> stack traces.
>>
>> For example,
>>
>> 2012-12-09 00:14:33,315 ERROR
>> org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler: Failed open of
>> region=referrers,com.free-hdwallpapers.www/wallpapers/animals/mici/595718.jpg|com.free-hdwallpapers.www/wallpaper/animals/husky/270579,1354964606745.0c54fe59c58ddd6b34042ec98171bff7.
>> java.io.FileNotFoundException: File does not exist:
>> /hbase/referrers/2cb553c74d52ddcbf31940f6c7128c63/main/33f1fd9efb944c4e982ba719cd7dde84
>> etc., etc.
>>
>> In particular, the directory above "/hbase/referrers/2cb553...c63" simply
>> did not exist at all in HDFS.
>>
>> So I took matters into my own hands and created the missing
>> "/hbase/referrers/2cb553...c63" directory, its subdirectory "main", and
>> attempted to create a zero-length file "33f1fd...e84".  This changed the
>> firehose of exceptions from FileNotFoundException to CorruptHFileException.
>>
>> So, I wrote a small program to emit a valid, empty HFile, and proceeded to
>> place one of these files at each HDFS path for which a
>> FileNotFoundException was being thrown.  After creating three or four of
>> them, the exceptions stopped.
>>
>> I then ran "hbck -repair" again, and upon completion it declared victory.
>>
>> Again, I suspect that I got myself into this problem because I ran a
>> machine out of disk space.  It's likely that most folks are more clever
>> than me, and so this problem hasn't arisen before. :)
>>
>>
>>
>>
>> On Dec 9, 2012, at 3:00 PM, "Kevin O'dell" <kevin.odell@cloudera.com>
>> wrote:
>>
>> > Can you run "hbase hbck -fixMeta -fixAssignments"?
>> >
>> > This should assign those regions to region servers and fix the hole.
>> >
>> > On Sat, Dec 8, 2012 at 11:30 PM, Chris Waterson <waterson@maubi.net> wrote:
>> >
>> >> Hello!  I've gotten myself into trouble where I'm missing files on HDFS
>> >> that HBase thinks ought to be there.  In particular, running "hbase hbck"
>> >> yields the below message: two regions are "not deployed on any region
>> >> server" (because there is no file in HDFS for the region), and "there is a
>> >> hole in the region chain".
>> >>
>> >> (FWIW, I suspect that this problem is due to a recent incident where we
>> >> ran the cluster out of disk space.)
>> >>
>> >> I'm running 0.92.1, and have been staggering around trying to figure out
>> >> what procedure I ought to use to correct the problem, but my Google-fu is
>> >> too poor to have yielded results.  Any pointers would be appreciated!
>> >>
>> >> thanks,
>> >> chris
>> >>
>> >>
>> >>
>> >>
>> >> ERROR: Region
>> >> referrers,com.free-hdwallpapers.www/wallpapers/animals/mici/595718.jpg|com.free-hdwallpapers.www/wallpaper/animals/husky/270579,1354964606745.0c54fe59c58ddd6b34042ec98171bff7.
>> >> not deployed on any region server.
>> >> ERROR: Region
>> >> referrers,com.free-hdwallpapers.www/wallpapers/anime/mici/78285.jpg|com.free-hdwallpapers.www/wallpaper/anime/wolf-furry/90641,1354964606745.d2451e8db0f2b9546cc42c6d260a2ab8.
>> >> not deployed on any region server.
>> >> ERROR: There is a hole in the region chain between
>> >> com.free-hdwallpapers.www/wallpapers/animals/mici/595718.jpg|com.free-hdwallpapers.www/wallpaper/animals/husky/270579
>> >> and
>> >> com.free-hdwallpapers.www/wallpapers/entertainment/mici/11840.jpg|com.free-hdwallpapers.www/wallpaper/entertainment/new-moon-bella-and-edward/12951.
>> >> You need to create a new regioninfo and region dir in hdfs to plug the
>> >> hole.
>> >>
>> >>
>> >
>> >
>> > --
>> > Kevin O'Dell
>> > Customer Operations Engineer, Cloudera
>>
>>
>
>
> --
> Kevin O'Dell
> Customer Operations Engineer, Cloudera
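
Chris's recovery procedure hinges on knowing which store files the region server is failing to open. The following is a minimal Python sketch, not Chris's utility (which was not published in this thread), that pulls the missing paths out of a region server log and splits each one into its table, region, family, and file parts. The function names `missing_hfiles` and `split_store_path` are hypothetical, and the five-part layout `/hbase/<table>/<region>/<family>/<hfile>` is an assumption inferred from the example path in the log excerpt above. The sketch does not create any HFiles; it only lists where they would need to go.

```python
import re

# Captures the path reported after "File does not exist:". The path may be on
# the same line or, as in the wrapped excerpt quoted above, on the next line.
MISSING = re.compile(r"FileNotFoundException: File does not exist:\s*(\S+)")


def missing_hfiles(log_text):
    """Return the distinct missing paths, in the order first seen."""
    seen = []
    for path in MISSING.findall(log_text):
        if path not in seen:
            seen.append(path)
    return seen


def split_store_path(path):
    """Split a store file path into parts.

    Assumes the layout /hbase/<table>/<region>/<family>/<hfile>, as in the
    example path from the thread; anything else raises ValueError.
    """
    parts = path.strip("/").split("/")
    if len(parts) != 5:
        raise ValueError("unexpected store file path: " + path)
    root, table, region, family, hfile = parts
    return {
        "table": table,
        "region": region,
        "family": family,
        "hfile": hfile,
        "region_dir": "/" + "/".join(parts[:3]),
        "family_dir": "/" + "/".join(parts[:4]),
    }
```

Running `missing_hfiles` over the log text and then `split_store_path` over each result gives the directories that would have to exist in HDFS before a valid, empty HFile could be placed at each path.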
