hbase-user mailing list archives

From Ted Yu <yuzhih...@gmail.com>
Subject Re: PENDING_CLOSE for too long
Date Sat, 29 Oct 2011 23:19:08 GMT
In 0.92 (to be released in 2 weeks), you can expect improvement in this
regard.
See HBASE-3368.

Geoff:
Can you publish your tool on the HBase JIRA?

Thanks

On Sat, Oct 29, 2011 at 2:35 PM, Geoff Hendrey <ghendrey@decarta.com> wrote:

> Sure. I posted the code many weeks back for a tool that will repair holes
> in .META.
>
> If you do a check on the list, you should find it. I'll send you the
> latest code for that. Maybe I made some fixes after I posted the code.
> Please ping me if I forget. I've used it to repair huge tables  (and fixed
> subtle bugs in the process) so I'm confident it works.
>
> No matter what anyone tells me, I know hbase is horribly broken for the
> use case of doing bulk writes from an mr job. It shits the bed every time
> you pass a certain scale. For this reason we've completely rewritten our
> code so that we use bulkloading. It's way more efficient and always works.
>
> Please ping me until I send you the code. Otherwise I will forget.
>
> Sent from my iPhone
>
> On Oct 29, 2011, at 1:39 PM, "Stuart Smith" <stu24mail@yahoo.com> wrote:
>
> > Hello Geoff,
> >
> >   I usually don't show up here, since I use CDH, and good form means I
> should stay on CDH-users,
> > But!
> >   I've been seeing the same issues for months:
> >
> >  - PENDING_CLOSE too long, master tries to reassign - I see a
> continuous stream of these.
> >  - WrongRegionExceptions due to overlapping regions & holes in the
> regions.
> >
> > I just spent all day yesterday cribbing off of St.Ack's check_meta.rb
> script to write a java program to fix up overlaps & holes in an offline
> fashion (hbase down, directly on hdfs), and will start testing next week
> (cross my fingers!).
> >
> > It seems like the pending close messages can be ignored?
> > And once I test my tool, and confirm I know a little bit about what I'm
> doing, maybe we could share notes?
> >
> > Take care,
> >   -stu
> >
> >
> >
> > ________________________________
> > From: Geoff Hendrey <ghendrey@decarta.com>
> > To: user@hbase.apache.org
> > Cc: hbase-user@hadoop.apache.org
> > Sent: Saturday, September 3, 2011 12:11 AM
> > Subject: RE: PENDING_CLOSE for too long
> >
> > "Are you having trouble getting to any of your data out in tables?"
> >
> > depends what you mean. We see corruptions from time to time that prevent
> > us from getting data, one way or another. Today's corruption was regions
> > with duplicate start and end rows. We fixed that by deleting the
> > offending regions from HDFS, and running add_table.rb to restore the
> > meta. The other common corruption is the holes in ".META." that we
> > repair with a little tool we wrote. We'd love to learn why we see these
> > corruptions with such regularity (seemingly much higher than others on
> > the list).
> >
> > We will implement the timeout you suggest, and see how it goes.
> >
> > Thanks,
> > Geoff
> >
> > -----Original Message-----
> > From: saint.ack@gmail.com [mailto:saint.ack@gmail.com] On Behalf Of
> > Stack
> > Sent: Friday, September 02, 2011 10:51 PM
> > To: user@hbase.apache.org
> > Cc: hbase-user@hadoop.apache.org
> > Subject: Re: PENDING_CLOSE for too long
> >
> > Are you having trouble getting to any of your data out in tables?
> >
> > To get rid of them, try restarting your master.
> >
> > Before you restart your master, do "HBASE-4126  Make timeoutmonitor
> > timeout after 30 minutes instead of 3"; i.e. set
> > "hbase.master.assignment.timeoutmonitor.timeout" to 1800000 in
> > hbase-site.xml.
> >
> > St.Ack
> >
> > On Fri, Sep 2, 2011 at 1:40 PM, Geoff Hendrey <ghendrey@decarta.com>
> > wrote:
> > > In the master logs, I am seeing "regions in transition timed out" and
> > > "region has been PENDING_CLOSE for too long, running forced unassign".
> > > Both of these log messages occur at INFO level, so I assume they are
> > > innocuous. Should I be concerned?
> > >
> > >
> > >
> > > -geoff
> > >
> > >
>
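
For reference, the timeout setting Stack describes in the quoted message (HBASE-4126) might look like this in hbase-site.xml. The property name and value are taken from his message; placing the entry inside the file's existing <configuration> element is an assumption based on the usual layout of that file. The value is in milliseconds, so 1800000 is the 30 minutes he mentions.

```xml
<!-- Goes inside the existing <configuration> element of hbase-site.xml. -->
<property>
  <name>hbase.master.assignment.timeoutmonitor.timeout</name>
  <!-- 30 minutes, expressed in milliseconds -->
  <value>1800000</value>
</property>
```

Per the quoted advice, set this before restarting the master.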

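The holes and overlaps discussed in this thread can be illustrated with a small standalone check. This is only a sketch of the idea, not Geoff's or Stuart's tool and not check_meta.rb: it assumes you have already extracted each region's (start key, end key) pair for one table into a list, with an empty key meaning "start of table" or "end of table". The function name and the data layout are hypothetical, and nothing here talks to HBase or HDFS.

```python
from typing import List, Tuple

# (start_key, end_key); b"" means open-ended (table start or table end).
Region = Tuple[bytes, bytes]


def find_holes_and_overlaps(regions: List[Region]):
    """Return (holes, overlaps) for one table's list of region boundaries.

    holes    -> list of (missing_start, missing_end) key ranges
    overlaps -> list of (region_a, region_b) pairs whose ranges intersect
    """
    holes, overlaps = [], []
    # Python compares bytes lexicographically by unsigned byte value.
    ordered = sorted(regions, key=lambda r: r[0])
    if not ordered:
        return holes, overlaps

    # The first region should start at the empty key.
    if ordered[0][0] != b"":
        holes.append((b"", ordered[0][0]))

    prev = ordered[0]
    for cur in ordered[1:]:
        prev_end, cur_start = prev[1], cur[0]
        if prev_end == b"" or prev_end > cur_start:
            # Previous region extends past the start of this one.
            overlaps.append((prev, cur))
        elif prev_end < cur_start:
            # Nothing covers the keys between prev_end and cur_start.
            holes.append((prev_end, cur_start))
        # Track whichever region reaches furthest to the right.
        if prev[1] != b"" and (cur[1] == b"" or cur[1] > prev[1]):
            prev = cur

    # The last region should end at the empty key.
    if prev[1] != b"":
        holes.append((prev[1], b""))
    return holes, overlaps


if __name__ == "__main__":
    sample = [(b"", b"b"), (b"b", b"d"), (b"f", b"")]
    print(find_holes_and_overlaps(sample))
```

Regions with duplicate start and end rows, as in the corruption Geoff describes, show up as overlaps in this check. Repairing them is a separate and riskier step that this sketch does not attempt.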