hbase-user mailing list archives

From Ted Yu <yuzhih...@gmail.com>
Subject Re: what can cause RegionTooBusyException?
Date Tue, 11 Nov 2014 17:13:50 GMT
For your first question: the region server web UI, rs-status#regionRequestStats,
shows the Write Request Count.

You can monitor the value for the underlying region to see if it receives
above-normal writes.

Cheers
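
The Write Request Count shown on rs-status is also exposed by the RegionServer's
/jmx servlet (port 60030 by default on 0.98). Below is a minimal,
dependency-free sketch of pulling that metric out of the /jmx JSON; the sample
payload and the region/metric key in it are illustrative, not captured from a
live cluster:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class WriteRequestCount {
    // Extract the first writeRequestCount value from the /jmx JSON payload.
    // A naive regex scan keeps the sketch dependency-free; a real client
    // would use a JSON parser.
    static long extractWriteRequestCount(String jmxJson) {
        Matcher m = Pattern.compile("writeRequestCount\"\\s*:\\s*(\\d+)")
                .matcher(jmxJson);
        if (!m.find()) {
            throw new IllegalStateException("writeRequestCount not found");
        }
        return Long.parseLong(m.group(1));
    }

    public static void main(String[] args) {
        // Stand-in for the JSON a RegionServer would serve from
        // http://<regionserver>:60030/jmx; in practice fetch it with curl
        // or java.net.HttpURLConnection and poll it over time.
        String sample = "{\"beans\":[{\"Namespace_default_table_t1_region_abc"
                + "_metric_writeRequestCount\": 48210}]}";
        System.out.println(extractWriteRequestCount(sample));
    }
}
```

Polling that value before and during the job would show whether the region
hosting the failing row is absorbing an unusual share of the writes.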

On Mon, Nov 10, 2014 at 4:06 PM, Brian Jeltema <bdjeltema@gmail.com> wrote:

> > Was the region containing this row hot around the time of failure?
>
> How do I measure that?
>
> >
> > Can you check the region server log (along with a monitoring tool) to see
> > what the memstore pressure was?
>
> I didn't see anything in the region server logs to indicate a problem. And
> given the reproducibility of the behavior, it's hard to see how dynamic
> parameters such as memory pressure could be at the root of the problem.
>
> Brian
>
> On Nov 10, 2014, at 3:22 PM, Ted Yu <yuzhihong@gmail.com> wrote:
>
> > Was the region containing this row hot around the time of failure?
> >
> > Can you check the region server log (along with a monitoring tool) to see
> > what the memstore pressure was?
> >
> > Thanks
> >
> > On Nov 10, 2014, at 11:34 AM, Brian Jeltema <brian.jeltema@digitalenvoy.net> wrote:
> >
> >>> How many tasks may write to this row concurrently ?
> >>
> >> only 1 mapper should be writing to this row. Is there a way to check
> >> which locks are being held?
> >>
> >>> Which 0.98 release are you using ?
> >>
> >> 0.98.0.2.1.2.1-471-hadoop2
> >>
> >> Thanks
> >> Brian
> >>
> >> On Nov 10, 2014, at 2:21 PM, Ted Yu <yuzhihong@gmail.com> wrote:
> >>
> >>> There could be more than one reason why RegionTooBusyException is
> >>> thrown. Below are two (from HRegion):
> >>>
> >>> /*
> >>>  * We throw RegionTooBusyException if above memstore limit
> >>>  * and expect client to retry using some kind of backoff
> >>>  */
> >>> private void checkResources()
> >>>
> >>> /*
> >>>  * Try to acquire a lock.  Throw RegionTooBusyException
> >>>  * if failed to get the lock in time. Throw InterruptedIOException
> >>>  * if interrupted while waiting for the lock.
> >>>  */
> >>> private void lock(final Lock lock, final int multiplier)
> >>>
> >>> How many tasks may write to this row concurrently ?
> >>>
> >>> Which 0.98 release are you using ?
> >>>
> >>> Cheers
> >>>
> >>> On Mon, Nov 10, 2014 at 11:10 AM, Brian Jeltema <brian.jeltema@digitalenvoy.net> wrote:
> >>>
> >>>> I’m running a map/reduce job against a table that is performing a
> >>>> large number of writes (probably updating every row). The job is
> >>>> failing with the exception below. This is a solid failure; it dies at
> >>>> the same point in the application, and at the same row in the table.
> >>>> So I doubt it’s a conflict with compaction (and the UI shows no
> >>>> compaction in progress), or that there is a load-related cause.
> >>>>
> >>>> ‘hbase hbck’ does not report any inconsistencies. The
> >>>> ‘waitForAllPreviousOpsAndReset’ leads me to suspect that there is an
> >>>> operation in progress that is hung and blocking the update. I don’t
> >>>> see anything suspicious in the HBase logs. The data at the point of
> >>>> failure is not unusual, and is identical to many preceding rows.
> >>>> Does anybody have any ideas of what I should look for to find the
> >>>> cause of this RegionTooBusyException?
> >>>>
> >>>> This is Hadoop 2.4 and HBase 0.98.
> >>>>
> >>>> 14/11/10 13:46:13 INFO mapreduce.Job: Task Id :
> >>>> attempt_1415210751318_0010_m_000314_1, Status : FAILED
> >>>> Error: org.apache.hadoop.hbase.client.RetriesExhaustedWithDetailsException:
> >>>> Failed 1744 actions: RegionTooBusyException: 1744 times,
> >>>>     at org.apache.hadoop.hbase.client.AsyncProcess$BatchErrors.makeException(AsyncProcess.java:207)
> >>>>     at org.apache.hadoop.hbase.client.AsyncProcess$BatchErrors.access$1700(AsyncProcess.java:187)
> >>>>     at org.apache.hadoop.hbase.client.AsyncProcess.waitForAllPreviousOpsAndReset(AsyncProcess.java:1568)
> >>>>     at org.apache.hadoop.hbase.client.HTable.backgroundFlushCommits(HTable.java:1023)
> >>>>     at org.apache.hadoop.hbase.client.HTable.doPut(HTable.java:995)
> >>>>     at org.apache.hadoop.hbase.client.HTable.put(HTable.java:953)
> >>>>
> >>>> Brian
> >>
> >
>
>
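
As the checkResources() comment quoted above notes, the client is expected to
retry with backoff when the region reports RegionTooBusyException (in the 0.98
client the retry schedule is derived from hbase.client.pause and
hbase.client.retries.number). A dependency-free sketch of that
exponential-backoff idea follows; the helper, constants, and failure simulation
are illustrative, not HBase's actual retry logic:

```java
import java.util.concurrent.Callable;

public class BackoffRetry {
    // Retry an operation with exponential backoff, roughly the behavior
    // expected of an HBase client when a region is too busy.
    // maxAttempts and basePauseMs are illustrative, not HBase's defaults.
    static <T> T withBackoff(Callable<T> op, int maxAttempts, long basePauseMs)
            throws Exception {
        Exception last = null;
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            try {
                return op.call();
            } catch (Exception e) {
                last = e;
                // Double the pause on each attempt: base, 2*base, 4*base, ...
                Thread.sleep(basePauseMs << attempt);
            }
        }
        throw last; // analogous to RetriesExhaustedWithDetailsException
    }

    public static void main(String[] args) throws Exception {
        final int[] calls = {0};
        // Simulated region: fails twice, then succeeds, to exercise retries.
        String result = withBackoff(() -> {
            if (++calls[0] < 3) throw new RuntimeException("region too busy");
            return "ok after " + calls[0] + " attempts";
        }, 5, 10L);
        System.out.println(result);
    }
}
```

In Brian's case the retries were exhausted 1744 times for the same actions,
which is why the thread points at a stuck per-region condition (memstore limit
or an unreleased row/region lock) rather than transient load.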
