hbase-user mailing list archives

From Ted Yu <yuzhih...@gmail.com>
Subject Re: How Long Will HBase Hold A Row Write Lock?
Date Thu, 01 Mar 2018 02:37:10 GMT
bq. timing out trying to obtain write locks on rows in that region.

Can you confirm that the region under contention was the one being major
compacted?

Can you pastebin a thread dump so that we can get a better idea of the
scenario?

For the region being compacted, how long did the compaction take? (I just
want to see whether this duration correlates with the timeouts.)


On Wed, Feb 28, 2018 at 6:31 PM, Saad Mufti <saad.mufti@gmail.com> wrote:

> Hi,
> We are running HBase 1.4.0 on Amazon EMR. We are currently seeing a
> situation where a particular region sometimes gets into a state in which
> many write requests to any row in that region time out, reporting that they
> failed to obtain a lock on a row in the region, and eventually hit an IPC
> timeout. This causes the IPC queue to grow as requests back up, and that
> region server sees a much higher than normal timeout rate for all requests,
> not just those that fail to obtain the row lock.
> The strange thing is that the rows are always different but the region is
> always the same. So the question is: is there a region-level component to how
> long a row write lock is held? I looked at the debug dump, and the RowLocks
> section shows a long list of held write row locks, all from the same region
> but for different rows.
> Will an attempt to obtain a write row lock be delayed if no one else
> holds a lock on the same row but the region itself is experiencing read
> delays? We do have an incremental compaction tool running that major
> compacts one region per region server at a time, which evicts pages
> from the bucket cache. For most regions the impact is transient, lasting
> only until the bucket cache is repopulated with pages from the new
> HFile. For this one region, however, we start timing out trying to obtain
> write locks on rows in that region.
> Any insight anyone can provide would be most welcome.
> Cheers.
> ----
> Saad
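
To make the question above concrete, here is a minimal Java sketch of per-row write locks with an acquisition timeout. It is a simplified model for reasoning about the symptom, not HBase's actual row-lock implementation; the class name RowLockTable and the method tryWrite are hypothetical. In this model a writer only waits on the row it targets, so a slow operation holding one row does not delay a writer to a different row.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

// Simplified model of per-row write locks. This is an illustration only and
// is NOT how HBase implements row locks internally.
final class RowLockTable {
    // One lock per row key, created lazily on first use.
    private final ConcurrentHashMap<String, ReentrantLock> locks = new ConcurrentHashMap<>();

    /**
     * Runs {@code work} while holding the lock for {@code row}.
     * Returns false if the lock could not be obtained within {@code timeoutMs}.
     */
    boolean tryWrite(String row, long timeoutMs, Runnable work) throws InterruptedException {
        ReentrantLock lock = locks.computeIfAbsent(row, k -> new ReentrantLock());
        if (!lock.tryLock(timeoutMs, TimeUnit.MILLISECONDS)) {
            return false; // timed out waiting for the current holder of this row
        }
        try {
            work.run(); // anything slow here extends how long the row stays locked
            return true;
        } finally {
            lock.unlock();
        }
    }
}
```

If one thread calls tryWrite("row-a", ...) with slow work, a second call to tryWrite("row-a", 50, ...) returns false after about 50 ms, while tryWrite("row-b", 50, ...) succeeds immediately. Under this model, timeouts spread across many different rows would need many slow lock holders at once, which is why a thread dump showing what the holders are doing is useful.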
