phoenix-dev mailing list archives

From "Andrew Purtell (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (PHOENIX-1146) Detect stale client region cache on server and retry scans in split regions
Date Tue, 05 Aug 2014 15:30:12 GMT

    [ https://issues.apache.org/jira/browse/PHOENIX-1146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14086380#comment-14086380 ]

Andrew Purtell commented on PHOENIX-1146:
-----------------------------------------

bq. I do think that'd be an improvement, as it seems that the client cannot always recover
correctly based on the conversation in HBASE-11667. 

Yes, if the coprocessor provides a synthetic key to the client instead of a real row key at
the current scan location then the client won't recover at this time. Still discussing what,
if anything, can be done there. Also, we could consider an enhancement JIRA that implements
a convention for coprocessors to hint desired next action to a client scanner. 

> Detect stale client region cache on server and retry scans in split regions
> ---------------------------------------------------------------------------
>
>                 Key: PHOENIX-1146
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-1146
>             Project: Phoenix
>          Issue Type: Bug
>    Affects Versions: 5.0.0, 3.1, 4.1
>            Reporter: James Taylor
>            Assignee: James Taylor
>
> HBase cannot recover correctly from an aggregate scan run on the coprocessor side (see
> HBASE-11667). This can lead to incorrect query results the first time a query is run after
> a split occurs (due to the region boundary cache being stale). Phoenix can work around this
> by:
> - detecting on server before the scan starts that the region cache used by the client
>   is out-of-date. This can be done up-front because the start/stop row of the scan should never
>   span across a region boundary. In this case, a DoNotRetryIOException is thrown with some embedded
>   information to cause a StaleRegionBoundaryCacheException to be thrown on the client.
> - catching this exception on the client (in ParallelIterators), refreshing the region
>   boundary cache, and re-running the necessary scans based on the new region boundaries.
> - detecting if this happens more than N times to prevent any kind of excessive looping
>   due to splits occurring over and over again.
> Phoenix has additional requirements above and beyond standard HBase clients, so even
> if HBase could recover from this situation, Phoenix would likely need this workaround to ensure
> that a scan does not span across region boundaries. This is required when the client is doing
> a merge sort on the results of the parallel scans, mainly in ORDER BY (including topN) and
> local indexing, and potentially GROUP BY if we move toward sorting the distinct groups on
> the server side.
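
The workaround described in the issue above can be sketched roughly as follows. This is an illustration only, not Phoenix's or HBase's actual code: the class and method names (StaleRegionSketch, withinRegion, runWithRetry) and the local StaleRegionBoundaryException are hypothetical, Strings stand in for byte[] row keys, and an empty string is assumed to mean an unbounded start or end key.

```java
import java.util.List;
import java.util.function.Function;
import java.util.function.Supplier;

// Hypothetical stand-in for the client-side exception named in the issue;
// this is not Phoenix's real class.
class StaleRegionBoundaryException extends RuntimeException {
    StaleRegionBoundaryException(String message) { super(message); }
}

class StaleRegionSketch {
    // Server-side check: true if the scan range [scanStart, scanStop) lies
    // inside the region [regionStart, regionEnd). Assumption: Strings stand in
    // for row keys and "" means "no bound" (first/last region, or open scan).
    static boolean withinRegion(String scanStart, String scanStop,
                                String regionStart, String regionEnd) {
        boolean startOk = scanStart.compareTo(regionStart) >= 0;
        boolean stopOk = regionEnd.isEmpty()
                || (!scanStop.isEmpty() && scanStop.compareTo(regionEnd) <= 0);
        return startOk && stopOk;
    }

    // Client-side loop: run the scans with the cached boundaries; on a stale
    // cache, refresh the boundaries and re-run, giving up after maxRetries
    // refreshes so that repeated splits cannot cause endless looping.
    static <T> T runWithRetry(List<String> cachedBoundaries,
                              Supplier<List<String>> refreshBoundaries,
                              Function<List<String>, T> runScans,
                              int maxRetries) {
        List<String> boundaries = cachedBoundaries;
        for (int attempt = 0; ; attempt++) {
            try {
                return runScans.apply(boundaries);
            } catch (StaleRegionBoundaryException e) {
                if (attempt >= maxRetries) {
                    throw e; // more than N stale attempts: stop retrying
                }
                boundaries = refreshBoundaries.get();
            }
        }
    }
}
```

In this sketch the server would throw when withinRegion returns false, and the caller supplies runScans so that the same retry loop can wrap whatever parallel scans the query needs. Real row keys compare as unsigned bytes, which String.compareTo only approximates.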



--
This message was sent by Atlassian JIRA
(v6.2#6252)
