phoenix-dev mailing list archives

From "Josh Elser (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (PHOENIX-2940) Remove STATS RPCs from rowlock
Date Fri, 17 Jun 2016 19:19:05 GMT

    [ https://issues.apache.org/jira/browse/PHOENIX-2940?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15336758#comment-15336758 ]

Josh Elser commented on PHOENIX-2940:
-------------------------------------

{quote}
Change ConnectionQueryServices.invalidateStats(), ConnectionQueryServicesImpl.addTableStats(),
ConnectionQueryServicesImpl.getTableStats(), and TableStatsCache.put() to all be consistent
and use ImmutableBytesPtr as the arg as it's possible you'd want to get the stats without
having a PTable.
Remove TableStatsCache.put(PTable).
{quote}

Replacing with {{byte[]}} or {{ImmutableBytesPtr}}? I see {{byte[]}} primarily in use by {{ConnectionQueryServices}}.
Unless I hear otherwise from ya, I'll go the 'consistency with what's already there' route
:)

bq. Would it be possible to remove repeated PTableStats guidePosts = 12 from phoenix-protocol/src/main/PTable.proto
without affecting b/w compat?

Older client talking to newer server: The server would send a PTable from the cache without
the stats field, so the client would just think that it's missing. The old client would construct
a PTableStatsImpl with an empty list of guideposts.

Newer client talking to older server: The client would ignore the stats sent in the PTable
protobuf and query it on its own.

So, the only concern I can think of is preventing any future reuse of the tag number {{12}}
in PTable. If that were to happen in some later Phoenix release, it could break older clients.
The protobuf 2 docs actually have a section:

{quote}
Non-required fields can be removed, as long as the tag number is not used again in your updated
message type. You may want to rename the field instead, perhaps adding the prefix "OBSOLETE_",
or make the tag reserved, so that future users of your .proto can't accidentally reuse the
number. 
{quote}

I could remove it and leave a big-fat-warning to not reuse the number 12 (and we'd just need
to be aware of it for a few releases in code-reviews to prevent someone from trying to be
smart). How does that strike you?
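
For illustration, a minimal sketch of what that could look like in PTable.proto (surrounding
fields elided; the comment wording is just a suggestion, and the {{reserved}} statement is only
an option if the protoc version Phoenix builds with supports it):

{code}
message PTable {
  // ... existing fields unchanged ...

  // WARNING: tag 12 formerly carried "repeated PTableStats guidePosts = 12;".
  // Do NOT reuse tag 12 for a new field: clients from older releases will still
  // decode anything with that tag as guidepost statistics. If the protoc in use
  // supports it, the guarantee can be made explicit with:
  //   reserved 12;
}
{code}

Renaming it to something like OBSOLETE_guidePosts, per the docs' other suggestion, would claim
the tag just as effectively without relying on code review.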

> Remove STATS RPCs from rowlock
> ------------------------------
>
>                 Key: PHOENIX-2940
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-2940
>             Project: Phoenix
>          Issue Type: Improvement
>         Environment: HDP 2.3 + Apache Phoenix 4.6.0
>            Reporter: Nick Dimiduk
>            Assignee: Josh Elser
>             Fix For: 4.8.0
>
>         Attachments: PHOENIX-2940.001.patch, PHOENIX-2940.002.patch, PHOENIX-2940.003.patch,
PHOENIX-2940.004.patch
>
>
> We have an unfortunate situation wherein we potentially execute many RPCs while holding
a row lock. This problem is discussed in detail on the user list thread ["Write path blocked
by MetaDataEndpoint acquiring region lock"|http://search-hadoop.com/m/9UY0h2qRaBt6Tnaz1&subj=Write+path+blocked+by+MetaDataEndpoint+acquiring+region+lock].
During some situations, the [MetaDataEndpoint|https://github.com/apache/phoenix/blob/10909ae502095bac775d98e6d92288c5cad9b9a6/phoenix-core/src/main/java/org/apache/phoenix/coprocessor/MetaDataEndpointImpl.java#L492]
coprocessor will attempt to refresh its view of the schema definitions and statistics. This
involves [taking a rowlock|https://github.com/apache/phoenix/blob/10909ae502095bac775d98e6d92288c5cad9b9a6/phoenix-core/src/main/java/org/apache/phoenix/coprocessor/MetaDataEndpointImpl.java#L2862],
executing a scan against the [local region|https://github.com/apache/phoenix/blob/10909ae502095bac775d98e6d92288c5cad9b9a6/phoenix-core/src/main/java/org/apache/phoenix/coprocessor/MetaDataEndpointImpl.java#L542],
and then a scan against a [potentially remote|https://github.com/apache/phoenix/blob/10909ae502095bac775d98e6d92288c5cad9b9a6/phoenix-core/src/main/java/org/apache/phoenix/coprocessor/MetaDataEndpointImpl.java#L964]
statistics table.
> This issue is apparently exacerbated by the use of user-provided timestamps (in my case,
the use of the ROW_TIMESTAMP feature, or perhaps as in PHOENIX-2607). When combined with other
issues (PHOENIX-2939), we end up with total gridlock in our handler threads -- everyone queued
behind the rowlock, scanning and rescanning SYSTEM.STATS. Because this happens in the MetaDataEndpoint,
the means by which all clients refresh their knowledge of schema, gridlock in that RS can
effectively stop all forward progress on the cluster.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
