phoenix-dev mailing list archives

From "Hadoop QA (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (PHOENIX-3073) Auto-detect when SMALL hint should be applied
Date Thu, 16 Mar 2017 20:04:41 GMT

    [ https://issues.apache.org/jira/browse/PHOENIX-3073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15928762#comment-15928762 ]

Hadoop QA commented on PHOENIX-3073:
------------------------------------

{color:red}-1 overall{color}.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12817948/3073-try.txt
  against master branch at commit 8093d10f1a481101d6c93fdf0744ff15ec48f4aa.
  ATTACHMENT ID: 12817948

    {color:green}+1 @author{color}.  The patch does not contain any @author tags.

    {color:red}-1 tests included{color}.  The patch doesn't appear to include any new or modified tests.
                        Please justify why no new tests are needed for this patch.
                        Also please list what manual steps were performed to verify this patch.

    {color:red}-1 patch{color}.  The patch command could not apply the patch.

Console output: https://builds.apache.org/job/PreCommit-PHOENIX-Build/806//console

This message is automatically generated.

> Auto-detect when SMALL hint should be applied
> ---------------------------------------------
>
>                 Key: PHOENIX-3073
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-3073
>             Project: Phoenix
>          Issue Type: Improvement
>            Reporter: Junegunn Choi
>            Assignee: Junegunn Choi
>         Attachments: 3073-try.txt, PHOENIX-3073.patch
>
>
> While comparing the Phoenix JDBC client to the native HBase Java client, I noticed that the
> Phoenix client uses significantly more CPU time on the client machine. Profiling revealed that
> the majority of the time was spent in {{BaseResultIterators.getParallelScans()}}. This was
> surprising to me, as I was only testing with simple point lookup queries.
> Here's how I tested:
> - {{SELECT /*+ SMALL SERIAL */ ID, DOCID FROM IMAGE WHERE ID = ?}}
>     - {{IMAGE}} is a salted table with 100 salt buckets
>     - {{ID}}, the primary key, was randomly selected in a small range so that the requests are served without disk I/O
> - 20K/sec concurrent requests using 128 threads
> {{getParallelScans()}} is quite expensive because it iterates over all regions of the table,
> of which there can be many, only to return a single Scan object for this query. Since such
> single-key point lookups are among the most frequent types of request in a typical OLTP
> application, I believe it makes sense to have a fast path for them. With the patch, the average
> CPU usage of the client during the workload dropped from 56.7% to 18.8%.
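
To illustrate the kind of fast path the description argues for, here is a minimal Java sketch. It is not Phoenix code and not the attached patch: the class PointLookupSketch and both of its methods are hypothetical names invented for illustration. It assumes the region start keys are already available as a sorted list whose first entry is the empty key, and it contrasts walking every region boundary with a binary search that finds the single region owning one row key.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; none of these names exist in Phoenix.
final class PointLookupSketch {

    // General path: visits every region start key, so cost grows with the
    // number of regions even though a point lookup needs exactly one region.
    static int regionByFullWalk(List<byte[]> sortedStartKeys, byte[] rowKey) {
        int match = 0;
        for (int i = 0; i < sortedStartKeys.size(); i++) {
            if (Arrays.compareUnsigned(sortedStartKeys.get(i), rowKey) <= 0) {
                match = i; // last region whose start key is <= rowKey
            }
        }
        return match;
    }

    // Fast path: binary search for the last start key <= rowKey.
    // Assumes the list is sorted and its first entry is the empty key.
    static int regionByBinarySearch(List<byte[]> sortedStartKeys, byte[] rowKey) {
        int lo = 0;
        int hi = sortedStartKeys.size() - 1;
        while (lo < hi) {
            int mid = (lo + hi + 1) >>> 1; // round up so the loop always shrinks
            if (Arrays.compareUnsigned(sortedStartKeys.get(mid), rowKey) <= 0) {
                lo = mid;
            } else {
                hi = mid - 1;
            }
        }
        return lo;
    }
}
```

Both methods return the same region index; the second inspects only a logarithmic number of boundaries. Whether the actual patch takes this approach is not stated in this message.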



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
