phoenix-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Hadoop QA (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (PHOENIX-3705) SkipScanFilter may repeatedly copy rowKey Columns to startKey
Date Fri, 03 Mar 2017 12:48:45 GMT

    [ https://issues.apache.org/jira/browse/PHOENIX-3705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15894274#comment-15894274
] 

Hadoop QA commented on PHOENIX-3705:
------------------------------------

{color:red}-1 overall{color}.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12855828/PHOENIX-3705_v2.patch
  against master branch at commit cf65fb27edf62666691500e3f7e7549c4b83240f.
  ATTACHMENT ID: 12855828

    {color:green}+1 @author{color}.  The patch does not contain any @author tags.

    {color:green}+1 tests included{color}.  The patch appears to include 3 new or modified
tests.

    {color:green}+1 javac{color}.  The applied patch does not increase the total number of
javac compiler warnings.

    {color:red}-1 javadoc{color}.  The javadoc tool appears to have generated 45 warning messages.

    {color:red}-1 release audit{color}.  The applied patch generated 1 release audit warnings
(more than the master's current 0 warnings).

    {color:red}-1 lineLengths{color}.  The patch introduces the following lines longer than
100:
    +                    //for PHOENIX-3705, now ptr is still point to slot i, we must make
ptr point to slot j+1,
+                    //because following setStartKey method will copy rowKey columns before
ptr to startKey and
+                    //then copy the lower bound of slots from j+1, according to position
array, so if we do not
+                    //make ptr point to the first rowKey column of slot j,why we need slotSpan[j]
because for Row Value Constructor(RVC),
+                    //slot j may span multiple rowKey columns, so the length of ptr must
consider the slotSpan[j].
+                    schema.iterator(startKey, minOffset, maxOffset, ptr, ScanUtil.getRowKeyPosition(slotSpan,
j)+1,slotSpan[j]);
+    public Boolean iterator(byte[] src, int srcOffset, int srcLength, ImmutableBytesWritable
ptr, int position,int extraColumnSpan) {
+    public Boolean iterator(byte[] src, int srcOffset, int srcLength, ImmutableBytesWritable
ptr, int position) {
+    public SkipScanFilterTest(List<List<KeyRange>> cnf, int[] widths, int[] slotSpans,List<Expectation>
expectations) {
+                        PInteger.INSTANCE.getKeyRange(PInteger.INSTANCE.toBytes(1), true,
PInteger.INSTANCE.toBytes(4), true)

     {color:red}-1 core tests{color}.  The patch failed these unit tests:
     ./phoenix-core/target/failsafe-reports/TEST-org.apache.phoenix.end2end.index.MutableIndexFailureIT

Test results: https://builds.apache.org/job/PreCommit-PHOENIX-Build/791//testReport/
Release audit warnings: https://builds.apache.org/job/PreCommit-PHOENIX-Build/791//artifact/patchprocess/patchReleaseAuditWarnings.txt
Javadoc warnings: https://builds.apache.org/job/PreCommit-PHOENIX-Build/791//artifact/patchprocess/patchJavadocWarnings.txt
Console output: https://builds.apache.org/job/PreCommit-PHOENIX-Build/791//console

This message is automatically generated.

> SkipScanFilter may repeatedly copy rowKey Columns to startKey
> -------------------------------------------------------------
>
>                 Key: PHOENIX-3705
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-3705
>             Project: Phoenix
>          Issue Type: Bug
>    Affects Versions: 4.9.0
>            Reporter: chenglei
>            Priority: Blocker
>             Fix For: 4.10.0
>
>         Attachments: PHOENIX-3705_v2.patch
>
>
> See following simple unit test first,the rowKey is composed of three PInteger columns,and
the slots of SkipScanFilter are:
> [ [[1 - 4]], [5, 7], [[9 - 10]] ],
> When SkipScanFilter.filterKeyValue method is invoked on a KeyValue whose rowKey is [2,7,11],
obviously SkipScanFilter.filterKeyValue
> returns ReturnCode.SEEK_NEXT_USING_HINT and SkipScanFilter.getNextCellHint returns  [3,5,9],
but unfortunately, SkipScanFilter.getNextCellHint actually returns  [2,8,5,9] , a very strange
value, the unit tests failed.
> {code} 
>     @Test
>     public void testNavigate() {
>         RowKeySchemaBuilder builder = new RowKeySchemaBuilder(3);
>         for(int i=0;i<3;i++) {
>             builder.addField(
>                     new PDatum() {
>                         @Override
>                         public boolean isNullable() {
>                             return false;
>                         }
>                         @Override
>                         public PDataType getDataType() {
>                             return PInteger.INSTANCE;
>                         }
>                         @Override
>                         public Integer getMaxLength() {
>                             return PInteger.INSTANCE.getMaxLength(null);
>                         }
>                         @Override
>                         public Integer getScale() {
>                             return PInteger.INSTANCE.getScale(null);
>                         }
>                         @Override
>                         public SortOrder getSortOrder() {
>                             return SortOrder.getDefault();
>                         }
>                     }, false, SortOrder.getDefault());
>         }
>         
>         List<List<KeyRange>> rowKeyColumnRangesList=Arrays.asList(      
>                 Arrays.asList(
>                     PInteger.INSTANCE.getKeyRange(PInteger.INSTANCE.toBytes(1), true,
PInteger.INSTANCE.toBytes(4), true)),
>                 Arrays.asList(
>                     KeyRange.getKeyRange(PInteger.INSTANCE.toBytes(5)),
>                     KeyRange.getKeyRange(PInteger.INSTANCE.toBytes(7))),
>                 Arrays.asList(
>                     PInteger.INSTANCE.getKeyRange(PInteger.INSTANCE.toBytes(9), true,
PInteger.INSTANCE.toBytes(10), true))
>         );
>         
>         SkipScanFilter skipScanFilter=new SkipScanFilter(rowKeyColumnRangesList, builder.build());
>         
>         System.out.println(skipScanFilter);
>         
>         byte[] rowKey=ByteUtil.concat(
>                 PInteger.INSTANCE.toBytes(2), 
>                 PInteger.INSTANCE.toBytes(7),
>                 PInteger.INSTANCE.toBytes(11));
>         KeyValue keyValue=KeyValue.createFirstOnRow(rowKey);
>         ReturnCode returnCode=skipScanFilter.filterKeyValue(keyValue);
>         assertTrue(returnCode == ReturnCode.SEEK_NEXT_USING_HINT);
>         Cell nextCellHint=skipScanFilter.getNextCellHint(keyValue);
>         assertTrue(Bytes.toStringBinary(CellUtil.cloneRow(nextCellHint)).equals(
>         "\\x80\\x00\\x00\\x03\\x80\\x00\\x00\\x05\\x80\\x00\\x00\\x09"));
>     }
> {code}
> Let us see what's wrong, first column of rowKey [2,7,11 ] is 2, which is in SkipScanFilter's
first slot range [1-4], so position[0] is 0 and we go to the second column 7, which match
the second range [7] of SkipScanFilter's second slot [5, 7],so position[1] is 1 and we go
to the third column 11, which is bigger than the third slot range [9 - 10],so position[2]
is 0 and the {{SkipScanFilter.ptr}} which points to current column still stay on the third
column. Now we begin to backtrack to second column, because the second range [7] of SkipScanFilter's
second slot is singleKey and there is no more range,so position[1] is 0 and we continue to
backtrack to first column, because the first slot range [1-4] is not singleKey so we stop
backtracking at first column.
> Now the problem comes, in following line 448 of {{SkipScanFilter.navigate}} method,{{SkipScanFilter.setStartKey}}
method is invoked,first copy rowKey columns before {{SkipScanFilter.ptr}} to {{SkipScanFilter.startKey}},
because now {{SkipScanFilter.ptr}} still point to the third column, so copy the first and
second columns to {{SkipScanFilter.startKey}},{{SkipScanFilter.startKey}} is [2,7]  after
this step , then setStartKey method copy the lower bound {{SkipScanFilter.slots}} from {{j+1}}
column, accoring to {{SkipScanFilter.position}} array,because now j is 0, and both position[1]
and position[2] are 0, so {{SkipScanFilter.startKey}} becomes [2,7,5,9], and in following
line 457, {{ByteUtil.nextKey}} is invoked on [2,7], [2,7] is incremented to [2,8], finally
{{SkipScanFilter.startKey}} is [2,8,5,9].
> {code} 
> 448                    int currentLength = setStartKey(ptr, minOffset, j+1, nSlots, false);
> 449                    // From here on, we use startKey as our buffer (resetting minOffset
and maxOffset)
> 450                    // We've copied the part of the current key above that we need
into startKey
> 451                    // Reinitialize the iterator to be positioned at previous slot
position
> 452                    minOffset = 0;
> 453                    maxOffset = startKeyLength;
> 454                    schema.iterator(startKey, minOffset, maxOffset, ptr, ScanUtil.getRowKeyPosition(slotSpan,
j)+1);
> 455                    // Do nextKey after setting the accessor b/c otherwise the null
byte may have
> 456                    // been incremented causing us not to find it
> 457                    ByteUtil.nextKey(startKey, currentLength);
> {code} 
> From above analysis, we can see the second column is repeatedly copied to {{SkipScanFilter.startKey}},
copy 7 to {{SkipScanFilter.startKey}} is error ,before {{SkipScanFilter.setStartKey}} method
is invoked in line 448, we should first move {{SkipScanFilter.ptr}} to {{j+1}}, that is to
say ,following code might be inserted before line 448:
> {code} 
>                   schema.reposition(
>                             ptr, 
>                             ScanUtil.getRowKeyPosition(slotSpan, i), 
>                             ScanUtil.getRowKeyPosition(slotSpan, j+1), 
>                             minOffset, 
>                             maxOffset, 
>                             slotSpan[j+1]);
> {code} 
>  Another problem is why the existing SkipScanFilter unit tests are ok? That is because
in all existing unit test cases, backtrack column is just before current column which ptr
points to,  i.e. i==j+1, so the unit tests are ok.
> This issue may not affect the final query result , but obviously, it may lead to the
erroneous startKey and  cause the Skip Scan inefficient.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

Mime
View raw message