drill-dev mailing list archives

From GitBox <...@apache.org>
Subject [GitHub] [drill] vvysotskyi commented on pull request #2092: DRILL-7763: Add Limit Pushdown to File Based Storage Plugins
Date Mon, 20 Jul 2020 08:33:35 GMT

vvysotskyi commented on pull request #2092:
URL: https://github.com/apache/drill/pull/2092#issuecomment-660886165


   @cgivre, similar functionality already exists for Parquet: https://github.com/apache/drill/blob/master/exec/java-exec/src/main/java/org/apache/drill/exec/store/parquet/columnreaders/ParquetRecordReader.java#L53.
   Please take a look at the [`AbstractGroupScanWithMetadata.applyLimit()`](https://github.com/apache/drill/blob/master/exec/java-exec/src/main/java/org/apache/drill/exec/physical/base/AbstractGroupScanWithMetadata.java#L453)
method: it contains the logic required for pruning files. Your changes override it, which
breaks this functionality. To coexist with this feature, please look at the implementation
of this method for the Parquet group scan, move the common logic to `AbstractGroupScanWithMetadata`,
and use it in the easy group scan.
   
   The behavior with Metastore usage is as follows: if, for example, we have 10 files
with 100 records and a query with limit 5 is submitted, only a single file would be left
in the group scan, since it contains all the required records.
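
   A minimal sketch of that pruning idea, to make the example concrete. It is not Drill's actual `applyLimit()` code: `LimitPruningSketch`, `FileInfo`, and `pruneByLimit` are hypothetical names, and it assumes per-file row counts are already known (for example, from the Metastore) and reads the example as 100 records per file.

```java
import java.util.ArrayList;
import java.util.List;

public class LimitPruningSketch {

    /** Hypothetical stand-in for per-file metadata; not a Drill class. */
    static final class FileInfo {
        final String path;
        final long rowCount;

        FileInfo(String path, long rowCount) {
            this.path = path;
            this.rowCount = rowCount;
        }
    }

    /**
     * Keeps files in order until their combined row count covers the limit.
     * At least one file is kept when the input is non-empty, even if the
     * limit is zero, so the scan still has something to read a schema from
     * (an assumption of this sketch).
     */
    static List<FileInfo> pruneByLimit(List<FileInfo> files, long limit) {
        List<FileInfo> kept = new ArrayList<>();
        long rows = 0;
        for (FileInfo file : files) {
            kept.add(file);
            rows += file.rowCount;
            if (rows >= limit) {
                break; // remaining files cannot contribute to the result
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // 10 files with 100 records each, as assumed for the example above.
        List<FileInfo> files = new ArrayList<>();
        for (int i = 0; i < 10; i++) {
            files.add(new FileInfo("file" + i + ".csv", 100));
        }
        List<FileInfo> kept = pruneByLimit(files, 5);
        System.out.println(kept.size() + " file(s) left in the group scan");
    }
}
```

   With limit 5, the first file alone covers the limit, so the other nine are dropped before any of them is read. With limit 250, three files would be kept.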


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


