hive-issues mailing list archives

From "Hive QA (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HIVE-21327) Predicate is not pushed to Parquet if hive.parquet.timestamp.skip.conversion=true
Date Fri, 01 Mar 2019 02:28:00 GMT

    [ https://issues.apache.org/jira/browse/HIVE-21327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16781215#comment-16781215 ]

Hive QA commented on HIVE-21327:
--------------------------------

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  9m 39s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 20s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 44s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  4m 43s{color} | {color:blue} ql in master has 2251 extant Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 12s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 47s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 22s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m 22s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 44s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m 54s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 13s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 16s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 28m 30s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus_PreCommit-HIVE-Build-16301/dev-support/hive-personality.sh |
| git revision | master / 6831b08 |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| modules | C: ql U: ql |
| Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-16301/yetus.txt |
| Powered by | Apache Yetus    http://yetus.apache.org |


This message was automatically generated.



> Predicate is not pushed to Parquet if hive.parquet.timestamp.skip.conversion=true
> ---------------------------------------------------------------------------------
>
>                 Key: HIVE-21327
>                 URL: https://issues.apache.org/jira/browse/HIVE-21327
>             Project: Hive
>          Issue Type: Bug
>    Affects Versions: 4.0.0
>            Reporter: Marta Kuczora
>            Assignee: Marta Kuczora
>            Priority: Major
>         Attachments: HIVE-21327.1.patch
>
>
> The Parquet FilterPredicate is created and set in the configuration by the ParquetRecordReaderBase.setFilter
> method. This method is called from the ParquetRecordReaderWrapper constructor through the ParquetRecordReaderBase.getSplit
> method and expects a JobConf parameter, on which it sets the created filter predicate. The
> ParquetRecordReaderWrapper constructor uses multiple JobConf objects:
> {noformat}
>     jobConf = oldJobConf;
>     final ParquetInputSplit split = getSplit(oldSplit, jobConf);
>     TaskAttemptID taskAttemptID = TaskAttemptID.forName(jobConf.get(IOConstants.MAPRED_TASK_ID));
>     if (taskAttemptID == null) {
>       taskAttemptID = new TaskAttemptID();
>     }
>     // create a TaskInputOutputContext
>     Configuration conf = jobConf;
>     if (skipTimestampConversion ^ HiveConf.getBoolVar(
>         conf, HiveConf.ConfVars.HIVE_PARQUET_TIMESTAMP_SKIP_CONVERSION)) {
>       conf = new JobConf(oldJobConf);
>       HiveConf.setBoolVar(conf,
>         HiveConf.ConfVars.HIVE_PARQUET_TIMESTAMP_SKIP_CONVERSION, skipTimestampConversion);
>     }
>     final TaskAttemptContext taskContext = ContextUtil.newTaskAttemptContext(conf, taskAttemptID);
> {noformat}
> So there are three objects: jobConf, oldJobConf and conf. getSplit is called with the
> jobConf object, so the filter predicate is set on that config object. Reading this code
> alone, jobConf and oldJobConf appear to be the same reference inside the if statement,
> so the newly created conf should also contain the filter predicate. However, inside the getSplit
> method the value of jobConf is changed by the projectionPusher.pushProjectionsAndFilters
> method, so inside the if statement jobConf and oldJobConf are actually different
> references. The filter predicate is set on jobConf, but when the if condition is true,
> conf is created from oldJobConf, so it does not contain the filter predicate.
> Just for reference, this behavior was introduced in [HIVE-9873|https://issues.apache.org/jira/browse/HIVE-9873].
> Since the goal of the if statement is only to update the HIVE_PARQUET_TIMESTAMP_SKIP_CONVERSION
> property in the configuration, it should use the jobConf, where the filter predicate is
> correctly set.
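
The reference mix-up described above can be reproduced without Hadoop or Hive on the classpath. The following is a minimal sketch, not the actual Hive code or the attached patch: `Conf` is a hypothetical stand-in for JobConf (a string map with a copy constructor), `Reader` stands in for ParquetRecordReaderWrapper, `FILTER_KEY` is an invented property name, and the `copyFromJobConf` flag is added only to compare the two behaviors side by side. It assumes, as the description states, that getSplit replaces the jobConf field with a different object before setting the filter.

```java
import java.util.HashMap;
import java.util.Map;

public class ConfAliasingSketch {
    // Hypothetical key standing in for wherever the filter predicate is stored.
    static final String FILTER_KEY = "sketch.filter.predicate";
    static final String SKIP_KEY = "hive.parquet.timestamp.skip.conversion";

    /** Hypothetical stand-in for JobConf: a string map with a copy constructor. */
    static class Conf {
        final Map<String, String> props = new HashMap<>();
        Conf() {}
        Conf(Conf other) { props.putAll(other.props); }
    }

    /** Hypothetical stand-in for the record reader wrapper. */
    static class Reader {
        Conf jobConf;

        // Assumed behavior: the field is replaced by a new object
        // (as pushProjectionsAndFilters is described to do) and the
        // filter is set on that new object only.
        void getSplit(Conf conf) {
            jobConf = new Conf(conf);
            jobConf.props.put(FILTER_KEY, "some predicate");
        }

        Conf buildTaskConf(Conf oldJobConf, boolean skipConversion, boolean copyFromJobConf) {
            jobConf = oldJobConf;
            getSplit(jobConf); // jobConf no longer refers to oldJobConf after this call
            Conf conf = jobConf;
            boolean current = Boolean.parseBoolean(conf.props.getOrDefault(SKIP_KEY, "false"));
            if (skipConversion ^ current) {
                // Copying oldJobConf drops the filter; copying jobConf keeps it.
                conf = new Conf(copyFromJobConf ? jobConf : oldJobConf);
                conf.props.put(SKIP_KEY, String.valueOf(skipConversion));
            }
            return conf;
        }
    }

    static boolean hasFilter(Conf c) {
        return c.props.containsKey(FILTER_KEY);
    }

    public static void main(String[] args) {
        // Copy made from oldJobConf: the filter is lost.
        System.out.println(hasFilter(new Reader().buildTaskConf(new Conf(), true, false))); // false
        // Copy made from jobConf: the filter survives.
        System.out.println(hasFilter(new Reader().buildTaskConf(new Conf(), true, true)));  // true
    }
}
```

In this sketch the only difference between the two calls is which object the copy is made from, which mirrors the change proposed in the description.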



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
