hive-issues mailing list archives

From "Hive QA (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HIVE-21509) LLAP may cache corrupted column vectors and return wrong query result
Date Mon, 01 Apr 2019 10:09:03 GMT

    [ https://issues.apache.org/jira/browse/HIVE-21509?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16806601#comment-16806601 ]

Hive QA commented on HIVE-21509:
--------------------------------

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  1m 57s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m 59s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 38s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 26s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m 33s{color} | {color:blue} storage-api in master has 48 extant Findbugs warnings. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m 49s{color} | {color:blue} llap-server in master has 81 extant Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 28s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 30s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:red}-1{color} | {color:red} mvninstall {color} | {color:red}  0m 23s{color} | {color:red} llap-server in the patch failed. {color} |
| {color:red}-1{color} | {color:red} compile {color} | {color:red}  0m 23s{color} | {color:red} llap-server in the patch failed. {color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red}  0m 23s{color} | {color:red} llap-server in the patch failed. {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 15s{color} | {color:red} llap-server: The patch generated 4 new + 29 unchanged - 1 fixed = 33 total (was 30) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  0m 22s{color} | {color:red} llap-server in the patch failed. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 28s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 15s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 17m 15s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus_PreCommit-HIVE-Build-16793/dev-support/hive-personality.sh |
| git revision | master / 7bbd93f |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.1 |
| mvninstall | http://104.198.109.242/logs//PreCommit-HIVE-Build-16793/yetus/patch-mvninstall-llap-server.txt |
| compile | http://104.198.109.242/logs//PreCommit-HIVE-Build-16793/yetus/patch-compile-llap-server.txt |
| javac | http://104.198.109.242/logs//PreCommit-HIVE-Build-16793/yetus/patch-compile-llap-server.txt |
| checkstyle | http://104.198.109.242/logs//PreCommit-HIVE-Build-16793/yetus/diff-checkstyle-llap-server.txt |
| findbugs | http://104.198.109.242/logs//PreCommit-HIVE-Build-16793/yetus/patch-findbugs-llap-server.txt |
| modules | C: storage-api llap-server U: . |
| Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-16793/yetus.txt |
| Powered by | Apache Yetus    http://yetus.apache.org |


This message was automatically generated.



> LLAP may cache corrupted column vectors and return wrong query result
> ---------------------------------------------------------------------
>
>                 Key: HIVE-21509
>                 URL: https://issues.apache.org/jira/browse/HIVE-21509
>             Project: Hive
>          Issue Type: Bug
>          Components: llap
>            Reporter: Adam Szita
>            Assignee: Adam Szita
>            Priority: Major
>         Attachments: HIVE-21509.0.wip.patch, HIVE-21509.1.wip.patch, HIVE-21509.2.patch, HIVE-21509.3.patch
>
>
> In some scenarios, LLAP might store column vectors in the cache that are reused and reset just before their original content would be written.
> This is a concurrency issue and is therefore flaky. It is not easy to reproduce, but the odds of hitting it can be improved with the following setup:
>  * set hive.llap.daemon.num.executors=32;
>  * set hive.llap.io.threadpool.size=1;
>  * use TPCDS input data of the store_sales table with at least a couple hundred thousand rows, stored in text format:
> {code:java}
> ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe'
> WITH SERDEPROPERTIES ('field.delim'='|', 'serialization.format'='|')
> STORED AS INPUTFORMAT 'org.apache.hadoop.mapred.TextInputFormat'
> OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
> {code}
>  * having more splits increases the chance of the issue showing itself, so it is worth running _set tez.grouping.min-size=1024; set tez.grouping.max-size=1024;_
>  * run this query on the table: select min(ss_sold_date_sk) from store_sales;
> The first query result is correct (2450816 in my case). Repeating the query will trigger
reading from LLAP cache and produce a wrong result: 0.
> If one wants to make sure of running into this issue, place a Thread.sleep(250) at
the beginning of VectorDeserializeOrcWriter#run().
>  
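
The race described above can be illustrated with a small, self-contained Java sketch. This is not Hive code: the Vector class, the cachedValue method and the single-slot "cache" are hypothetical stand-ins, and a CountDownLatch plays the role of the suggested Thread.sleep(250) by forcing the cache writer to run only after the reader has reset its reused vector. Copying before hand-off is shown only as one possible way to avoid the race, and is an assumption rather than a description of the attached patches.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

final class ReusedVectorRaceSketch {

    /** Hypothetical stand-in for a reusable column vector; not Hive's class. */
    static final class Vector {
        final long[] values = new long[1];

        void reset() {
            values[0] = 0L;
        }
    }

    /**
     * Fills a reused vector, hands it to an asynchronous cache writer, then resets it
     * for the next batch. Returns the value that ended up in the simulated cache.
     */
    static long cachedValue(boolean copyBeforeHandOff) {
        Vector reused = new Vector();
        reused.values[0] = 2450816L; // value taken from the issue description

        // Either share the reused vector with the writer or give the writer a private copy.
        final Vector handedOff = copyBeforeHandOff ? new Vector() : reused;
        if (copyBeforeHandOff) {
            handedOff.values[0] = reused.values[0];
        }

        CountDownLatch resetDone = new CountDownLatch(1);
        AtomicLong cache = new AtomicLong(-1L); // -1 means "nothing cached"
        ExecutorService writer = Executors.newSingleThreadExecutor();

        writer.execute(() -> {
            try {
                // Force the unlucky interleaving: write to the cache only after the reset.
                resetDone.await();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            }
            cache.set(handedOff.values[0]);
        });

        reused.reset(); // the reader prepares the vector for its next batch
        resetDone.countDown();

        writer.shutdown();
        try {
            if (!writer.awaitTermination(5, TimeUnit.SECONDS)) {
                throw new IllegalStateException("cache writer did not finish");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return cache.get();
    }

    public static void main(String[] args) {
        System.out.println("shared vector cached as: " + cachedValue(false));
        System.out.println("copied vector cached as: " + cachedValue(true));
    }
}
```

With the shared vector the simulated cache ends up holding 0, which mirrors the wrong min(ss_sold_date_sk) result on the second query. With the private copy it holds 2450816.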



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
