hive-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "ASF GitHub Bot (Jira)" <j...@apache.org>
Subject [jira] [Work logged] (HIVE-25209) SELECT query with SUM function producing unexpected result
Date Tue, 06 Jul 2021 15:02:00 GMT

     [ https://issues.apache.org/jira/browse/HIVE-25209?focusedWorklogId=619399&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-619399
]

ASF GitHub Bot logged work on HIVE-25209:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 06/Jul/21 15:01
            Start Date: 06/Jul/21 15:01
    Worklog Time Spent: 10m 
      Work Description: kgyrtkirk commented on a change in pull request #2360:
URL: https://github.com/apache/hive/pull/2360#discussion_r664637117



##########
File path: ql/src/java/org/apache/hadoop/hive/ql/optimizer/StatsOptimizer.java
##########
@@ -397,6 +397,11 @@ else if (getGbyKeyType(cgbyOp) == GbyKeyType.CONSTANT && rowCnt
== 0) {
           if (udaf instanceof GenericUDAFSum) {
             // long/double/decimal
             ExprNodeDesc desc = aggr.getParameters().get(0);
+            // return null for SUM(1), when the table is empty. Without this, category =
LONG, and the result is 0
+            // instead of NULL.
+            if (desc instanceof ExprNodeConstantDesc && rowCnt == 0) {
+              return oneRow.add(null);

Review comment:
       this `return` doesn't look right to me - it may work correctly if you around only selecting
the aggregate - but what happens if you have more?
   I believe you should execute `oneRow.add(null)` ; bail out the rest of this conditional
- and let the code continue with the remaining aggregates if there is any




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Issue Time Tracking
-------------------

    Worklog Id:     (was: 619399)
    Time Spent: 1h 10m  (was: 1h)

> SELECT query with SUM function producing unexpected result
> ----------------------------------------------------------
>
>                 Key: HIVE-25209
>                 URL: https://issues.apache.org/jira/browse/HIVE-25209
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Soumyakanti Das
>            Assignee: Soumyakanti Das
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> Hive: SELECT query with SUM function producing unexpected result
> Problem Statement:
> {noformat}
> SELECT SUM(1) FROM t1;
> ---- result: 0
> SELECT SUM(agg0) FROM (
>     SELECT SUM(1) as agg0 FROM t1 WHERE t1.c0 UNION ALL 
>     SELECT SUM(1) as agg0 FROM t1 WHERE NOT (t1.c0) UNION ALL 
>     SELECT SUM(1) as agg0 FROM t1 WHERE (t1.c0) IS NULL
>     ) as asdf;
> ---- result: null {noformat}
> Steps to reproduce:
> {noformat}
> DROP DATABASE IF EXISTS db5 CASCADE;
> CREATE DATABASE db5;
> use db5;
> CREATE TABLE IF NOT EXISTS t1(c0 boolean, c1 boolean);
> SELECT SUM(1) FROM t1;
> -- result: 0
> SELECT SUM(agg0) FROM (
>     SELECT SUM(1) as agg0 FROM t1 WHERE t1.c0 UNION ALL 
>     SELECT SUM(1) as agg0 FROM t1 WHERE NOT (t1.c0) UNION ALL 
>     SELECT SUM(1) as agg0 FROM t1 WHERE (t1.c0) IS NULL
>     ) as asdf;
> -- result: null {noformat}
> Observations:
> SELECT SUM(1) as agg0 FROM t1 WHERE t1.c0 = t1.c1; – will result in null
> Similarity with postgres, 
>  both the queries result in null
> Similarity with Impala,
>  both the queries result in null



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

Mime
View raw message