hive-issues mailing list archives

From "Chetna Chaudhari (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HIVE-11735) Different results when multiple if() functions are used
Date Fri, 04 Sep 2015 09:32:45 GMT

     [ https://issues.apache.org/jira/browse/HIVE-11735?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chetna Chaudhari updated HIVE-11735:
------------------------------------
    Description: 
Hive's if() UDF returns incorrect results when a query contains several if() calls whose
string-equality conditions differ only in letter case.
Observation:
   1) if( name = 'chetna' , 3, 4) and if( name = 'Chetna', 3, 4) are treated as equal.
   2) The result of the rightmost UDF is pushed to the predicates on its left, so both
UDFs return the same result.

How to reproduce the issue:
1) CREATE TABLE `sample`(
  `name` string)
ROW FORMAT SERDE
  'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe'
STORED AS INPUTFORMAT
  'org.apache.hadoop.mapred.TextInputFormat'
OUTPUTFORMAT
  'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
TBLPROPERTIES (
  'transient_lastDdlTime'='1425075745');

2) insert into table sample values ('chetna');
3) select min(if(name = 'chetna', 4, 3)) , min(if(name='Chetna', 4, 3))  from sample;
    Actual result:
    3    3
    Expected result:
    4    3
4) select min(if(name = 'Chetna', 4, 3)) , min(if(name='chetna', 4, 3))  from sample;
    Actual result:
    4    4
    Expected result:
    3    4
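
The expected results in steps 3 and 4 follow from case-sensitive string equality applied independently to each if() call. The sketch below is only a small Python model of that reasoning, not Hive code; the helper names `hive_if` and `expected_row` are hypothetical and exist purely for illustration.

```python
def hive_if(condition, value_true, value_false):
    """Model of if(cond, a, b): return a when cond holds, otherwise b."""
    return value_true if condition else value_false


def expected_row(rows, first_literal, second_literal):
    """Evaluate both min(if(name = <literal>, 4, 3)) columns independently.

    Assumption: string equality is case-sensitive, so 'chetna' != 'Chetna'.
    """
    first = min(hive_if(name == first_literal, 4, 3) for name in rows)
    second = min(hive_if(name == second_literal, 4, 3) for name in rows)
    return (first, second)


# The table from step 2 holds a single row: 'chetna'.
sample = ["chetna"]

print(expected_row(sample, "chetna", "Chetna"))  # step 3 -> (4, 3)
print(expected_row(sample, "Chetna", "chetna"))  # step 4 -> (3, 4)
```

In this model each column depends only on its own literal, which is why swapping the two if() calls should swap the two output values rather than change them.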




> Different results when multiple if() functions are used 
> --------------------------------------------------------
>
>                 Key: HIVE-11735
>                 URL: https://issues.apache.org/jira/browse/HIVE-11735
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Chetna Chaudhari



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
