hive-issues mailing list archives

From "Pengcheng Xiong (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HIVE-12744) GROUPING__ID failed to be recognized in multiple insert
Date Mon, 28 Dec 2015 23:05:49 GMT

    [ https://issues.apache.org/jira/browse/HIVE-12744?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15073242#comment-15073242 ]

Pengcheng Xiong commented on HIVE-12744:
----------------------------------------

[~busyjay], I have attached a patch here. A simple workaround is to set hive.multigroupby.singlereducer=false.
The problem is as follows. In your case, you use a multi-insert in which each insert contains
a group by, and the group by uses grouping sets. By default, Hive turns on the
"HIVEMULTIGROUPBYSINGLEREDUCER" flag. When this flag is on, Hive tries to optimize a multi
group by query into a single M/R job plan. In the single M/R plan, there is no map-side
aggregation. However, "Grouping sets aggregations (with rollups or cubes) are not allowed if map-side
aggregation is turned off. Set hive.map.aggr=true if you want to use grouping sets". Thus,
you have to choose between grouping sets and the multi group by optimization. I would recommend
turning the optimization off, i.e., set hive.multigroupby.singlereducer=false. Thanks.
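
As a sketch of the workaround described above, the failing statement from the report could be rerun with the optimization disabled for the session. The table names and the query are taken from the issue description below; whether this succeeds on a given build has not been verified here, so no output is shown.

```sql
-- Workaround sketch: turn off the multi group by single-reducer
-- optimization for this session only.
set hive.multigroupby.singlereducer=false;

-- The multi-insert from the issue description, unchanged.
from testtable1
insert into table testtable2 select
    id, GROUPING__ID
group by id, name with cube
insert into table testtable3 select
    id, name
group by id, name grouping sets ((id), (id, name));
```

Because the setting is applied with `set` inside the session, it affects only that session and trades the single M/R job plan for the ability to use grouping sets.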

> GROUPING__ID failed to be recognized in multiple insert
> -------------------------------------------------------
>
>                 Key: HIVE-12744
>                 URL: https://issues.apache.org/jira/browse/HIVE-12744
>             Project: Hive
>          Issue Type: Bug
>          Components: Parser
>    Affects Versions: 1.2.1
>         Environment: apache hive 1.2.1
> apache hadoop 2.6.2
>            Reporter: Jay Lee
>            Assignee: Pengcheng Xiong
>         Attachments: HIVE-12744.01.patch
>
>
> When using multiple insert with multiple group by, GROUPING__ID fails to be parsed.
> hive> create temporary table testtable3 (id string, name string);
> OK
> Time taken: 1.019 seconds
> hive> create temporary table testtable2 (id string, name string);
> OK
> Time taken: 0.069 seconds
> hive> create temporary table testtable1 (id string, name string);
> OK
> Time taken: 0.066 seconds
> hive> insert into table testtable1 values ("id", "2333");
> ...
> OK
> Time taken: 32.515 seconds
> hive> from testtable1
>     > insert into table testtable2 select
>     >     id, GROUPING__ID
>     > group by id, name with cube;
> ...
> OK
> Time taken: 42.032 seconds
> hive> from testtable1
>     > insert into table testtable2 select
>     >     id, GROUPING__ID
>     > group by id, name with cube
>     > insert into table testtable3 select
>     >     id, name
>     > group by id, name grouping sets ((id), (id, name));
> FAILED: SemanticException [Error 10025]: Line 3:8 Expression not in GROUP BY key 'GROUPING__ID'



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
