hive-issues mailing list archives

From "Deepak Jaiswal (JIRA)" <j...@apache.org>
Subject [jira] [Assigned] (HIVE-19849) ReduceRecordSource should flush the last record when reader runs out of records
Date Mon, 11 Jun 2018 01:04:00 GMT

     [ https://issues.apache.org/jira/browse/HIVE-19849?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Deepak Jaiswal reassigned HIVE-19849:
-------------------------------------


> ReduceRecordSource should flush the last record when reader runs out of records
> -------------------------------------------------------------------------------
>
>                 Key: HIVE-19849
>                 URL: https://issues.apache.org/jira/browse/HIVE-19849
>             Project: Hive
>          Issue Type: Task
>            Reporter: Deepak Jaiswal
>            Assignee: Deepak Jaiswal
>            Priority: Major
>
> ReduceRecordSource pushes all the records to the reducer operator. It is up to that operator
to forward them down the pipeline. In the case of operators such as GBY, the last record is
flushed only when the operator is closed, which may cause joins to miss records.
> This has been fixed for SMB Join when it happens on the reducer; however, it may be a good
idea to just flush recursively (see flushRecursive) when the reader is exhausted, to ensure
that the last record or set of records is not held back.
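
The behaviour described above can be illustrated with a small, self-contained
sketch. This is not Hive code: the Operator interface, GroupCount, Sink, and
run() are hypothetical stand-ins invented for this example, and flush() here
only loosely mirrors the idea behind flushRecursive. It assumes a GBY-like
operator that holds its current group until the key changes or it is told to
flush.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class FlushOnExhaustionSketch {

    /** Hypothetical stand-in for an operator in the reducer pipeline. */
    interface Operator {
        void process(String row);
        void flush(); // forward anything buffered, then flush children
    }

    /** End of the pipeline; records everything that was actually forwarded. */
    static final class Sink implements Operator {
        final List<String> received = new ArrayList<>();
        public void process(String row) { received.add(row); }
        public void flush() { /* nothing buffered */ }
    }

    /** GBY-like operator: counts consecutive equal keys and holds the current group. */
    static final class GroupCount implements Operator {
        private final Operator child;
        private String currentKey;
        private int count;

        GroupCount(Operator child) { this.child = child; }

        public void process(String key) {
            // A key change closes the previous group and forwards it.
            if (currentKey != null && !currentKey.equals(key)) {
                emit();
            }
            currentKey = key;
            count++;
        }

        public void flush() {
            emit();        // forward the group that is still being held
            child.flush(); // recurse down the pipeline
        }

        private void emit() {
            if (currentKey != null) {
                child.process(currentKey + "=" + count);
                currentKey = null;
                count = 0;
            }
        }
    }

    /** Pushes all keys to the reducer; optionally flushes once the input is exhausted. */
    static List<String> run(List<String> sortedKeys, boolean flushWhenExhausted) {
        Sink sink = new Sink();
        Operator reducer = new GroupCount(sink);
        for (String key : sortedKeys) {
            reducer.process(key);
        }
        if (flushWhenExhausted) {
            reducer.flush();
        }
        return sink.received;
    }

    public static void main(String[] args) {
        List<String> keys = Arrays.asList("a", "a", "b");
        System.out.println(run(keys, false)); // last group "b" is still held
        System.out.println(run(keys, true));  // last group is forwarded
    }
}
```

Without the flush, the downstream consumer in this sketch never sees the final
group until the operator is closed, which is the window in which a join could
miss records.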



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
