hive-issues mailing list archives

From "Mithun Radhakrishnan (JIRA)" <>
Subject [jira] [Commented] (HIVE-12734) Remove redundancy in HiveConfs serialized to UDFContext
Date Fri, 29 Sep 2017 20:58:00 GMT


Mithun Radhakrishnan commented on HIVE-12734:

I just committed these to {{master}}, {{branch-2}}, and {{branch-2.2}}. Thank you for the
contribution, [~cdrome].

This is a performance optimization for the {{HCatLoader}}/Pig path. It works behind the
scenes, so it shouldn't need a documentation update, IMHO.

> Remove redundancy in HiveConfs serialized to UDFContext
> -------------------------------------------------------
>                 Key: HIVE-12734
>                 URL:
>             Project: Hive
>          Issue Type: Bug
>    Affects Versions: 1.2.1, 2.0.0, 2.2.0, 3.0.0
>            Reporter: Mithun Radhakrishnan
>            Assignee: Chris Drome
>         Attachments: HIVE-12734.1.patch, HIVE-12734.2-branch-2.2.patch, HIVE-12734.2-branch-2.patch,
> {{HCatLoader}} ends up serializing one {{HiveConf}} instance per table-alias to Pig's
> {{UDFContext}}, which bloats the {{UDFContext}}.
> To reduce the footprint, it makes sense to serialize a default-constructed {{HiveConf}}
> once, plus one "diff" per {{HCatLoader}}. This should reduce the time taken to kick off jobs
> from {{pig -useHCatalog}} scripts.
> (Note_to_self: YHIVE-540).
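
For readers unfamiliar with the "base plus diff" idea in the description, here is a minimal sketch of it, not the committed patch. It assumes configurations can be modelled as plain string maps with non-null values. The class name ConfDiffSketch and the methods diff and apply are hypothetical, and the property values are illustrative only. The actual change works on {{HiveConf}} and {{UDFContext}}, which this sketch does not touch.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

// Hypothetical illustration only: configurations are modelled as plain maps,
// not as HiveConf instances, and values are assumed to be non-null.
final class ConfDiffSketch {

    // Returns the entries of conf that are absent from base or differ from it.
    // Keys present in base but removed from conf are NOT tracked in this sketch.
    static Map<String, String> diff(Map<String, String> base, Map<String, String> conf) {
        Map<String, String> delta = new HashMap<>();
        for (Map.Entry<String, String> e : conf.entrySet()) {
            if (!Objects.equals(base.get(e.getKey()), e.getValue())) {
                delta.put(e.getKey(), e.getValue());
            }
        }
        return delta;
    }

    // Rebuilds a full configuration from the shared base and one per-loader diff.
    static Map<String, String> apply(Map<String, String> base, Map<String, String> delta) {
        Map<String, String> full = new HashMap<>(base);
        full.putAll(delta);
        return full;
    }

    public static void main(String[] args) {
        // The shared base stands in for the default-constructed configuration.
        Map<String, String> base = new HashMap<>();
        base.put("hive.metastore.uris", "");
        base.put("hive.exec.parallel", "false");

        // One loader's configuration: the base with a single override.
        Map<String, String> loaderConf = new HashMap<>(base);
        loaderConf.put("hive.metastore.uris", "thrift://example-host:9083");

        // Only the base (once) and this small delta (per loader) need serializing.
        Map<String, String> delta = diff(base, loaderConf);
        System.out.println("delta = " + delta);
        System.out.println("round trip ok = " + apply(base, delta).equals(loaderConf));
    }
}
```

With many table-aliases, the base is stored once and each loader contributes only its overrides, which is where the footprint reduction would come from. A fuller version would also need to record keys that a loader removed from the base.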

This message was sent by Atlassian JIRA
