hive-issues mailing list archives

From "Sahil Takiar (JIRA)" <>
Subject [jira] [Commented] (HIVE-19937) Intern fields in MapWork on deserialization
Date Tue, 24 Jul 2018 16:22:00 GMT


Sahil Takiar commented on HIVE-19937:

[~vihangk1] could you take a look?

> Intern fields in MapWork on deserialization
> -------------------------------------------
>                 Key: HIVE-19937
>                 URL:
>             Project: Hive
>          Issue Type: Improvement
>          Components: Spark
>            Reporter: Sahil Takiar
>            Assignee: Sahil Takiar
>            Priority: Major
>         Attachments: HIVE-19937.1.patch, HIVE-19937.2.patch, HIVE-19937.3.patch, HIVE-19937.4.patch,
HIVE-19937.5.patch, post-patch-report.html, report.html
> When fixing HIVE-16395, we decided that each new Spark task should clone the {{JobConf}}
object to prevent any {{ConcurrentModificationException}} from being thrown. However, doing
so comes at the cost of storing a duplicate {{JobConf}} object for each Spark task.
These objects can take up a significant amount of memory; we should intern them so that Spark
tasks running in the same JVM don't store duplicate copies.
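
For readers unfamiliar with interning, here is a minimal sketch of the general idea, not the code from the attached patches. The names {{InternSketch}}, {{TaskPlan}}, {{dedupe}} and the example path are hypothetical and only illustrate how equal values deserialized by different tasks in one JVM can end up sharing a single instance. It assumes the duplicated state is made of immutable values such as strings.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

public class InternSketch {
    // Hypothetical per-JVM pool: maps each distinct value to its canonical instance.
    // This is a strong-reference pool, so entries are never released; a real
    // implementation would likely want weak references or a bounded pool.
    private static final ConcurrentMap<String, String> POOL = new ConcurrentHashMap<>();

    // Returns the canonical instance for an equal value, registering it if absent.
    static String dedupe(String value) {
        if (value == null) {
            return null;
        }
        String existing = POOL.putIfAbsent(value, value);
        return existing != null ? existing : value;
    }

    // Hypothetical stand-in for a per-task object rebuilt on deserialization.
    static final class TaskPlan {
        final String inputPath;

        TaskPlan(String inputPath) {
            // Intern at construction time, i.e. right after the field is deserialized.
            this.inputPath = dedupe(inputPath);
        }
    }

    public static void main(String[] args) {
        // new String(...) forces two distinct objects with equal contents,
        // mimicking two tasks that each deserialized their own copy.
        TaskPlan t1 = new TaskPlan(new String("/warehouse/db/table/part=1"));
        TaskPlan t2 = new TaskPlan(new String("/warehouse/db/table/part=1"));
        System.out.println(t1.inputPath == t2.inputPath); // prints "true"
    }
}
```

Interning is only safe for values that are never mutated after they are shared, which is why the sketch uses strings rather than a mutable configuration object.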

This message was sent by Atlassian JIRA
