hive-issues mailing list archives

From "Siddharth Seth (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HIVE-10233) Hive on LLAP: Memory manager
Date Wed, 15 Apr 2015 21:22:58 GMT

    [ https://issues.apache.org/jira/browse/HIVE-10233?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14497034#comment-14497034
] 

Siddharth Seth commented on HIVE-10233:
---------------------------------------

Looked at just the Tez Configuration changes.
- Since Hive will be setting the memory explicitly, disabling the Tez scaling makes sense.
That's done by setting
tez.task.scale.memory.enabled = false (TezConfiguration.TEZ_TASK_SCALE_MEMORY_ENABLED).
This needs to be set before creating the AM, and applies to all DAGs running in the AM.

- TezRuntimeConfiguration.TEZ_RUNTIME_IO_SORT_MB, TezRuntimeConfiguration.TEZ_RUNTIME_UNORDERED_OUTPUT_BUFFER_SIZE_MB:
the memory needs to be converted from bytes to MB before setting these properties.
- edgeProp.getInputMemoryNeededPercent: this needs to be a fraction (0-1) rather than an
actual percentage (0-100). Not sure what the method returns right now.
- I missed mentioning this in the offline discussions about the properties involved: one more
property needs to be set for the Ordered case (TEZ_RUNTIME_INPUT_POST_MERGE_BUFFER_PERCENT). This
is a measure of how much memory will be used after the merge is complete, to avoid spilling
to disk. It defaults to 0, and is typically set to a lower value than the merge memory.
Given that this memory is always reserved for the Input, it can just be set to the Input merge
memory.

There are explicit APIs that can be used to configure these properties.
{code}
.setValueSerializationClass(TezBytesWritableSerialization.class.getName(), null)
.configureOutput().setSortBufferSize([OUT_SIZE]).done()
.configureInput().setShuffleBufferFraction(IN_FRACTION).setPostMergeBufferFraction(IN_FRACTION).done()
{code}

The same applies to the Unordered case.
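
The unit conversions mentioned above (bytes to MB for the buffer sizes, percentage to fraction for the input memory) could look roughly like the sketch below. TezMemoryUnits and its methods are hypothetical names used only for illustration and are not part of Hive or Tez. Rounding down to whole MB and assuming a 0-100 input range are also assumptions, since it is not yet clear what getInputMemoryNeededPercent returns.

```java
/** Hypothetical helper for illustration only; not part of Hive or Tez. */
final class TezMemoryUnits {
    private static final long BYTES_PER_MB = 1024L * 1024L;

    private TezMemoryUnits() {}

    /** Converts a byte count to whole MB, rounding down (an assumption). */
    static int bytesToMb(long bytes) {
        if (bytes < 0) {
            throw new IllegalArgumentException("bytes must be non-negative: " + bytes);
        }
        // toIntExact fails loudly instead of silently overflowing the int result.
        return Math.toIntExact(bytes / BYTES_PER_MB);
    }

    /** Converts a percentage in [0, 100] to a fraction in [0, 1]. */
    static float percentToFraction(double percent) {
        // Written this way so that NaN is rejected as well.
        if (!(percent >= 0.0 && percent <= 100.0)) {
            throw new IllegalArgumentException("percent must be in [0, 100]: " + percent);
        }
        return (float) (percent / 100.0);
    }

    public static void main(String[] args) {
        long sortBufferBytes = 256L * 1024L * 1024L; // example value only
        System.out.println("sort buffer MB = " + bytesToMb(sortBufferBytes));
        System.out.println("input fraction = " + percentToFraction(40.0));
    }
}
```

The results would then be what gets passed to setSortBufferSize and to setShuffleBufferFraction / setPostMergeBufferFraction in the snippet above.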





> Hive on LLAP: Memory manager
> ----------------------------
>
>                 Key: HIVE-10233
>                 URL: https://issues.apache.org/jira/browse/HIVE-10233
>             Project: Hive
>          Issue Type: Bug
>          Components: Tez
>    Affects Versions: llap
>            Reporter: Vikram Dixit K
>            Assignee: Vikram Dixit K
>         Attachments: HIVE-10233-WIP-2.patch, HIVE-10233-WIP.patch
>
>
> We need a memory manager in llap/tez to manage the usage of memory across threads. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
