tez-dev mailing list archives

From "Mustafa Iman (Jira)" <j...@apache.org>
Subject [jira] [Created] (TEZ-4137) Input/Output processors should merge payload to local conf
Date Wed, 01 Apr 2020 06:54:00 GMT
Mustafa Iman created TEZ-4137:

             Summary: Input/Output processors should merge payload to local conf
                 Key: TEZ-4137
                 URL: https://issues.apache.org/jira/browse/TEZ-4137
             Project: Apache Tez
          Issue Type: Improvement
            Reporter: Mustafa Iman
            Assignee: Mustafa Iman

This patch introduces config merging to various Input and Output processors. As described
in https://issues.apache.org/jira/browse/TEZ-4073 , we need to reduce the size of the configuration
objects transferred over the wire. We are planning two improvements toward that goal:
 # Skip sending default configs and configuration coming from xml files in the payload.
 # Send dag, vertex and session configurations in layers instead of sending the combined
dag + vertex + session configs three times.
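
As a rough illustration of the two improvements, here is a minimal sketch that uses plain
java.util.Map objects as a stand-in for Hadoop Configuration objects. The class name, method
names and the layer precedence (later layers override earlier ones) are assumptions made for
this example only; they are not taken from the Tez code base or from the patch.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: Map<String, String> stands in for a Configuration object.
public class PayloadTrimSketch {

    // Improvement 1: keep only the entries the receiver cannot reconstruct itself,
    // i.e. keys that are missing from the baseline (defaults + xml files) or whose
    // value differs from it. Values are assumed to be non-null.
    static Map<String, String> stripKnown(Map<String, String> full,
                                          Map<String, String> baseline) {
        Map<String, String> delta = new LinkedHashMap<>();
        for (Map.Entry<String, String> e : full.entrySet()) {
            if (!e.getValue().equals(baseline.get(e.getKey()))) {
                delta.put(e.getKey(), e.getValue());
            }
        }
        return delta;
    }

    // Improvement 2: each layer is sent once and resolved on the receiving side.
    // Assumption: layers are passed from least to most specific, so a later
    // layer overrides an earlier one for the same key.
    static Map<String, String> resolveLayers(List<Map<String, String>> layers) {
        Map<String, String> resolved = new LinkedHashMap<>();
        for (Map<String, String> layer : layers) {
            resolved.putAll(layer);
        }
        return resolved;
    }
}
```

With this shape, a payload only carries the keys returned by stripKnown, and a key set at a
more specific layer wins over the same key in a less specific one.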

In order to achieve these,
 * We need to expose local config coming from configuration files to TaskContext.
 * Input/Output processors must merge the config from user payload to local config in their

This is the configuration merging part. After this is merged, corresponding changes should
be made on the Hive side to stop sending redundant configs. Until the Hive side is updated, the
changes here are pure overhead, because all the config objects are identical and already contain
every config option.
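
The merge step itself can be sketched as follows. This is only an illustration under stated
assumptions: plain java.util.Map objects stand in for Configuration objects, the class and
method names are hypothetical, and the rule that payload values win over local values is an
assumption of the example rather than something confirmed by the patch.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: Map<String, String> stands in for a Configuration object.
public class ConfMergeSketch {

    // Start from the local config (defaults + xml files available on the node),
    // then overlay whatever arrived in the user payload. Neither input is modified.
    static Map<String, String> mergePayloadIntoLocal(Map<String, String> localConf,
                                                     Map<String, String> payloadConf) {
        Map<String, String> merged = new HashMap<>(localConf);
        merged.putAll(payloadConf); // assumption: payload entries take precedence
        return merged;
    }

    public static void main(String[] args) {
        Map<String, String> local = new HashMap<>();
        local.put("example.sort.mb", "100");   // hypothetical key from local files
        local.put("example.codec", "none");

        Map<String, String> payload = new HashMap<>();
        payload.put("example.sort.mb", "256"); // only the override is shipped

        Map<String, String> merged = mergePayloadIntoLocal(local, payload);
        System.out.println(merged.get("example.sort.mb"));
        System.out.println(merged.get("example.codec"));
    }
}
```

If the payload still contains every key, as it does until the sending side is updated, the
overlay replaces each local value with an identical one, which is why the merge adds cost
without changing the result.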
