tez-dev mailing list archives

From Mustafa İman (Jira) <j...@apache.org>
Subject [jira] [Created] (TEZ-4253) Revert TEZ-4170
Date Wed, 25 Nov 2020 00:31:00 GMT
Mustafa İman created TEZ-4253:

             Summary: Revert TEZ-4170
                 Key: TEZ-4253
                 URL: https://issues.apache.org/jira/browse/TEZ-4253
             Project: Apache Tez
          Issue Type: Bug
            Reporter: Mustafa İman
            Assignee: Mustafa İman

There are two performance improvements in https://issues.apache.org/jira/browse/TEZ-4170
 # Move construction of InputInitializers to background thread
 # Remove RootInputInitializerManager's thread pool and move all threads using this executor
to DagAppMaster's thread pool.

Item 1: This is an incorrect optimization that may cause data races in VertexImpl's event
handling. This was mitigated in https://issues.apache.org/jira/browse/TEZ-4204; however,
that solution basically reverts the initial optimization, only with a more complicated
approach. Apart from this, it unnecessarily complicates the Tez application master. The scenario
where the optimization is useful is when a custom InputInitializer constructor performs a lot
of heavyweight operations. However, the solution to this problem belongs in the client
application: it can easily move heavyweight operations to the InputInitializer#initialize method.
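
As a rough illustration of the client-side fix suggested for Item 1, the sketch below keeps the constructor cheap and defers the expensive work to initialize. It is self-contained and does not use the real Tez classes: SketchInitializer, SplitPlanner and the "#split-" naming are hypothetical stand-ins, and the real InputInitializer has a different constructor and return type.

```java
import java.util.ArrayList;
import java.util.List;

public class LazyInitializerSketch {

    /** Hypothetical stand-in for an input initializer base class; NOT the Tez API. */
    abstract static class SketchInitializer {
        abstract List<String> initialize() throws Exception;
    }

    /** Hypothetical client initializer that defers heavyweight work. */
    static class SplitPlanner extends SketchInitializer {
        private final String inputPath;
        private boolean heavyWorkDone = false;

        SplitPlanner(String inputPath) {
            // Constructor only records configuration, so it returns quickly
            // regardless of which thread constructs the object.
            this.inputPath = inputPath;
        }

        boolean isHeavyWorkDone() {
            return heavyWorkDone;
        }

        @Override
        List<String> initialize() {
            // Heavyweight work (simulated here) happens in initialize(),
            // not in the constructor.
            List<String> splits = new ArrayList<>();
            for (int i = 0; i < 3; i++) {
                splits.add(inputPath + "#split-" + i);
            }
            heavyWorkDone = true;
            return splits;
        }
    }

    public static void main(String[] args) {
        SplitPlanner planner = new SplitPlanner("/data/in");
        System.out.println("heavy work done after construction: " + planner.isHeavyWorkDone());
        System.out.println("splits: " + planner.initialize());
    }
}
```

With this shape, constructing the initializer on the caller's thread stays cheap, so there is no need to move construction to a background thread.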

Item 2: The benefit of this is doubtful. Supposedly, a root input initializer is offloaded
to a cached thread in appcontext instead of creating a new one in RootInputInitializerManager.
However, the number of threads in this pool is limited. When many root input initializers depend
on InputInitializerEvents, all threads may get blocked. In that case, the rest of the vertices
(which are supposed to send the InputInitializerEvents) cannot run, so we run into a deadlock.
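
The starvation pattern described in Item 2 can be reproduced in isolation with a plain bounded executor. This is only a sketch under stated assumptions: it uses a fixed-size java.util.concurrent pool and a CountDownLatch in place of the DagAppMaster pool and InputInitializerEvents, and the waiters use a timeout so the program terminates instead of hanging. The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

public class BoundedPoolStarvationSketch {

    /**
     * Submits `waiters` tasks that block until an event arrives, then one
     * task that sends the event, all on the same pool of `poolSize` threads.
     * Returns how many waiters saw the event before their timeout expired.
     */
    static int run(int poolSize, int waiters, long timeoutMillis) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        CountDownLatch event = new CountDownLatch(1); // stands in for the awaited event
        try {
            List<Future<Boolean>> results = new ArrayList<>();
            for (int i = 0; i < waiters; i++) {
                // Each waiter occupies a pool thread while it blocks.
                Callable<Boolean> waiter = () -> event.await(timeoutMillis, TimeUnit.MILLISECONDS);
                results.add(pool.submit(waiter));
            }
            // The sender shares the pool; if every thread is held by a
            // waiter, it sits in the queue and cannot deliver the event.
            pool.execute(event::countDown);

            int served = 0;
            for (Future<Boolean> result : results) {
                if (result.get()) {
                    served++;
                }
            }
            return served;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("pool=2, waiters=2 -> served " + run(2, 2, 300));
        System.out.println("pool=3, waiters=2 -> served " + run(3, 2, 2000));
    }
}
```

When the waiters fill the pool, the sender can only run after they give up, so none of them should be served; with one spare thread the sender runs right away. Without the timeout, the first case would block forever, which is the deadlock described above.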


This message was sent by Atlassian Jira
