tez-dev mailing list archives

From ". Anupam" <anu...@microsoft.com.INVALID>
Subject RE: Null EdgeManager deadlock
Date Tue, 07 Aug 2018 20:28:32 GMT
Moving to dev@

From: Adrian Nicoara
Sent: Monday, August 6, 2018 7:13 PM
To: user@tez.apache.org; Hitesh Sharma <hitesh@microsoft.com>; . Anupam <anupam@microsoft.com>
Subject: RE: Null EdgeManager deadlock

After a bit of trial and error, it seems that the condition for all edges to be initialized is
needed, because of:
https://github.com/apache/tez/blob/fe22f3276d6d97f6b5dfab24490ee2ca32bf73c3/tez-dag/src/main/java/org/apache/tez/dag/app/dag/impl/Edge.java#L337-L338

I fixed my scenario by having B listen for the VertexState.PARALLELISM_UPDATED event, as that
will always be sent by A (in my scenario):
https://github.com/apache/tez/blob/fe22f3276d6d97f6b5dfab24490ee2ca32bf73c3/tez-dag/src/main/java/org/apache/tez/dag/app/dag/impl/VertexImpl.java#L1868
and it is equivalent, for my purposes, to A being configured.
Sorry for the noise.

One last question: if B can configure the edge managers at any time after tasks in A have
run, is there code that asserts that the new EdgeManager specifies the same number of outputs
for source tasks in A as the old EdgeManager did, and hence matches the output spec that
tasks in A ran with?
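
To make the question concrete, here is a minimal sketch of the kind of check I have in mind. It is not Tez code: the class, the method name, and the use of IntUnaryOperator to stand in for "number of outputs an edge manager reports for a given source task index" are all assumptions made for illustration.

```java
import java.util.function.IntUnaryOperator;

/** Hypothetical check; none of these names come from Tez. */
public class OutputCountCheckSketch {

    /**
     * Compares the per-source-task output counts reported by the old and new
     * edge managers (modelled here as plain functions from task index to count).
     * Returns the index of the first source task whose counts differ, or -1 if
     * every source task reports the same count under both.
     */
    static int firstMismatch(int numSourceTasks,
                             IntUnaryOperator oldOutputs,
                             IntUnaryOperator newOutputs) {
        for (int task = 0; task < numSourceTasks; task++) {
            if (oldOutputs.applyAsInt(task) != newOutputs.applyAsInt(task)) {
                return task;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        // Old edge manager: every source task in A wrote 4 outputs.
        IntUnaryOperator oldOutputs = task -> 4;
        // New edge manager that keeps the same count for every task.
        IntUnaryOperator sameCounts = task -> 4;
        // New edge manager that changes the count for task 2 only.
        IntUnaryOperator changedCounts = task -> (task == 2 ? 8 : 4);

        System.out.println(firstMismatch(3, oldOutputs, sameCounts));
        System.out.println(firstMismatch(3, oldOutputs, changedCounts));
    }
}
```

If something like this existed, a non-negative result would mean tasks in A already ran with an output spec that the new edge manager no longer agrees with.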

From: Adrian Nicoara
Sent: Monday, August 6, 2018 5:52 PM
To: user@tez.apache.org; Hitesh Sharma <hitesh@microsoft.com>; . Anupam <anupam@microsoft.com>
Subject: Null EdgeManager deadlock

Hello,

Consider the following graph:
A ---- E ----> B
With source vertex A connected to destination vertex B through edge E.

Edge E is defined through:
https://github.com/apache/tez/blob/fe22f3276d6d97f6b5dfab24490ee2ca32bf73c3/tez-api/src/main/java/org/apache/tez/dag/api/EdgeProperty.java#L136-L156
with a null EdgeManagerPluginDescriptor, to have the edge manager setup at runtime.

Assume that both vertex A and vertex B have a custom VertexManager, and need to be configured
at runtime.
Vertex B waits for vertex A to be configured, before attempting to configure itself. As part
of its configuration, vertex B will use:
https://github.com/apache/tez/blob/fe22f3276d6d97f6b5dfab24490ee2ca32bf73c3/tez-api/src/main/java/org/apache/tez/dag/api/VertexManagerPluginContext.java#L172-L208
to create the edge manager for E amongst other things.

However, because of the following code:
https://github.com/apache/tez/blob/fe22f3276d6d97f6b5dfab24490ee2ca32bf73c3/tez-dag/src/main/java/org/apache/tez/dag/app/dag/impl/VertexImpl.java#L2898-L2909
which adds edge E to the set of uninitialized edges, when vertex A has been reconfigured,
its call to:
https://github.com/apache/tez/blob/fe22f3276d6d97f6b5dfab24490ee2ca32bf73c3/tez-api/src/main/java/org/apache/tez/dag/api/VertexManagerPluginContext.java#L349-L353
will end up not sending any VertexState.CONFIGURED updates:
https://github.com/apache/tez/blob/fe22f3276d6d97f6b5dfab24490ee2ca32bf73c3/tez-dag/src/main/java/org/apache/tez/dag/app/dag/impl/VertexImpl.java#L1932
because canInitVertex returns false, as vertex.uninitializedEdges is not empty:
https://github.com/apache/tez/blob/fe22f3276d6d97f6b5dfab24490ee2ca32bf73c3/tez-dag/src/main/java/org/apache/tez/dag/app/dag/impl/VertexImpl.java#L3172

Now, vertex A has no way of creating an edge manager for edge E, so that edge will never be
initialized - only vertex B does that initialization.
This ends up in a deadlock, as vertex B never gets the notification that vertex A has been
configured.
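
The circular wait described above can be shown with a small toy model. This is only a sketch of the gating as I understand it, not the VertexImpl code: ToyVertex, canInit, doneReconfiguring and bSeesAConfigured are hypothetical names invented for illustration.

```java
import java.util.HashSet;
import java.util.Set;

/** Toy model of the gating described above; not Tez code, all names are hypothetical. */
public class NullEdgeManagerDeadlockSketch {

    /** Minimal stand-in for a vertex that tracks edges still lacking an edge manager. */
    static final class ToyVertex {
        final String name;
        final Set<String> uninitializedEdges = new HashSet<>();
        boolean configuredSent = false;

        ToyVertex(String name) {
            this.name = name;
        }

        /** The vertex cannot initialize while any attached edge is uninitialized. */
        boolean canInit() {
            return uninitializedEdges.isEmpty();
        }

        /** The "configured" notification is only sent when canInit() holds. */
        void doneReconfiguring() {
            if (canInit()) {
                configuredSent = true;
            }
        }
    }

    /** Returns true if B would ever be told that A is configured. */
    static boolean bSeesAConfigured(boolean edgeHasManagerUpFront) {
        ToyVertex a = new ToyVertex("A");
        if (!edgeHasManagerUpFront) {
            // Edge E was declared with a null edge manager descriptor,
            // so it is tracked as uninitialized on the source vertex A.
            a.uninitializedEdges.add("E");
        }
        a.doneReconfiguring();
        // Only B would initialize E, and B waits for A's notification first,
        // so nothing else can remove E from A's set in this model.
        return a.configuredSent;
    }

    public static void main(String[] args) {
        System.out.println("edge manager set up front: " + bSeesAConfigured(true));
        System.out.println("null edge manager: " + bSeesAConfigured(false));
    }
}
```

In this model the second case never sends the notification, which is the deadlock: A waits on E, E waits on B, and B waits on A.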

What is then the right way of constructing this graph?
Do uninitialized target edges need to be monitored in source vertices at all? i.e. is this
code needed:
https://github.com/apache/tez/blob/fe22f3276d6d97f6b5dfab24490ee2ca32bf73c3/tez-dag/src/main/java/org/apache/tez/dag/app/dag/impl/VertexImpl.java#L2898-L2909

Thank you for any insight.
Adrian
