tez-dev mailing list archives

From Hitesh Sharma <hit...@microsoft.com.INVALID>
Subject Re: Custom routing in EdgeManager
Date Wed, 20 Sep 2017 03:47:14 GMT
Thanks Gopal. Yes, rack/pod aggregates is something we are thinking about. I will go through
the discussion on that JIRA to understand things a little better.

I have implemented/overridden routeCompositeDataMovementEventToDestination, but it isn't getting
called. I'm raising DataMovementEvents though (and not composite ones), so that might be expected?

If I set breakpoints in all the flavors of routeDataMovementEvent* in the ScatterGatherEdgeManager
then I only see routeDataMovementEventToDestination(int sourceTaskIndex, int sourceOutputIndex,
int destinationTaskIndex) getting called. This makes me wonder about the following:

- Difference between the overloads of routeDataMovementEventToDestination (are any of them deprecated?)
- Difference between EdgeManagerPlugin and EdgeManagerPluginOnDemand
- Are there any scale advantages of using CompositeDataMovementEvent vs DataMovementEvent?
My naive understanding says that the former is more of a convenience thing and from a scale
point of view there may be no difference.
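
One way to picture the third question is a small model outside of Tez. It assumes that a composite event simply stands for a contiguous range of source output indices sharing one payload, which is a simplification. The classes `SingleEvent` and `CompositeEvent` below are hypothetical stand-ins and not the Tez types.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model, not the Tez API: one composite event versus the
// individual events it would stand for.
public class CompositeEventSketch {

    // Stand-in for a single data movement event tied to one source output.
    public static final class SingleEvent {
        public final int sourceIndex;
        public final String payload;

        public SingleEvent(int sourceIndex, String payload) {
            this.sourceIndex = sourceIndex;
            this.payload = payload;
        }
    }

    // Stand-in for a composite event: a start index, a count, one shared payload.
    public static final class CompositeEvent {
        public final int sourceIndexStart;
        public final int count;
        public final String payload;

        public CompositeEvent(int sourceIndexStart, int count, String payload) {
            this.sourceIndexStart = sourceIndexStart;
            this.count = count;
            this.payload = payload;
        }

        // Expands the range into one SingleEvent per covered source index.
        public List<SingleEvent> expand() {
            List<SingleEvent> events = new ArrayList<>();
            for (int i = 0; i < count; i++) {
                events.add(new SingleEvent(sourceIndexStart + i, payload));
            }
            return events;
        }
    }

    public static void main(String[] args) {
        CompositeEvent composite = new CompositeEvent(0, 4, "shuffle-info");
        System.out.println("objects sent as composite: 1");
        System.out.println("objects sent individually: " + composite.expand().size());
    }
}
```

Under that assumption a task hands over one object instead of `count` objects. Whether this matters at scale depends on how the events are stored and routed downstream, which the sketch does not model.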


From: Gopal Vijayaraghavan <gopalv@apache.org>
Sent: Tuesday, September 19, 2017 7:59 PM
To: dev@tez.apache.org
Subject: Re: Custom routing in EdgeManager

> For instance I want to say that DataMovementEvent(s) from the tasks in the source vertex
> should be routed to the tasks in the destination vertex based on the fact whether the tasks
> are in the same rack or not (or for that matter use some other key to route events between
> the tasks in the two stages).
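
The routing rule described in the quoted paragraph can be sketched without any Tez dependency. This is only an illustration: `RackRoutingSketch`, `shouldRoute`, `sameRackDestinations`, and the rack name arrays are hypothetical, and how a task's rack would actually be obtained inside an edge manager is assumed away here.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model, not the Tez API: picks destination tasks for a source
// task by comparing rack names.
public class RackRoutingSketch {

    // True when the source task and the destination task sit in the same rack.
    public static boolean shouldRoute(int sourceTaskIndex, int destinationTaskIndex,
                                      String[] sourceRacks, String[] destinationRacks) {
        return sourceRacks[sourceTaskIndex].equals(destinationRacks[destinationTaskIndex]);
    }

    // Indices of all destination tasks that share the source task's rack.
    public static List<Integer> sameRackDestinations(int sourceTaskIndex,
                                                     String[] sourceRacks,
                                                     String[] destinationRacks) {
        List<Integer> result = new ArrayList<>();
        for (int d = 0; d < destinationRacks.length; d++) {
            if (shouldRoute(sourceTaskIndex, d, sourceRacks, destinationRacks)) {
                result.add(d);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        String[] sourceRacks = {"rack-a", "rack-b"};
        String[] destinationRacks = {"rack-b", "rack-a", "rack-a"};
        System.out.println(sameRackDestinations(0, sourceRacks, destinationRacks));
        System.out.println(sameRackDestinations(1, sourceRacks, destinationRacks));
    }
}
```

The per-pair predicate `shouldRoute` mirrors the shape of a call that receives a source task index and a destination task index. Swapping the rack comparison for another key would give the "some other key" variant mentioned in the quote.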

There was an attempt at scheduling a combiner task in rack-local to speed up dedup ops (by
doing per-rack aggregates) - https://issues.apache.org/jira/browse/TEZ-145

I'm wondering if you're trying to do something similar.

> To do this I implemented my own EdgeManagerPluginOnDemand derivative but I see it has
> two APIs for routing the events:

I think you might not have overridden routeCompositeDataMovementEventToDestination().

If you want to submit a patch with additional log lines to the Tez Edge.java, I think that
might be one place which is under-logged for these cases.

