tez-dev mailing list archives

From Aurijoy Majumdar <aurij...@buffalo.edu>
Subject Doubts Regarding Cross Product Edge Parallelism
Date Thu, 26 Mar 2015 08:34:22 GMT

I was trying to understand the Cross Product Edge, and I had a few questions
in that direction:

1) Is there any context in which I can examine how the Cross Product Edge
works? I was trying to figure out a use case, and any help in that
direction would be great. I believe that this custom edge is meant to
facilitate the join operation, but I couldn't tell whether there would
be repartitioning in this case.

2) Why is most of the data movement placed on one of these edges? Is it to
facilitate some locality-based optimization, or is it just to avoid
transferring the same data blocks repeatedly over the I/O channels?
I guess some insight into the mechanism of the cross product and the
synthetic cross product would clear that up.

3) How exactly is affinity-based scheduling set up for the most heavily
used edge (the one that transports most of the data)? I guess some
elaboration on the previous two points would clear this up.

