spark-dev mailing list archives

From Marcelo Vanzin
Subject Re: YARN Shuffle service and its compatibility
Date Mon, 18 Apr 2016 22:05:04 GMT
On Mon, Apr 18, 2016 at 2:02 PM, Reynold Xin wrote:
> The bigger problem is that it is much easier to maintain backward
> compatibility rather than dictating forward compatibility. For example, as
> Marcin said, if we come up with a slightly different shuffle layout to
> improve shuffle performance, we wouldn't be able to do that if we want to
> allow Spark 1.6 shuffle service to read something generated by Spark 2.1.

And I think that's really what Mark is proposing. Basically, "don't
intentionally break backwards compatibility unless it's really
required" (e.g. SPARK-12130). That would allow option B to work.

If a new shuffle manager is created, then neither option A nor option
B would really work. Moving all the shuffle-related classes to a
different package, to support option A, would be really messy. At that
point, you're better off maintaining the new shuffle service outside
of YARN, which is rather messy too.

The best outcome would be if the shuffle service didn't need to
understand the shuffle manager at all, and could find files regardless
of how they were laid out; I'm not sure how feasible that is, though.
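
To make that last idea a bit more concrete, here is a rough sketch of
what "finding files regardless" could look like. This is not Spark code
and not a proposal for the actual protocol: the class names
(BlockLocation, LayoutAgnosticShuffleIndex), the register/read methods,
and the block id strings are all hypothetical. The assumption is that
whoever writes the shuffle data tells the service where each block's
bytes live, so the service never has to interpret the layout itself.

```python
import os
import tempfile
from dataclasses import dataclass


@dataclass(frozen=True)
class BlockLocation:
    """Where one shuffle block lives on local disk (hypothetical record)."""
    path: str
    offset: int
    length: int


class LayoutAgnosticShuffleIndex:
    """Hypothetical service-side index mapping block ids to byte ranges.

    The service never parses file names or layouts; the writer registers
    each block's location explicitly.
    """

    def __init__(self):
        self._blocks = {}

    def register(self, app_id, block_id, location):
        # Called by the writer (e.g. an executor) after it finishes a file.
        self._blocks[(app_id, block_id)] = location

    def read(self, app_id, block_id):
        # Serve the raw bytes without knowing how the file was produced.
        loc = self._blocks[(app_id, block_id)]
        with open(loc.path, "rb") as f:
            f.seek(loc.offset)
            return f.read(loc.length)


def demo():
    # Simulate a writer that packs two blocks into one file, layout of its choosing.
    with tempfile.NamedTemporaryFile(delete=False) as f:
        f.write(b"AAAABBBBBB")
        path = f.name
    try:
        index = LayoutAgnosticShuffleIndex()
        index.register("app-1", "shuffle_0_0_0", BlockLocation(path, 0, 4))
        index.register("app-1", "shuffle_0_0_1", BlockLocation(path, 4, 6))
        return (index.read("app-1", "shuffle_0_0_0"),
                index.read("app-1", "shuffle_0_0_1"))
    finally:
        os.remove(path)
```

In a scheme like this, a new shuffle layout would only change what the
writer registers, not the service. Whether the registration overhead and
the state the service would have to keep are acceptable is exactly the
feasibility question.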

