1. Performance depends on your hardware and system configuration, so it is best to test it yourself. In my tests, the two shuffle implementations show no significant performance difference in the latest version.
2. That is the correct way to turn on Netty-based shuffle. Note that Netty-based shuffle does not report any shuffle fetch metrics, so you may not see those metrics in the web UI.
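
If you want to compare the two implementations on your own setup, a rough sketch like the one below may help. It is only an illustration, not a proper benchmark: the object name ShuffleTiming, the helper timeShuffle, the local[2] master, and the data sizes are all placeholders I made up, and it assumes a Spark 1.0-era release where spark.shuffle.use.netty is still honored. Local mode does not move shuffle data across the network, so for meaningful numbers you would drop setMaster and submit the job to your real cluster.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.SparkContext._ // pair-RDD implicits (needed for reduceByKey on Spark 1.0)

object ShuffleTiming {
  // Runs one shuffle-heavy job with the given setting and returns
  // the elapsed wall-clock time in milliseconds.
  def timeShuffle(useNetty: Boolean): Long = {
    val conf = new SparkConf()
      .setMaster("local[2]")        // placeholder: remove when submitting to a cluster
      .setAppName("shuffle-timing") // hypothetical application name
      .set("spark.shuffle.use.netty", useNetty.toString)
    val sc = new SparkContext(conf)
    try {
      val start = System.nanoTime()
      sc.parallelize(1 to 1000000, 8) // 8 partitions so the reduce has data to fetch
        .map(i => (i % 1000, 1))
        .reduceByKey(_ + _)           // this step triggers the shuffle
        .count()
      (System.nanoTime() - start) / 1000000
    } finally {
      sc.stop() // only one SparkContext may be active at a time
    }
  }

  def main(args: Array[String]): Unit = {
    for (useNetty <- Seq(false, true)) {
      println(s"spark.shuffle.use.netty=$useNetty: ${timeShuffle(useNetty)} ms")
    }
  }
}
```

The first run also pays for JVM warm-up, so it is worth repeating each setting several times and comparing the later runs rather than reading too much into a single measurement.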
From: onpoq l [mailto:email@example.com]
Sent: Thursday, June 12, 2014 2:35 PM
Subject: shuffling using netty in spark streaming
1. Does netty perform better than the basic method for shuffling? I found the latency caused by shuffling in a streaming job is not stable with the basic method.
2. However, after I turn on netty for shuffling, I can only see the results for the first two batches, and then no output at all. I'm not sure whether the way I turn on netty is correct:
val conf = new SparkConf().set("spark.shuffle.use.netty", "true")