Hi Tim,

That would be awesome. We have seen some really disparate Mesos allocations for our Spark Streaming jobs, e.g. (7,4,1) over 3 executors for 4 Kafka consumers instead of the ideal (3,3,3,3).
For network-dependent consumers, an even deployment would make streaming job performance reliable and reproducible.
We're deploying in coarse-grained mode. I'm not sure Spark Streaming would work well in fine-grained mode, given the added latency to acquire a worker.
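
To make the comparison concrete, here is a minimal sketch of what I mean by an "even" allocation. It assumes the tuples above are per-executor resource counts (e.g. cores); the helper names even_split and spread are made up for illustration and are not part of Spark or Mesos.

```python
def even_split(total, executors):
    """Spread `total` units over `executors` so counts differ by at most one."""
    base, extra = divmod(total, executors)
    # The first `extra` executors take one additional unit each.
    return [base + 1 if i < extra else base for i in range(executors)]


def spread(allocation):
    """Gap between the most and the least loaded executor."""
    return max(allocation) - min(allocation)


observed = [7, 4, 1]                   # what we got, over 3 executors
ideal = even_split(sum(observed), 4)   # same total, one share per consumer

print(ideal, spread(observed), spread(ideal))
```

A spread of 0 or 1 is what we would like the scheduler to aim for; the observed allocation is far from that.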

You mention that you're changing the Mesos scheduler. Is there a Jira ticket where this work is being tracked?

-kr, Gerard.


On Mon, Dec 22, 2014 at 6:01 PM, Timothy Chen <tnachen@gmail.com> wrote:
Hi Gerard,

Really nice guide!

I'm particularly interested in the Mesos scheduling side, to distribute cores more evenly across the cluster.

Are you using coarse-grained mode or fine-grained mode?

I'm making changes to the Spark Mesos scheduler, and I think we can propose a good way to achieve what you mentioned.

Tim


> On Dec 22, 2014, at 8:33 AM, Gerard Maas <gerard.maas@gmail.com> wrote:
>
> Hi,
>
> After facing issues with the performance of some of our Spark Streaming
> jobs, we invested quite some effort figuring out the factors that affect
> the performance characteristics of a Streaming job. We defined an
> empirical model that helps us reason about Streaming jobs and applied it to
> tune the jobs in order to maximize throughput.
>
> We have summarized our findings in a blog post with the intention of
> collecting feedback and hoping that it is useful to other Spark Streaming
> users facing similar issues.
>
> http://www.virdata.com/tuning-spark/
>
> Your feedback is welcome.
>
> With kind regards,
>
> Gerard.
> Data Processing Team Lead
> Virdata.com
> @maasg