spark-dev mailing list archives

From Sean Owen <so...@cloudera.com>
Subject Re: HDFS as Shuffle Service
Date Thu, 28 Apr 2016 08:36:46 GMT
Why would you run the shuffle service on 10K nodes but Spark executors
on just 100 nodes? Wouldn't you also run that service on just the 100
nodes?

What does plumbing it through HDFS buy you in comparison? There's some
additional overhead, and if anything you lose some control over
locality, in a context where I presume HDFS itself is storing data on
many more than the 100 Spark nodes.
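[For context, the external shuffle service being compared here is what Spark's dynamic allocation normally relies on. A minimal spark-defaults.conf sketch, using standard Spark properties rather than anything proposed in this thread:]

```properties
# Run the external shuffle service on each node that hosts executors,
# so shuffle files survive when individual executors are torn down.
spark.shuffle.service.enabled    true
# Dynamic allocation is the feature that makes the shuffle service
# (or an alternative like HDFS-backed shuffle) necessary.
spark.dynamicAllocation.enabled  true
```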

On Thu, Apr 28, 2016 at 1:34 AM, Michael Gummelt <mgummelt@mesosphere.io> wrote:
>> Are you suggesting to have shuffle service persist and fetch data with
>> hdfs, or skip shuffle service altogether and just write to hdfs?
>
> Skip shuffle service altogether.  Write to HDFS.
>
> Mesos environments tend to be multi-tenant, and running the shuffle service
> on all nodes could be extremely wasteful.  If you're running a 10K node
> cluster, and you'd like to run a Spark job that consumes 100 nodes, you
> would have to run the shuffle service on all 10K nodes out of band of Spark
> (e.g. marathon).  I'd like a solution for dynamic allocation that doesn't
> require this overhead.
>
> I'll look at SPARK-1529.
>
> On Wed, Apr 27, 2016 at 10:24 AM, Steve Loughran <stevel@hortonworks.com>
> wrote:
>>
>>
>> > On 27 Apr 2016, at 04:59, Takeshi Yamamuro <linguin.m.s@gmail.com>
>> > wrote:
>> >
>> > Hi, all
>> >
>> > See SPARK-1529 for related discussion.
>> >
>> > // maropu
>>
>>
>> I'd not seen that discussion.
>>
>> I'm actually curious about why there's a 15% difference in performance
>> between the Java NIO and Hadoop FS APIs, and, if it is the case that
>> Hadoop still uses the pre-NIO libraries, *has anyone thought of just
>> fixing the Hadoop local FS codepath?*
>>
>> It's not that nobody has filed JIRAs on that ... it's just that nothing
>> has ever reached a state where it was considered ready to adopt, where
>> "ready" means: passes all unit and load tests against Linux, Unix, and
>> Windows filesystems. There have been some attempts, but they never got much
>> engagement or support, especially as NIO wasn't properly there until Java 7,
>> and Hadoop was stuck supporting Java 6 until 2015. That's no longer a
>> constraint: someone could do the work, using the existing JIRAs as starting
>> points.
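[For reference, the distinction in question is roughly stream-based I/O through FileInputStream versus java.nio channel-based I/O. A minimal sketch of the two read paths; the class name and file contents are illustrative, not from the thread:]

```java
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class NioSketch {

    // Pre-NIO style: FileInputStream copying through a heap byte[].
    static String readWithStream(Path p) throws IOException {
        byte[] buf = new byte[8192];
        try (InputStream in = new FileInputStream(p.toFile())) {
            int n = in.read(buf);
            return new String(buf, 0, n, StandardCharsets.UTF_8);
        }
    }

    // NIO style: FileChannel reading into a direct ByteBuffer; channels
    // also enable zero-copy transferTo() between files and sockets,
    // which is one source of the performance gap being discussed.
    static String readWithChannel(Path p) throws IOException {
        ByteBuffer bb = ByteBuffer.allocateDirect(8192);
        try (FileChannel ch = FileChannel.open(p, StandardOpenOption.READ)) {
            ch.read(bb);
        }
        bb.flip();
        byte[] out = new byte[bb.remaining()];
        bb.get(out);
        return new String(out, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        Path p = Files.createTempFile("nio-demo", ".bin");
        Files.write(p, "shuffle block".getBytes(StandardCharsets.UTF_8));
        System.out.println(readWithStream(p));   // prints "shuffle block"
        System.out.println(readWithChannel(p));  // prints "shuffle block"
        Files.delete(p);
    }
}
```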
>>
>>
>> If someone did do this in RawLocalFS, it'd be nice if the patch also
>> allowed you to turn off CRC creation and checking.
>>
>> That's not only part of the overhead; it also means that flush() doesn't
>> actually flush until you reach the end of a CRC32 block ... breaking what
>> few durability guarantees POSIX offers.
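[As a rough illustration of the checksum coupling described above: Hadoop's LocalFileSystem wraps RawLocalFileSystem in a checksumming layer, which is what writes the .crc sidecar files. A coarse existing workaround, shown here as a configuration sketch rather than the proposed patch, is to map file:// URIs straight to the raw implementation:]

```xml
<!-- core-site.xml: use the raw local FS, skipping .crc sidecar files.
     Note this disables checksum protection entirely, a much blunter
     knob than a per-stream switch to turn CRCs off. -->
<property>
  <name>fs.file.impl</name>
  <value>org.apache.hadoop.fs.RawLocalFileSystem</value>
</property>
```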
>>
>>
>>
>
>
>
> --
> Michael Gummelt
> Software Engineer
> Mesosphere

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@spark.apache.org
For additional commands, e-mail: dev-help@spark.apache.org

