spark-dev mailing list archives

From Michael Gummelt <mgumm...@mesosphere.io>
Subject Re: HDFS as Shuffle Service
Date Thu, 28 Apr 2016 00:34:14 GMT
> Are you suggesting to have shuffle service persist and fetch data with
> hdfs, or skip shuffle service altogether and just write to hdfs?

Skip shuffle service altogether.  Write to HDFS.

Mesos environments tend to be multi-tenant, and running the shuffle service
on all nodes could be extremely wasteful.  If you're running a 10K node
cluster, and you'd like to run a Spark job that consumes 100 nodes, you
would have to run the shuffle service on all 10K nodes out of band of Spark
(e.g. via Marathon).  I'd like a solution for dynamic allocation that doesn't
require this overhead.

I'll look at SPARK-1529.

On Wed, Apr 27, 2016 at 10:24 AM, Steve Loughran <stevel@hortonworks.com>
wrote:

>
> > On 27 Apr 2016, at 04:59, Takeshi Yamamuro <linguin.m.s@gmail.com>
> wrote:
> >
> > Hi, all
> >
> > See SPARK-1529 for related discussion.
> >
> > // maropu
>
>
> I'd not seen that discussion.
>
> I'm actually curious about why there's a 15% diff in performance between
> Java NIO and the Hadoop FS APIs and, if that is the case (Hadoop still uses
> the pre-NIO libraries), *has anyone thought of just fixing the Hadoop Local
> FS codepath?*
>
> It's not like anyone hasn't filed JIRAs on that ... it's just that nothing
> has ever got to a state where it was considered ready to adopt, where
> "ready" means: passes all unit and load tests against Linux, Unix, Windows
> filesystems. There have been some attempts, but they never quite got much
> engagement or support, especially as NIO wasn't there properly until Java
> 7, and Hadoop was stuck on Java 6 support until 2015. That's no longer a
> constraint: someone could do the work, using the existing JIRAs as starting
> points.
>
>
> If someone did do this in RawLocalFS, it'd be nice if the patch also
> allowed you to turn off CRC creation and checking.
>
> That's not only part of the overhead, it means that flush() doesn't
> actually flush, not until you reach the end of a CRC32 block ... so
> breaking what few durability guarantees POSIX offers.
>
>
>
>
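
The flush() point in the quoted message can be illustrated with a small, self-contained sketch. This is not Hadoop's code: ChunkedCrcOutputStream and bytesVisibleAfterFlush are hypothetical names, and the 512-byte chunk size is an assumption chosen for illustration. The sketch only shows the general effect of checksumming in fixed-size chunks: bytes in a partially filled chunk cannot be written downstream until their CRC is known, so flush() leaves them in memory.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.util.zip.CRC32;

public class ChunkedCrcSketch {

    /** Hypothetical stream: writes [chunk bytes][4-byte CRC32] only once a chunk is full. */
    static final class ChunkedCrcOutputStream extends OutputStream {
        private final OutputStream sink;
        private final byte[] chunk;
        private int used = 0;

        ChunkedCrcOutputStream(OutputStream sink, int bytesPerChecksum) {
            this.sink = sink;
            this.chunk = new byte[bytesPerChecksum];
        }

        @Override
        public void write(int b) throws IOException {
            chunk[used++] = (byte) b;
            if (used == chunk.length) {
                emitChunk();
            }
        }

        /** Flushes the sink only; a partially filled chunk stays buffered in memory. */
        @Override
        public void flush() throws IOException {
            sink.flush();
        }

        /** On close, the trailing partial chunk (if any) is finally written out. */
        @Override
        public void close() throws IOException {
            if (used > 0) {
                emitChunk();
            }
            sink.close();
        }

        private void emitChunk() throws IOException {
            CRC32 crc = new CRC32();
            crc.update(chunk, 0, used);
            sink.write(chunk, 0, used);
            sink.write(ByteBuffer.allocate(4).putInt((int) crc.getValue()).array());
            used = 0;
        }
    }

    /** Writes n bytes, calls flush(), and returns how many bytes have reached the sink. */
    public static int bytesVisibleAfterFlush(int n, int bytesPerChecksum) {
        try {
            ByteArrayOutputStream sink = new ByteArrayOutputStream();
            ChunkedCrcOutputStream out = new ChunkedCrcOutputStream(sink, bytesPerChecksum);
            for (int i = 0; i < n; i++) {
                out.write(i);
            }
            out.flush();
            return sink.size();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(bytesVisibleAfterFlush(100, 512)); // 0: nothing is visible yet
        System.out.println(bytesVisibleAfterFlush(512, 512)); // 516: one chunk plus its CRC
        System.out.println(bytesVisibleAfterFlush(600, 512)); // 516: the last 88 bytes are still buffered
    }
}
```

In this sketch a writer that produces 100 bytes and calls flush() has nothing visible downstream, which is the durability gap described above. Turning checksums off would remove the need for the chunk buffer entirely.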


-- 
Michael Gummelt
Software Engineer
Mesosphere
