mahout-user mailing list archives

From Chris K Wensel
Subject Re: Has anyone tried Spark with Mahout?
Date Tue, 01 Nov 2011 15:35:02 GMT
I've made a few comments on the differences here.


On Oct 31, 2011, at 2:44 PM, Ted Dunning wrote:

> +Chris Wensel
> The biggest difference between Cascading and Plume/Crunch/FlumeJava is that the latter all do more lazy evaluation and more program restructuring and much less large scale scheduling. Certainly the PCFJ group do much more to make the results look like a java collection and are better at talking to conventional java types.
> I think that Cascading could do the more extensive job graph rewrites.  It would be hard for Cascading to generalize its data structures, though without major backward compatibility
> In sum, I think that the difference between Cascading and PCFJ is largely a matter of taste, not inherent system design.
> On Mon, Oct 31, 2011 at 2:36 PM, Charles Earl wrote:
> Thanks. This is an insightful discussion. Having just glanced now at both Plume and Crunch these seem similar to Cascading in the sense of being dataflow languages. I wonder are you able to comment on if there are important distinctions.

Chris K Wensel

-- Concurrent, Inc. offers mentoring, support for Cascading
