mahout-user mailing list archives

From Dan Brickley <>
Subject Re: Spec for a common import/export service for Mahout jobs
Date Tue, 13 Sep 2011 11:37:03 GMT
On 13 September 2011 13:13, Ted Dunning <> wrote:
> The cluster, classification and decompositional jobs all like the same kind
> of input.  These can be viewed as matrices or sequences of vectors; it comes
> to much the same sort of thing.  The gotcha is that the user often has
> tokens in fielded documents (ratings, documents, purchase history).  Other
> than that, it should be pretty easy.  Even the output of most of these
> programs can be matrices/vector sequences.

This reminds me of visual compositing tools (e.g. Blender or Quartz
Composer), where the "plumbing" that connects nodes with pipes is
guided only by input/output types. That sort of basic hint can't
guarantee you haven't hooked things up in a way that doesn't really
make sense or do what you hope. I'd be wary of putting too heavy
expectations on input/output typing here. It's pretty healthy that the
whole pipeline can be thought of as just communicating
vectors/matrices; better that than trying to guard against screwups
by strongly typing everything.
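
To make the point concrete, here is a minimal sketch in plain Python, not Mahout code. Every name in it (Stage, normalize_rows, transpose, pipeline, the ratings data) is made up for illustration. Each stage takes a matrix and returns a matrix, so any two stages can be plugged together, but the matching shape says nothing about whether the wiring means what you intended.

```python
from typing import Callable, List

Vector = List[float]
Matrix = List[Vector]
# Hypothetical convention: every stage is "matrix in, matrix out".
Stage = Callable[[Matrix], Matrix]


def normalize_rows(m: Matrix) -> Matrix:
    """Scale each row to unit L2 length (all-zero rows are left unchanged)."""
    out = []
    for row in m:
        norm = sum(x * x for x in row) ** 0.5
        out.append([x / norm for x in row] if norm else list(row))
    return out


def transpose(m: Matrix) -> Matrix:
    """Swap rows and columns."""
    return [list(col) for col in zip(*m)]


def pipeline(*stages: Stage) -> Stage:
    """Compose stages left to right; nothing beyond the shared shape is assumed."""
    def run(m: Matrix) -> Matrix:
        for stage in stages:
            m = stage(m)
        return m
    return run


# Made-up data: rows are users, columns are items.
ratings = [[3.0, 4.0], [0.0, 5.0], [6.0, 8.0]]

# Intended wiring: normalize each user's ratings.
per_user = pipeline(normalize_rows)(ratings)

# Equally "well-typed" wiring that normalizes per item instead.
per_item = pipeline(transpose, normalize_rows)(ratings)
```

Both pipelines run without complaint and both return a matrix. Only someone who knows what the rows are supposed to represent can tell which one is the mistake, which is why I wouldn't lean too hard on types to catch this.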


