mahout-user mailing list archives

From Jake Mannix <jake.man...@gmail.com>
Subject Re: Mahout - Pig Hackday
Date Thu, 03 May 2012 05:28:12 GMT
On Wed, May 2, 2012 at 9:34 PM, Ted Dunning <ted.dunning@gmail.com> wrote:

> On Wed, May 2, 2012 at 9:05 PM, Jake Mannix <jake.mannix@gmail.com> wrote:
>
> > On Wed, May 2, 2012 at 8:07 PM, Ted Dunning <ted.dunning@gmail.com> wrote:
> >
> > > Making a pig module for mahout is a fine idea.  The twitter guys may have
> > > something better, though, so we should explore that as well.  Andy's
> > > comments make that possibility very interesting.
> > >
> >
> > What I'd want to suggest is that anyone who wants to move rapidly on
> > pig/mahout integration should start a github repo which doesn't directly
> > inject itself into mahout, but stands separately for now, but then the
> > maven dependency DAG rears its ugly head:
> >
> >  pig-vector depends on mahout-core
> >
> > so if we *do* want to start writing cool stuff *in mahout* which depends
> > on it,
> >
>
> I think that we are fine if we just create a pig module in mahout.  It can
> depend on the external stuff and mahout-core.  That would be the natural
> time and place to put the fancy pig-vector-ish stuff anyway.
>
> So I am not worried about this.  We would have separation of mahout-pig
> stuff from mahout-core-ish stuff and all should be fine.


Yeah, most likely the idea would be that, in the long run, mahout-pig would
depend on more than just writables: UDF wrappers for everything we stuff
into one (a la Jimmy Lin et al.'s "Training a smarter pig" talk at
Hadoop World).


> > we circularly-dependently self-destruct.  Now, if we had a proper
> > mahout-writables maven module (*ahem*!), which had all the stuff
> > pig-vector needed, and mahout-core depended on this, then mahout-core
> > (or mahout-examples) could still depend on pig-vector (or something like
> > it, like the elephant-bird-loaders slim dep) at some point.
> >
>
> I would rather not have Mahout depend on unreleased github stuff.  If it is
> good enough to depend on, it is good enough to suck into the main
> deliverable.
>

Oh, I didn't mean that core should depend on unreleased stuff; more
like the slimmed-down elephant-bird module, once it is released.
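
To make the dependency-DAG point from earlier in the thread concrete, here is
a minimal sketch (Python 3.9+, standard library only) that checks whether a
set of module dependencies can be built at all. The module names come from
this thread, but the edges in the second layout are only an assumption about
how a hypothetical mahout-writables split might look, and this is not how
Maven itself resolves modules.

```python
from graphlib import TopologicalSorter, CycleError


def build_order(deps):
    """Return a valid build order, or None if the dependencies form a cycle.

    `deps` maps each module name to the set of modules it depends on.
    """
    try:
        return list(TopologicalSorter(deps).static_order())
    except CycleError:
        return None


# Today's problem: pig-vector needs mahout-core, so mahout-core cannot
# also depend on pig-vector without creating a cycle.
cyclic = {
    "pig-vector": {"mahout-core"},
    "mahout-core": {"pig-vector"},
}

# Hypothetical layout with a separate mahout-writables module (assumed edges).
split = {
    "mahout-writables": set(),
    "pig-vector": {"mahout-writables"},
    "mahout-core": {"mahout-writables", "pig-vector"},
}

print(build_order(cyclic))  # no valid order exists
print(build_order(split))   # writables first, core last
```

With the writables factored out, pig-vector no longer needs mahout-core, so
the edge from mahout-core to pig-vector stops closing a loop.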

-- 

  -jake
