Hi,
I fully agree. Accumulo looks cool, but I, for one, don't have any
experience with it. Besides, the HBase dependency disaster is still
fresh in my mind: 50+ dependencies, several of them incompatible with
Hadoop and Crunch, and no way to be sure it all actually works.
BTW: Do we have someone on the team who could help us solve the HBase
issues?
Regards,
Matthias
On Wednesday, 2012-10-31, Josh Wills wrote:
> +crunch-user, to see if we have any lurking Accumulo users
>
> Hey Anthony,
>
> I don't think that we have much Accumulo experience yet among the
> committers, so I'm hesitant to add a crunch-accumulo subproject w/o having
> someone on the team who is dedicated to maintaining it. If you have stuff
> you want to open source on GitHub, we would be happy to link to it on the
> Crunch homepage (something we should do for crunchR, come to think of it),
> and we're all very happy to work together on bug fixes and new features to
> support your use cases. Ideally, we would all work together for a while and
> get to like working with each other, and then you would join the committers
> and own the submodule.
>
> I'll let other folks weigh in, but that's my two cents.
>
> J
>
>
> On Wed, Oct 31, 2012 at 7:29 AM, Anthony Fox <adf-accdev@ccri.com> wrote:
>
> > Hi all,
> >
> > I've started exploring Apache Crunch for use in developing some analytics
> > on top of the Apache Accumulo column family store. So far, it looks very
> > promising. I've implemented the source and sink and exposed tables through
> > the Scrunch REPL. Being able to interactively define and submit map/reduce
> > jobs from the REPL will make developing new analytics much easier. There
> > are some enhancements that I'll need to put together to support some of my
> > analytical workflows. Much of this effort can be abstracted and applied to
> > the HBase support as well. If this is of interest to anyone, I'd be happy
> > to contribute back to the crunch project. Let me know.
> >
> > Thanks,
> > Anthony
> >