gora-dev mailing list archives

From Kirby Bohling <kirby.bohl...@gmail.com>
Subject Re: Future of Nutch 2.0 [Was: Unresolved dependencies org.apache.gora#gora-hbase;0.1: not found in Nutch trunk]
Date Tue, 09 Aug 2011 15:31:04 GMT

On Tue, Aug 9, 2011 at 10:10 AM, Julien Nioche <
lists.digitalpebble@gmail.com> wrote:

> Hi Kirby,
>> Grumble, Grumble.  (adding dev@nutch, as that is more than likely
>> where this discussion really belongs)...
> am adding gora-dev@incubator.apache.org as well
>> It'd be really nice if folks could just follow the commands in the
>> nightly build, and get a build pushed out.  I've pointed this out
>> previously, and was told this would be fixed "shortly" (right after
>> GORA-0.1 finally got released, but not published in public maven repo,
>> which as far as I know, it still isn't published, but I stopped
>> checking on it).
> I understand and share your frustration, however you need to bear in mind
> that things are done only if people volunteer and have time - usually taken
> from their holiday, weekends, evenings. Chris (who is the de facto release
> master for Nutch and Gora) has not had the time and nobody else has
> volunteered to do it.

   I don't mean to be a complainer; I'd happily try to contribute fixes on
this one, but most of this would likely have to be done on Hudson/Jenkins.
I think you're addressing a larger issue than I really meant.  My point was
that however a developer does a build on their desktop, that process
should be duplicated on Hudson/Jenkins.  If you need the trunk of Gora, then
is it possible to check it out, build it, install it to a local repo,
and then build Nutch via Hudson/Jenkins?  Whatever it takes to get a build
should be what the CI server is doing.  The repeatable but failing builds
are what really confuse and frustrate me.  The nightly/CI build should be
automating what devs do on their desktops, to ensure it'll work on a clean
setup.  Right now, it just tells you that for the last year, the totally
obvious steps have led to a failure.
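
   As a rough sketch of the sequence described above (check out Gora,
build and install it locally, then check out and build Nutch), something
like the following could serve as a starting point for a CI job.  The
function names, the directory layout, and the idea of passing the build
commands in as arguments are all assumptions made for illustration; the
actual repository URLs and ant targets would have to come from the two
projects and are deliberately not filled in here.

```python
import subprocess
from pathlib import Path


def plan_ci_build(workdir, gora_url, nutch_url, gora_install_cmd, nutch_build_cmd):
    """Return the ordered (cwd, argv) steps for a clean-checkout build.

    All arguments are supplied by the caller; nothing here is taken from
    the real Gora or Nutch build files (hypothetical helper).
    """
    work = Path(workdir)
    gora_dir = work / "gora"
    nutch_dir = work / "nutch"
    return [
        # 1. Fetch the Gora trunk into a fresh directory.
        (work, ["svn", "checkout", gora_url, str(gora_dir)]),
        # 2. Build Gora and install its artifacts into a local repository.
        (gora_dir, list(gora_install_cmd)),
        # 3. Fetch the Nutch trunk.
        (work, ["svn", "checkout", nutch_url, str(nutch_dir)]),
        # 4. Build Nutch against the locally installed Gora artifacts.
        (nutch_dir, list(nutch_build_cmd)),
    ]


def _run(cwd, argv):
    # Default runner: execute the command and report its exit code.
    # The working directory must already exist.
    return subprocess.run(argv, cwd=cwd).returncode


def run_plan(plan, runner=_run):
    """Run steps in order; return the index of the first failing step, or None."""
    for index, (cwd, argv) in enumerate(plan):
        if runner(cwd, argv) != 0:
            return index
    return None
```

   The point of keeping the plan as plain data is that the same list of
steps can be run on a developer's desktop and on the CI server, so a
failure on a clean setup shows up at a specific, named step.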

   I can figure out all of the configuration issues for Hudson/Jenkins to
make it work, if somebody can push that into the Apache version.  However, I
think answering your questions first would be a good idea.  I'd also add my
totally non-binding +1 for setting up a CI/nightly build for the various
stable branches; the only one I found on Apache was for trunk.

>> As it happens, yesterday was the 1 year anniversary of the last
>> successful Hudson/Jenkins build...  If that actually worked, we could
>> point people towards it as a useful recipe for how to get a build
>> working off trunk.  I haven't been following Nutch too closely, but it
>> always strikes me as really odd, that there's a nightly build and it
>> doesn't bother anybody that it fails all the time (and that there
>> isn't a nightly build for the stable branches).
> The real issue behind all this is what we should do with Nutch 2.0. What
> follows is only my opinion and I would love to hear what others have to say
> on this subject.
> Since we (actually mostly Dogacan) wrote 2.0 and delegated the storage to
> Gora, the latter hasn't really taken off since incubation. There have been
> some modest contributions to it but it does not seem to be used much and
> there is virtually nothing happening on it in terms of development. More
> worryingly, the people who initially contributed to it are not very active
> on the project (such is life, new jobs, different projects, etc...)
> anymore. As for Nutch 2.0, it hasn't made any progress in the last 12
> months : we still have the same bugs, the tests do not work, the build has
> to be done manually etc...
> At the same time, there has been a new lease of life into Nutch as a whole
> : there is definitely more activity on the mailing lists, new users, new
> active committers  etc... and quite a few bugfixes and improvements - most
> of them backported from what had been done in the trunk and people seem
> fairly happy with what we can do with 1.4
> So the question is : what shall we do with 2.0? Here are a few
> possibilities :
> a) put some effort into it, fix the bugs and make it so that it can be used
> instead of 1.x
> b) shelve it and leave it for enthusiasts to play with + make 1.x the trunk
> again
> c) do nothing : keep 2.0 and 1.x in parallel  (but having to maintain two
> branches is quite a pain)
> d) abandon the idea of a neutral storage layer with Gora and hardwire it to
> e.g. HBase
> Option (a) has not happened in the last 12 months and I am not very hopeful
> about it.
> What do you guys think?

   I know nothing about the 2.0 branch, and can't really contribute to that
conversation (that job issue interferes with all my free time).


> Julien
> --
> Open Source Solutions for Text Engineering
> http://digitalpebble.blogspot.com/
> http://www.digitalpebble.com
