nutch-dev mailing list archives

From Markus Jelsma <markus.jel...@openindex.io>
Subject Re: Upgrading to Hadoop 0.22.0+
Date Tue, 13 Dec 2011 17:04:28 GMT
Hi,

I did a quick test to see what happens, and it won't compile against 0.22:
it cannot find the old mapred APIs we use. I've also tried 0.20.205.0, which
compiles but won't run, and many tests fail with errors like:

Exception in thread "main" java.lang.NoClassDefFoundError: org/codehaus/jackson/map/JsonMappingException
        at org.apache.nutch.util.dupedb.HostDeduplicator.deduplicator(HostDeduplicator.java:421)
        at org.apache.nutch.util.dupedb.HostDeduplicator.run(HostDeduplicator.java:443)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
        at org.apache.nutch.util.dupedb.HostDeduplicator.main(HostDeduplicator.java:431)
Caused by: java.lang.ClassNotFoundException: org.codehaus.jackson.map.JsonMappingException
        at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
        at java.security.AccessController.doPrivileged(Native Method)
        at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
        at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
        ... 4 more

I think this can be overcome but we cannot hide from the fact that all jobs 
must be ported to the new API at some point.
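
The NoClassDefFoundError above means the JVM could not load
org.codehaus.jackson.map.JsonMappingException at runtime, presumably because
the jar that provides it is not on the classpath. A minimal sketch for
checking that, run with the same classpath as the failing job; the class name
ClasspathProbe and the helper isPresent are made up for illustration and are
not part of Nutch or Hadoop:

```java
public class ClasspathProbe {

    // Returns true if the named class can be located by this class's loader.
    // The class is looked up without being initialized.
    static boolean isPresent(String className) {
        try {
            Class.forName(className, false, ClasspathProbe.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Default to the class named in the stack trace; other names can be
        // passed as command-line arguments.
        String[] names = args.length > 0 ? args : new String[] {
            "org.codehaus.jackson.map.JsonMappingException"
        };
        for (String name : names) {
            System.out.println(name + " -> " + (isPresent(name) ? "found" : "MISSING"));
        }
    }
}
```

If the class is reported as MISSING, the likely next step is to look at which
jars the dependency resolution actually puts on the runtime classpath.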

You did some work on the new APIs. Did you come across any cumbersome issues
while working on it?

Cheers


On Tuesday 13 December 2011 17:48:32 Andrzej Bialecki wrote:
> On 13/12/2011 17:42, Lewis John Mcgibbney wrote:
> > Hi Markus,
> > 
> > I'm certainly in agreement here. If you'd like to open a Jira, we can
> > begin to build up a picture of what is required.
> > 
> > Lewis
> > 
> > On Tue, Dec 13, 2011 at 4:41 PM, Markus Jelsma
> > <markus.jelsma@openindex.io> wrote:
> >> Hi,
> >> 
> >> To keep up with the rest of the world I believe we should move from the
> >> old Hadoop mapred API to the new MapReduce API, which has already been
> >> done for the nutchgora branch. Upgrading from hadoop-core to
> >> hadoop-common is easily done in Ivy, but all jobs must be tackled and we
> >> have many jobs!
> >> 
> >> Can anyone give pointers and a helping hand with this large task?
> 
> I guess the question is also whether 0.22 is compatible enough to
> compile more or less with the existing code that uses the old API. If it
> is, then we can do the transition gradually; if it isn't, then it's a
> bigger issue.
> 
> This is easy to verify - just drop in the 0.22 jars and see if it
> compiles / tests are passing.

-- 
Markus Jelsma - CTO - Openindex
