nutch-dev mailing list archives

From "ASF GitHub Bot (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (NUTCH-2354) Upgrade Hadoop dependencies to 2.7.3
Date Fri, 15 Dec 2017 14:45:00 GMT

    [ https://issues.apache.org/jira/browse/NUTCH-2354?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16292630#comment-16292630 ]

ASF GitHub Bot commented on NUTCH-2354:
---------------------------------------

sebastian-nagel opened a new pull request #261: NUTCH-2354 Upgrade Hadoop dependencies to 2.7.4
URL: https://github.com/apache/nutch/pull/261
 
 
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


> Upgrade Hadoop dependencies to 2.7.3
> ------------------------------------
>
>                 Key: NUTCH-2354
>                 URL: https://issues.apache.org/jira/browse/NUTCH-2354
>             Project: Nutch
>          Issue Type: Bug
>          Components: injector
>    Affects Versions: 1.12
>            Reporter: Markus Jelsma
>            Assignee: Markus Jelsma
>            Priority: Blocker
>             Fix For: 1.14
>
>
> This Wednesday we experienced trouble running the 1.12 injector on Hadoop 2.7.3. We operated 2.7.2 before and had no trouble running a job.
> {code}
> 2017-01-18 15:36:53,005 FATAL [main] org.apache.hadoop.mapred.YarnChild: Error running child : java.lang.IncompatibleClassChangeError: Found interface org.apache.hadoop.mapreduce.Counter, but class was expected
> 	at org.apache.nutch.crawl.Injector$InjectMapper.map(Injector.java:216)
> 	at org.apache.nutch.crawl.Injector$InjectMapper.map(Injector.java:100)
> 	at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:146)
> 	at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:787)
> 	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
> 	at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
> 	at java.security.AccessController.doPrivileged(Native Method)
> 	at javax.security.auth.Subject.doAs(Subject.java:422)
> 	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1698)
> 	at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
> Exception in thread "main" java.lang.IncompatibleClassChangeError: Found interface org.apache.hadoop.mapreduce.Counter, but class was expected
>         at org.apache.nutch.crawl.Injector.inject(Injector.java:383)
>         at org.apache.nutch.crawl.Injector.run(Injector.java:467)
>         at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
>         at org.apache.nutch.crawl.Injector.main(Injector.java:441)
>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>         at java.lang.reflect.Method.invoke(Method.java:498)
>         at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
>         at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
> {code}
> Our processes kept retrying the inject for a few minutes until we manually shut them down. Meanwhile, our CrawlDB on HDFS was gone. Thanks to snapshots and/or backups we could restore it, so enable those if you haven't done so yet.
> These freak Hadoop errors can be notoriously difficult to debug, but it seems we are in luck: recompiling Nutch against Hadoop 2.7.3 instead of 2.4.0 appears to resolve it. You are also in luck if your job file uses the old org.apache.hadoop.mapred.* API; only jobs using the org.apache.hadoop.mapreduce.* API seem to fail.
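
For anyone hitting a similar IncompatibleClassChangeError: the JVM raises it when code was compiled against a type of one kind (here a class) but the type loaded at runtime is of another kind (here an interface). A minimal sketch, not part of Nutch or Hadoop, of how one might check what the runtime classpath actually provides. The class name ClassKindProbe and its helper methods are hypothetical; only standard java.lang reflection calls are used, and the default type name is taken from the stack trace above.

```java
import java.security.CodeSource;

// Hypothetical diagnostic helper; run it with the same classpath as the failing job.
final class ClassKindProbe {

    /** Returns "interface", "class", or "missing" for a fully qualified type name. */
    static String kindOf(String className) {
        try {
            // initialize=false: inspect the type without running its static initializers
            Class<?> c = Class.forName(className, false, ClassKindProbe.class.getClassLoader());
            return c.isInterface() ? "interface" : "class";
        } catch (ClassNotFoundException | LinkageError e) {
            return "missing";
        }
    }

    /** Returns the jar or directory the type was loaded from, if the JVM exposes it. */
    static String locationOf(String className) {
        try {
            Class<?> c = Class.forName(className, false, ClassKindProbe.class.getClassLoader());
            CodeSource src = c.getProtectionDomain().getCodeSource();
            // Core JDK classes typically report no code source
            return src == null ? "(bootstrap or unknown)" : String.valueOf(src.getLocation());
        } catch (ClassNotFoundException | LinkageError e) {
            return "(not on classpath)";
        }
    }

    public static void main(String[] args) {
        // Default to the type named in the error message above
        String name = args.length > 0 ? args[0] : "org.apache.hadoop.mapreduce.Counter";
        System.out.println(name + " -> " + kindOf(name) + " from " + locationOf(name));
    }
}
```

If the reported kind or jar location differs from what the job file was compiled against, that mismatch is a likely candidate for the error, though this check alone does not prove it.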



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
