nutch-dev mailing list archives

From "Julien Nioche (JIRA)" <j...@apache.org>
Subject [jira] [Resolved] (NUTCH-1100) SolrDedup broken
Date Mon, 11 Nov 2013 16:02:19 GMT

     [ https://issues.apache.org/jira/browse/NUTCH-1100?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Julien Nioche resolved NUTCH-1100.
----------------------------------

    Resolution: Fixed

Committed revision 1540758.

We'll probably move to a more generic approach in NUTCH-656 but in the meantime this is a
good patch to have.

Thanks!

> SolrDedup broken
> ----------------
>
>                 Key: NUTCH-1100
>                 URL: https://issues.apache.org/jira/browse/NUTCH-1100
>             Project: Nutch
>          Issue Type: Bug
>          Components: indexer
>    Affects Versions: 1.4
>            Reporter: Markus Jelsma
>             Fix For: 1.9
>
>         Attachments: NUTCH-1100-1.6-1.patch
>
>
> Some Solr indices cannot be deduplicated from Nutch. For unknown reasons, Nutch throws
> the exception below. The Solr logs show nothing unusual; the queries are normal and seem
> to succeed.
> {code}
> java.lang.NullPointerException
>         at org.apache.hadoop.io.Text.encode(Text.java:388)
>         at org.apache.hadoop.io.Text.set(Text.java:178)
>         at org.apache.nutch.indexer.solr.SolrDeleteDuplicates$SolrInputFormat$1.next(SolrDeleteDuplicates.java:272)
>         at org.apache.nutch.indexer.solr.SolrDeleteDuplicates$SolrInputFormat$1.next(SolrDeleteDuplicates.java:243)
>         at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.moveToNext(MapTask.java:192)
>         at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.next(MapTask.java:176)
>         at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:48)
>         at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:358)
>         at org.apache.hadoop.mapred.MapTask.run(MapTask.java:307)
>         at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:177)
> {code}
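
The trace above ends in Text.set, which is consistent with a null String being
passed in from the record reader. One plausible cause (an assumption, not
confirmed by this message or by the committed patch) is a Solr document that
lacks a field the reader expects. A minimal sketch of a null guard follows. It
uses a plain Map as a stand-in for a Solr document so that it runs without
Hadoop or SolrJ; the class name, the helper readField, and the field names
"id" and "digest" are all hypothetical.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

public class NullSafeFieldRead {

    // Hypothetical helper: returns the field value as a String, or an empty
    // Optional when the field is missing or null. A caller can then skip the
    // record instead of handing null to something like Text.set.
    static Optional<String> readField(Map<String, Object> doc, String name) {
        Object value = doc.get(name);
        return value == null ? Optional.empty() : Optional.of(value.toString());
    }

    public static void main(String[] args) {
        // Stand-in for a document that has every expected field.
        Map<String, Object> complete = new HashMap<>();
        complete.put("id", "http://example.com/");
        complete.put("digest", "abc123");

        // Stand-in for a document with no "digest" field (assumed failure case).
        Map<String, Object> partial = new HashMap<>();
        partial.put("id", "http://example.com/other");

        for (Map<String, Object> doc : new Map[] {complete, partial}) {
            Optional<String> digest = readField(doc, "digest");
            if (digest.isPresent()) {
                System.out.println(doc.get("id") + " digest=" + digest.get());
            } else {
                // Skip rather than fail the whole map task.
                System.out.println(doc.get("id") + " skipped: no digest");
            }
        }
    }
}
```

Whether skipping such records is the right behaviour depends on the index;
the sketch only shows where a check could sit.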



--
This message was sent by Atlassian JIRA
(v6.1#6144)
