nutch-dev mailing list archives

From "Emmanuel Joke (JIRA)" <j...@apache.org>
Subject [jira] Commented: (NUTCH-613) Empty Summaries and Cached Pages
Date Sun, 24 Feb 2008 07:06:14 GMT

    [ https://issues.apache.org/jira/browse/NUTCH-613?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12571874#action_12571874 ]

Emmanuel Joke commented on NUTCH-613:
-------------------------------------

I came to the same analysis.  I just changed my local code to store the repUrl in "orig" and
the urlString in "url", and now the problem is solved.

> Empty Summaries and Cached Pages
> --------------------------------
>
>                 Key: NUTCH-613
>                 URL: https://issues.apache.org/jira/browse/NUTCH-613
>             Project: Nutch
>          Issue Type: Bug
>          Components: fetcher, searcher, web gui
>    Affects Versions: 0.9.0
>         Environment: All
>            Reporter: Dennis Kubes
>            Assignee: Dennis Kubes
>             Fix For: 0.9.0, 1.0.0
>
>         Attachments: NUTCH-613-1-20080219.patch
>
>
> There is a bug where some search results do not have summaries and viewing their cached
> pages causes a NullPointer.  This bug is due to redirects getting stored under the new url
> and the getURL method of FetchedSegments getting the wrong (old) url which is stored in crawldb
> but has no content or parse objects.
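
For illustration only, here is a minimal, self-contained Java sketch of the mismatch described above. It does not use any Nutch classes: the `Hit` class, the `CONTENT_BY_URL` map, and the example URLs are all hypothetical stand-ins. It also assumes that urlString is the URL the content actually ended up stored under after the redirect, which the messages above do not spell out.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

public class RedirectLookupSketch {

    // Hypothetical stand-in for segment content, keyed by the URL it was stored under.
    private static final Map<String, String> CONTENT_BY_URL = new HashMap<>();

    static {
        // The fetch of /old was redirected, so the content is stored under /new only.
        CONTENT_BY_URL.put("http://example.com/new", "<html>page body</html>");
    }

    /** Hypothetical stand-in for an indexed search hit with "url" and "orig" fields. */
    static final class Hit {
        final String url;   // the field used to look up content
        final String orig;  // the other URL, kept for reference only

        Hit(String url, String orig) {
            this.url = url;
            this.orig = orig;
        }
    }

    /** Looks up cached content by the hit's "url" field; empty if nothing is stored there. */
    static Optional<String> cachedContent(Hit hit) {
        return Optional.ofNullable(CONTENT_BY_URL.get(hit.url));
    }

    public static void main(String[] args) {
        // Before the change: "url" holds the old (pre-redirect) URL, which has no content.
        Hit before = new Hit("http://example.com/old", "http://example.com/new");
        // After the change: "url" holds the URL the content is stored under.
        Hit after = new Hit("http://example.com/new", "http://example.com/old");

        System.out.println("before: content found = " + cachedContent(before).isPresent());
        System.out.println("after:  content found = " + cachedContent(after).isPresent());
    }
}
```

In this sketch the lookup with the old URL finds nothing, which corresponds to the empty summary (and, without a null check, the NullPointer on the cached page), while the lookup with the stored URL succeeds.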

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

