nutch-dev mailing list archives

From Doğacan Güney (JIRA) <j...@apache.org>
Subject [jira] Commented: (NUTCH-662) Upgrade Nutch to use Lucene 2.4
Date Sun, 23 Nov 2008 09:41:46 GMT

    [ https://issues.apache.org/jira/browse/NUTCH-662?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12650007#action_12650007 ]

Doğacan Güney commented on NUTCH-662:
-------------------------------------

> +1 on moving to 2.4 anyway. Regarding the patch: I think this is a viable solution for now.
> Performance-wise the impact of local buffering, especially in case of large indexes, could be
> significant - the indexing may take much longer with this change.

I think this is only a problem when updating old indexes to the new format. During indexing (in
Indexer.OutputFormat) we write the index to a local file first anyway, so seeking should not be
a problem... Or am I missing something here?
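
To illustrate the point in the comment above, here is a minimal sketch of the "write locally, then copy" pattern. It is not the actual Indexer.OutputFormat code: the class name LocalThenCopy, the method writeLocallyThenPublish, and the toy file layout (a length header patched after the body) are all hypothetical, and plain java.nio stands in for the Hadoop and Lucene APIs. The only idea it shows is that seeking happens on the local scratch file, while the final destination receives a single sequential copy.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

public class LocalThenCopy {

    // Hypothetical helper: builds a file that needs a seek-back (the length
    // header is patched after the body is written) in a local scratch
    // directory, then copies the finished file to finalDir in one pass.
    public static Path writeLocallyThenPublish(Path finalDir, String name, byte[] body)
            throws IOException {
        Path scratch = Files.createTempDirectory("index-scratch");
        Path local = scratch.resolve(name);

        try (RandomAccessFile out = new RandomAccessFile(local.toFile(), "rw")) {
            out.writeInt(0);            // placeholder for the header
            out.write(body);            // body follows the 4-byte header
            out.seek(0);                // random access, done on the local file only
            out.writeInt(body.length);  // patch the header with the real length
        }

        // The destination never sees a seek, only a sequential copy.
        Files.createDirectories(finalDir);
        Path published = finalDir.resolve(name);
        Files.copy(local, published, StandardCopyOption.REPLACE_EXISTING);

        // Clean up the scratch space.
        Files.delete(local);
        Files.delete(scratch);
        return published;
    }

    public static void main(String[] args) throws IOException {
        Path dest = Files.createTempDirectory("index-final");
        Path p = writeLocallyThenPublish(dest, "segment.bin", new byte[] {1, 2, 3});
        System.out.println("published " + p + " (" + Files.size(p) + " bytes)");
    }
}
```

Under this assumption, a destination that cannot seek would only matter when an existing index is modified in place, which matches the upgrade case raised in the comment.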

> Upgrade Nutch to use Lucene 2.4
> -------------------------------
>
>                 Key: NUTCH-662
>                 URL: https://issues.apache.org/jira/browse/NUTCH-662
>             Project: Nutch
>          Issue Type: Improvement
>    Affects Versions: 1.0.0
>         Environment: All
>            Reporter: Dennis Kubes
>            Assignee: Dennis Kubes
>             Fix For: 1.0.0
>
>         Attachments: lucene-analyzers-2.4.0.jar, lucene-core-2.4.0.jar, lucene-misc-2.4.0.jar,
>                      NUTCH-662-20081121-1.patch
>
>
> Upgrade Nutch to use Lucene 2.4.  This release changes the Lucene file format.  New indexes
> created by this Lucene version will NOT be readable by older versions.  Lucene 2.4 can read
> and update older index formats, although updating an older format will convert it to the new
> format.  There are also some performance and functionality improvements.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

