nutch-dev mailing list archives

From "Markus Jelsma (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (NUTCH-973) Remove Segment Merger in 1.3
Date Fri, 01 Apr 2011 14:05:06 GMT

    [ https://issues.apache.org/jira/browse/NUTCH-973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13014566#comment-13014566 ]

Markus Jelsma commented on NUTCH-973:
-------------------------------------

I'm not sure we should. In 1.x, fetches still generate loose segments, which I might want
to merge for simplicity. For example, I have a test setup where multiple segments are generated
each day and then merged, so I end up with a single segment per day. That is a bit easier to
maintain, and there are fewer segments to reindex. I do delete segments older than the fetch
interval, but in my case that's just deleting one segment a day. This is in a local environment.
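
The daily routine described above could be sketched roughly as follows. This is only an illustration in Python, not a script from the setup in question: it assumes segment directories carry timestamp names in the yyyyMMddHHmmss form, and the helper names `group_segments_by_day` and `expired_segments` are hypothetical. It only decides which segments belong together and which are past the fetch interval; the merge and the deletion themselves are left to the existing tools.

```python
from collections import defaultdict
from datetime import datetime, timedelta

# Assumption: segment directories are named by timestamp, e.g. 20110401140506.
SEGMENT_NAME_FORMAT = "%Y%m%d%H%M%S"


def _parse_segment_time(name):
    """Return the segment's timestamp, or None if the name is not a timestamp."""
    try:
        return datetime.strptime(name, SEGMENT_NAME_FORMAT)
    except ValueError:
        return None


def group_segments_by_day(segment_names):
    """Map each day (YYYYMMDD) to the sorted segment names created that day."""
    by_day = defaultdict(list)
    for name in segment_names:
        if _parse_segment_time(name) is None:
            continue  # skip anything that does not look like a segment
        by_day[name[:8]].append(name)
    return {day: sorted(names) for day, names in by_day.items()}


def expired_segments(segment_names, now, fetch_interval_days):
    """Return the sorted segment names older than the fetch interval."""
    cutoff = now - timedelta(days=fetch_interval_days)
    expired = []
    for name in segment_names:
        created = _parse_segment_time(name)
        if created is not None and created < cutoff:
            expired.append(name)
    return sorted(expired)
```

Each per-day group would be a candidate for a single merge, and the list from `expired_segments` would be the segments to delete once they are older than the fetch interval.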

> Remove Segment Merger in 1.3
> ----------------------------
>
>                 Key: NUTCH-973
>                 URL: https://issues.apache.org/jira/browse/NUTCH-973
>             Project: Nutch
>          Issue Type: Task
>            Reporter: Julien Nioche
>            Priority: Minor
>             Fix For: 1.3
>
>
> The code for segment merging is still in 1.3. As far as I understand its original function,
> it was mostly useful for having a single data structure from which the search app could get
> the cached data. Now that we've delegated indexing and search to Solr, we don't really need
> to worry about the cache anymore. Would it make sense to purge it, or do you guys think it
> would still be useful?

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
