lucene-dev mailing list archives

From "ASF subversion and git services (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (LUCENE-4198) Allow codecs to index term impacts
Date Fri, 02 Feb 2018 13:01:01 GMT

    [ https://issues.apache.org/jira/browse/LUCENE-4198?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16350276#comment-16350276 ]

ASF subversion and git services commented on LUCENE-4198:
---------------------------------------------------------

Commit 666f93ad4f597edb1a88ef48374ac79a1c09e862 in lucene-solr's branch refs/heads/master
from [~jpountz]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=666f93a ]

LUCENE-4198: Fix propagation of flags in SimpleTextPostingsFormat.


> Allow codecs to index term impacts
> ----------------------------------
>
>                 Key: LUCENE-4198
>                 URL: https://issues.apache.org/jira/browse/LUCENE-4198
>             Project: Lucene - Core
>          Issue Type: Sub-task
>          Components: core/index
>            Reporter: Robert Muir
>            Priority: Major
>             Fix For: master (8.0)
>
>         Attachments: LUCENE-4198-BMW.patch, LUCENE-4198.patch, LUCENE-4198.patch, LUCENE-4198.patch,
>                      LUCENE-4198.patch, LUCENE-4198.patch, LUCENE-4198_flush.patch,
>                      TestSimpleTextPostingsFormat.asf.nightly.master.1466.consoleText.excerpt.txt,
>                      TestSimpleTextPostingsFormat.sarowe.jenkins.nightly.master.681.consoleText.excerpt.txt
>
>
> Subtask of LUCENE-4100.
> That's an example of something similar to impact indexing (though his implementation currently stores a max for the entire term, the problem is the same).
> We can imagine other similar algorithms too: I think the codec API should be able to support these.
> Currently it really doesn't: Stefan worked around the problem by providing a tool to 'rewrite' your index, to which he passes the IndexReader and Similarity. But it would be better if we fixed the codec API.
> One problem is that the postings writer needs to have access to the Similarity. Another problem is that it needs access to the term and collection statistics up front, rather than after the fact.
> This might have some cost (hopefully minimal), so I'm thinking of experimenting in a branch with these changes to see if we can make it work well.
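
To illustrate why the statistics matter, here is a minimal sketch (not Lucene code) of computing a single per-term maximum score, as in the "max for the entire term" approach mentioned above. The names CollectionStats, bm25_like and max_impact_for_term are hypothetical, and the scoring function is a simplified BM25-style formula chosen only for illustration. The point is that the score depends on the document frequency and the average document length, which a writer that streams postings does not know until it has seen them all.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class CollectionStats:
    """Hypothetical container for collection-level statistics."""
    doc_count: int          # total number of documents in the collection
    avg_doc_length: float   # average document length in tokens


def bm25_like(freq, doc_length, doc_freq, stats, k1=1.2, b=0.75):
    """Simplified BM25-style score (an assumption, not Lucene's exact formula)."""
    idf = math.log(1 + (stats.doc_count - doc_freq + 0.5) / (doc_freq + 0.5))
    length_norm = k1 * (1 - b + b * doc_length / stats.avg_doc_length)
    return idf * freq / (freq + length_norm)


def max_impact_for_term(postings, stats):
    """Return the highest score any document can get for this term.

    postings is a list of (term_frequency, doc_length) pairs. The document
    frequency is only known once every posting has been seen, so this sketch
    buffers the whole list before scoring anything.
    """
    doc_freq = len(postings)
    return max(bm25_like(freq, length, doc_freq, stats) for freq, length in postings)


stats = CollectionStats(doc_count=100, avg_doc_length=20.0)
postings = [(1, 10), (3, 10), (2, 40)]
print(max_impact_for_term(postings, stats))
```

A writer that must emit the maximum while postings are still being written cannot buffer like this, which is one way to read the request for statistics "up front".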




