lucene-dev mailing list archives

From "Paul Elschot (JIRA)" <>
Subject [jira] Commented: (LUCENE-2250) Index positions in document term vectors only
Date Fri, 05 Feb 2010 16:17:27 GMT


Paul Elschot commented on LUCENE-2250:

Basically, this involves transposing the prx file from the current term/docs/posns order
to doc/terms/posns.
The biggest advantage would be for searches that use proximity: since all positions within
a document are needed at the same time, no long seeks would be necessary to get the positions
for scoring a single document.
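
As a rough illustration of the reordering, here is a minimal Python sketch that uses in-memory
dicts as a stand-in for the prx file. The function name `transpose_positions` and the toy data
are hypothetical; this is not Lucene's on-disk format or API.

```python
from collections import defaultdict


def transpose_positions(term_major):
    """Reorder {term: {doc_id: [positions]}} into {doc_id: {term: [positions]}}."""
    doc_major = defaultdict(dict)
    for term, postings in term_major.items():
        for doc_id, positions in postings.items():
            # Copy the list so the two layouts do not share state.
            doc_major[doc_id][term] = list(positions)
    return dict(doc_major)


# Toy index in the current term/docs/posns order (hypothetical data).
term_major = {
    "lucene": {1: [0, 7], 2: [3]},
    "index": {1: [1], 3: [0, 4]},
}

# Same positions, now grouped per document (doc/terms/posns).
doc_major = transpose_positions(term_major)
print(doc_major[1])
```

In the transposed layout, everything needed to check proximity inside document 1 sits under a
single key, which is the in-memory analogue of not having to seek between terms.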

For small documents this could come at a cost: in the current order the number of seeks
might be no bigger than the number of terms in the query, whereas in the order proposed here
the number of seeks would be no bigger than the number of documents containing all terms.
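
The two bounds can be compared on a toy index. The sketch below is only a counting model under
the assumptions stated in the comments (one seek per query term in the current order, one seek
per matching document in the proposed order); `seek_bounds` and the data are hypothetical and
do not measure real disk behaviour.

```python
def seek_bounds(term_major, query_terms):
    """Return (current-order bound, proposed-order bound) on the number of seeks."""
    doc_sets = [set(term_major.get(term, {})) for term in query_terms]
    matching_docs = set.intersection(*doc_sets) if doc_sets else set()
    # Assumption: in term/docs/posns order, each query term needs at most
    # one seek, after which its positions are read sequentially.
    current_order_bound = len(query_terms)
    # Assumption: in doc/terms/posns order, each document containing all
    # query terms needs at most one seek.
    proposed_order_bound = len(matching_docs)
    return current_order_bound, proposed_order_bound


# Four small documents that all contain both query terms (hypothetical data).
small_docs_index = {
    "a": {1: [0], 2: [1], 3: [2], 4: [0]},
    "b": {1: [1], 2: [2], 3: [3], 4: [5]},
}

bounds = seek_bounds(small_docs_index, ["a", "b"])
print(bounds)
```

With many small matching documents and few query terms, the second bound is the larger one,
which is the cost described above.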

Other than that, there are these advantages:
- adding/deleting a doc to/from the prx file is a lot simpler, and
- document term vectors with positions can be taken directly from the prx file.
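
Both points can be sketched with the same kind of in-memory stand-in for the prx file. The
helper names and data below are hypothetical and only contrast the two layouts; they say
nothing about how Lucene actually handles deletes or stores term vectors.

```python
def delete_doc_term_major(term_major, doc_id):
    """Current order: every term's postings are visited to drop one document."""
    visited = 0
    for postings in term_major.values():
        visited += 1
        postings.pop(doc_id, None)
    return visited  # number of term entries visited


def delete_doc_doc_major(doc_major, doc_id):
    """Proposed order: the document's positions form a single entry."""
    doc_major.pop(doc_id, None)
    return 1  # one entry visited


def term_vector_with_positions(doc_major, doc_id):
    """Proposed order: the term vector with positions is the document's entry."""
    return doc_major.get(doc_id, {})


# The same toy index in both layouts (hypothetical data).
by_term = {"x": {1: [0], 2: [2]}, "y": {2: [0]}, "z": {1: [1]}}
by_doc = {1: {"x": [0], "z": [1]}, 2: {"x": [2], "y": [0]}}

vector_doc_2 = dict(term_vector_with_positions(by_doc, 2))
visited_term_major = delete_doc_term_major(by_term, 1)
visited_doc_major = delete_doc_doc_major(by_doc, 1)
print(vector_doc_2, visited_term_major, visited_doc_major)
```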

Has this been tried/discussed before?

> Index positions in document term vectors only
> ---------------------------------------------
>                 Key: LUCENE-2250
>                 URL:
>             Project: Lucene - Java
>          Issue Type: Wish
>          Components: Index
>            Reporter: Paul Elschot
> For searching with positions this might reduce the number of (longer) seeks to one per
> document containing all terms.
