lucene-dev mailing list archives

From Chuck Williams
Subject Prefix and general wildcards
Date Fri, 09 Jun 2006 17:50:42 GMT
Hi all,

I need to support query expressions like *xyz and possibly *lmn*.  The
former can be done with high search efficiency by storing (delimited)
reversed tokens and the latter by storing all (delimited) rotations for
each token.  However, both of these techniques have high index overhead,
the rotations being considerably worse than the reversals.  In
principle, nothing needs to be stored for the reversed or rotated tokens
other than the tokens themselves, since their position and term vector
information is the same as that of the base token.
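
To make the two techniques concrete, here is a minimal sketch in plain
Python rather than against the Lucene API.  The "$" delimiter, the
function names, and the sorted-list index are all assumptions made for
illustration; a real index would use its own term dictionary.  A suffix
query such as *xyz becomes a prefix lookup on the reversed tokens, and an
infix query such as *lmn* becomes a prefix lookup on the rotations.

```python
from bisect import bisect_left

DELIM = "$"  # hypothetical delimiter, assumed never to occur inside a token


def reversed_term(token):
    """Delimited reversed token: one extra index term per token."""
    return [DELIM + token[::-1]]


def rotations(token):
    """All rotations of token + DELIM: len(token) + 1 extra terms per token."""
    marked = token + DELIM
    return [marked[i:] + marked[:i] for i in range(len(marked))]


def build_index(tokens, expand):
    """Return a sorted list of (index_term, base_token) pairs."""
    entries = set()
    for token in tokens:
        for term in expand(token):
            entries.add((term, token))
    return sorted(entries)


def prefix_lookup(index, prefix):
    """Collect base tokens whose index term starts with prefix."""
    # (prefix,) sorts before every (term, token) pair with term >= prefix.
    start = bisect_left(index, (prefix,))
    matches = set()
    for term, token in index[start:]:
        if not term.startswith(prefix):
            break
        matches.add(token)
    return matches


def suffix_query(rev_index, suffix):
    """*xyz: prefix lookup on the delimited, reversed suffix."""
    return prefix_lookup(rev_index, DELIM + suffix[::-1])


def infix_query(rot_index, infix):
    """*lmn*: any rotation that starts with the infix marks a match."""
    return prefix_lookup(rot_index, infix)


tokens = ["abcxyz", "xyzabc", "almnop", "lmn", "plain"]
rev_index = build_index(tokens, reversed_term)
rot_index = build_index(tokens, rotations)

print(sorted(suffix_query(rev_index, "xyz")))
print(sorted(infix_query(rot_index, "lmn")))
print(len(rev_index), len(rot_index))
```

The last line prints the number of extra terms each scheme adds for the
same five tokens, which shows why the rotations cost so much more than
the reversals: one term per token versus one term per character plus one.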

Have others found a better solution for this?

If not, it occurs to me that one simple and substantial optimization is
to support a token filter for term vectors, i.e. pass tokens through an
additional filter for addition to term vectors.  Unless there is a
better solution, I'll post such a patch.

Thanks for any advice,

