lucene-java-user mailing list archives

From Christopher Condit <>
Subject RE: recovering payload from fields
Date Fri, 26 Feb 2010 21:05:45 GMT
Hi Chris-
> To my knowledge, the character position of the tokens is not preserved by
> Lucene - only the ordinal position of tokens within a document / field is
> preserved.  Thus you need to store this character offset information
> separately, say, as Payload data.

Thanks for the information. So adding the OffsetAttribute at index time doesn't embed the
offset information in the index - it just makes it available to the TokenFilter? I'll try
adding the offset from the attribute to the payload.
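
As a rough sketch of what "adding the offset to the payload" could look like, the snippet
below packs a start and end character offset into an 8-byte array and unpacks it again.
OffsetPayloadCodec and its methods are hypothetical names, not part of Lucene, and the
8-byte big-endian layout is only an assumption. In a TokenFilter, the two ints would
presumably come from OffsetAttribute.startOffset() and endOffset(), and the resulting
bytes would be wrapped in a Payload and set on the PayloadAttribute; that wiring is not
shown here.

```java
import java.nio.ByteBuffer;

// Hypothetical helper (not a Lucene class): serializes a token's character
// offsets into a byte array that could be stored as payload data.
final class OffsetPayloadCodec {
    // Assumed layout: two big-endian 32-bit ints, start offset then end offset.
    static final int PAYLOAD_LENGTH = 8;

    // Pack the start and end character offsets into an 8-byte array.
    static byte[] encode(int startOffset, int endOffset) {
        return ByteBuffer.allocate(PAYLOAD_LENGTH)
                .putInt(startOffset)
                .putInt(endOffset)
                .array();
    }

    // Unpack a payload produced by encode() into {startOffset, endOffset}.
    static int[] decode(byte[] payload) {
        if (payload == null || payload.length != PAYLOAD_LENGTH) {
            throw new IllegalArgumentException("expected an 8-byte offset payload");
        }
        ByteBuffer buf = ByteBuffer.wrap(payload);
        return new int[] { buf.getInt(), buf.getInt() };
    }

    public static void main(String[] args) {
        // Round-trip the offsets of a token spanning characters 5 to 12.
        int[] offsets = decode(encode(5, 12));
        System.out.println(offsets[0] + "-" + offsets[1]);
    }
}
```

A fixed-width layout keeps decoding trivial at search time; a variable-length encoding
would make payloads smaller at the cost of a slightly more involved decoder.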

In terms of getting access to the payloads, is the best way to reconstruct the token stream
(as the Highlighter does)? Or is there an easier way to just get access to the payloads?