lucene-dev mailing list archives

From "Grant Ingersoll"
Subject RE: Adding generic payloads to a Term's posting list
Date Mon, 10 Oct 2005 14:42:38 GMT 

See item #11 of API changes.  Maybe along the lines of what you are
interested in, although I don't know if anyone has even attempted a design
of it.  I would also like to see this, plus the ability to store info at
higher levels in the Index, such as Field (not on a per token basis),
Document (info about the document that spans its fields) and Index (such as
coreference information).  Alas, no time...


>-----Original Message-----
>From: Shane O'Sullivan
>Sent: Monday, October 10, 2005 8:38 AM
>Subject: Adding generic payloads to a Term's posting list
>To the best of my knowledge, it is not possible to add generic 
>data to a Term's posting list.
>By this I mean info that is defined by the search engine, not 
>Lucene itself.
>Whereas Lucene adds some data to the posting lists, such as 
>the term's position within a document, there are many other 
>useful types of information that could be attached to a term.
>Some examples would be in XML documents, to store the depth of 
>a tag in the document, or font information, such as if the 
>term appeared in a header or in the main body of text.
>Are there any plans to add such functionality to the API? If 
>not, where would be the appropriate place to implement these 
>changes? I presume the TermInfosWriter and TermInfosReader 
>would have to be altered, as well as the classes which call 
>them. Could this be done without having to modify the index in 
>such a way that standard Lucene indexes couldn't read it?
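
To make the request above concrete, here is a minimal sketch of what a
per-posting payload could look like. It is a toy in-memory inverted index,
not Lucene's API or on-disk format. The class name PayloadPostingsSketch, the
Posting type, and the two-byte payload encoding (XML depth, header flag) are
all hypothetical and chosen only for illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy in-memory inverted index with per-posting payloads (hypothetical, not Lucene). */
class PayloadPostingsSketch {

    /** One occurrence of a term: document, position, and opaque application data. */
    static final class Posting {
        final int docId;
        final int position;
        final byte[] payload; // defined by the application, opaque to the index

        Posting(int docId, int position, byte[] payload) {
            this.docId = docId;
            this.position = position;
            this.payload = payload.clone(); // defensive copy
        }
    }

    // term -> postings in insertion order
    private final Map<String, List<Posting>> postings = new HashMap<>();

    /** Records one occurrence of a term together with its payload bytes. */
    void add(String term, int docId, int position, byte[] payload) {
        postings.computeIfAbsent(term, k -> new ArrayList<>())
                .add(new Posting(docId, position, payload));
    }

    /** Returns the postings for a term, or an empty list if the term is unknown. */
    List<Posting> postingsFor(String term) {
        List<Posting> list = postings.get(term);
        return list == null ? Collections.<Posting>emptyList() : list;
    }

    public static void main(String[] args) {
        PayloadPostingsSketch index = new PayloadPostingsSketch();
        // Assumed encoding: byte 0 = XML tag depth, byte 1 = 1 if the term is in a header.
        index.add("lucene", 0, 3, new byte[] {2, 1});
        index.add("lucene", 1, 7, new byte[] {4, 0});
        for (Posting p : index.postingsFor("lucene")) {
            System.out.println("doc=" + p.docId + " pos=" + p.position
                    + " depth=" + p.payload[0] + " header=" + (p.payload[1] == 1));
        }
    }
}
```

The index treats the payload as opaque bytes, so the search engine built on
top decides what they mean. Whether real index files could carry such bytes
while staying readable by an unmodified Lucene is the open question raised
above, and this sketch does not address it.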
