lucene-dev mailing list archives

From Nicolas Lalevée
Subject Re: Payloads
Date Mon, 08 Jan 2007 10:53:11 GMT
On Wednesday 3 January 2007 14:46, Nadav Har'El wrote:
> On Wed, Dec 20, 2006, Michael Busch wrote about "Payloads":
> >..
> > Some weeks ago I started working on an improved design which I would
> > like to propose now. The new design simplifies the API extensions (the
> > Field API remains unchanged) and uses less disk space in most use cases.
> > Now there are only two classes that get new methods:
> > - Token.setPayload()
> >  Use this method to add arbitrary metadata to a Token in the form of a
> > byte[] array.
> >...
> Hi Michael,
> For some uses (e.g., faceted search), one wants to add a payload to each
> document, not per position for some text field. In the faceted search
> example, we could use payloads to encode the list of facets that each
> document belongs to. For this, with the old API, you could have added a
> fixed term to an untokenized field, and add a payload to that entire
> untokenized field.
> With the new API, it seems doing this is much more difficult and requires
> writing some sort of new Analyzer - one that will do the regular analysis
> that I want for the regular fields, and add the payload to the one specific
> field that lists the facets.
> Am I understanding correctly? Or am I missing a better way to do this?

I have looked more closely at how Lucene builds its index, and I realized
that the kind of payload handling in Michael's patch is not designed for the
facet feature. In this patch, the payloads are stored in the postings, i.e.
in the tis, frq, and prx files. A document-level payload that would be
accessed in a scorer would be better placed in the TermVector files, which
are ordered by document and not by term.
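
To illustrate the difference in access pattern, here is a minimal sketch in
plain Java. It does not use any Lucene API or file format: the class
PayloadLayoutSketch, its methods, and the "_facets" term are all hypothetical
names made up for this example. It only contrasts a term-ordered structure
(standing in for the postings) with a doc-ordered one (standing in for a
per-document store such as the TermVector files).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration only; none of these names exist in Lucene.
public class PayloadLayoutSketch {

    /** One posting entry: a document id and the payload attached to it. */
    public static final class Posting {
        final int docId;
        final byte[] payload;

        Posting(int docId, byte[] payload) {
            this.docId = docId;
            this.payload = payload;
        }
    }

    // Term-ordered layout: term -> postings, in the order documents were added.
    private final Map<String, List<Posting>> byTerm = new TreeMap<>();

    // Doc-ordered layout: document id -> payload.
    private final Map<Integer, byte[]> byDoc = new TreeMap<>();

    /** Stores the same payload in both layouts so they can be compared. */
    public void add(int docId, String term, byte[] payload) {
        byTerm.computeIfAbsent(term, t -> new ArrayList<>())
              .add(new Posting(docId, payload));
        byDoc.put(docId, payload);
    }

    /** Term-ordered lookup: walk the term's posting list until the doc is found. */
    public byte[] payloadViaPostings(String term, int docId) {
        List<Posting> postings = byTerm.get(term);
        if (postings == null) {
            return null;
        }
        for (Posting p : postings) {
            if (p.docId == docId) {
                return p.payload;
            }
        }
        return null;
    }

    /** Doc-ordered lookup: go straight to the document's entry. */
    public byte[] payloadViaDocStore(int docId) {
        return byDoc.get(docId);
    }

    public static void main(String[] args) {
        PayloadLayoutSketch sketch = new PayloadLayoutSketch();
        sketch.add(0, "_facets", new byte[] {1, 2});
        sketch.add(1, "_facets", new byte[] {3});

        // Both layouts return the same bytes; only the access path differs.
        System.out.println(Arrays.toString(sketch.payloadViaPostings("_facets", 1)));
        System.out.println(Arrays.toString(sketch.payloadViaDocStore(1)));
    }
}
```

With the term-ordered layout, a scorer that needs the facets of one given
document has to go through the posting list of the fixed term, whereas the
doc-ordered layout is keyed by the document directly.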
