lucene-java-user mailing list archives

From Christoph Hermann
Subject Re: Writing an Analyzer for storing and retrieving a payload (was: Storing additional Metadata with Fields)
Date Fri, 15 Oct 2010 19:22:35 GMT
On Friday, 15 October 2010, 20:13:17, Erick Erickson wrote:


> Have you seen:
> ds/

Sure. There is also

to name a few

> And I don't think payloads are added unless they're specified in the term.
> And even if they are, is your index big enough to care?

Actually, I think I'll be adding the payload to every term.
I'm not worrying about the index size yet; I'll think about that later. I was
just asking whether there might be a better solution.

My current plan is to implement a TokenFilter that identifies some of the
tokens as payload-byte tokens, reads the payload from them, and assigns it to
the following tokens until a new payload-byte token is found.
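To make the intended behaviour concrete, here is a minimal plain-Java sketch of the propagation logic only. It deliberately does not use the Lucene TokenFilter API; the class name PayloadPropagation, the "__PAYLOAD__:" marker prefix, and the TermWithPayload holder are all hypothetical names chosen for illustration, and tokens are modelled as plain strings.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class PayloadPropagation {

    /** Hypothetical prefix that flags a token as a payload carrier (an assumption). */
    static final String MARKER = "__PAYLOAD__:";

    /** A regular term together with the payload that was current when it was seen. */
    static final class TermWithPayload {
        final String term;
        final byte[] payload; // null if no marker token has been seen yet

        TermWithPayload(String term, byte[] payload) {
            this.term = term;
            this.payload = payload;
        }
    }

    /**
     * Walks the token list once. Marker tokens update the current payload and
     * are swallowed; every other token is emitted with the current payload.
     */
    static List<TermWithPayload> propagate(List<String> tokens) {
        List<TermWithPayload> out = new ArrayList<>();
        byte[] current = null;
        for (String token : tokens) {
            if (token.startsWith(MARKER)) {
                String encoded = token.substring(MARKER.length());
                current = encoded.getBytes(StandardCharsets.UTF_8);
                continue; // the marker itself is not emitted as a term
            }
            out.add(new TermWithPayload(token, current));
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> tokens = Arrays.asList(
                MARKER + "A", "this", "is", MARKER + "B", "the", "value");
        for (TermWithPayload t : propagate(tokens)) {
            String p = t.payload == null
                    ? "none"
                    : new String(t.payload, StandardCharsets.UTF_8);
            System.out.println(t.term + " -> " + p);
        }
    }
}
```

In a real filter the same state (the current payload) would live in the filter instance and be reset whenever the stream is reset; that part is not modelled here.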

So instead of only
doc.add(new Field("contents", "this is the value", ...));

I would do
byte[] payload = getCurrentPayload();
doc.add(new Field("contents", payload, ...));
doc.add(new Field("contents", "this is the value", ...));

Then in the Analyzer I can identify the payload (e.g. by its first one or two
bytes), decode it, and use it for the tokens that follow.
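As a rough sketch of the "identify it by the first one or two bytes" idea: the plain-Java helper below prepends a two-byte magic prefix to a payload and strips it again. The class name PayloadMarker and the prefix bytes 0xFF 0xFE are assumptions made for illustration. The code works on raw byte arrays only and makes no claim about how Lucene hands field values to an Analyzer.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class PayloadMarker {

    /** Hypothetical two-byte prefix; 0xFF and 0xFE never occur in valid UTF-8 text. */
    static final byte[] MAGIC = {(byte) 0xFF, (byte) 0xFE};

    /** Returns MAGIC followed by the payload bytes. */
    static byte[] wrap(byte[] payload) {
        byte[] out = new byte[MAGIC.length + payload.length];
        System.arraycopy(MAGIC, 0, out, 0, MAGIC.length);
        System.arraycopy(payload, 0, out, MAGIC.length, payload.length);
        return out;
    }

    /** True if the value starts with the magic prefix. */
    static boolean isMarked(byte[] value) {
        if (value.length < MAGIC.length) {
            return false;
        }
        for (int i = 0; i < MAGIC.length; i++) {
            if (value[i] != MAGIC[i]) {
                return false;
            }
        }
        return true;
    }

    /** Strips the prefix; rejects values that do not carry it. */
    static byte[] unwrap(byte[] value) {
        if (!isMarked(value)) {
            throw new IllegalArgumentException("value has no payload marker");
        }
        return Arrays.copyOfRange(value, MAGIC.length, value.length);
    }

    public static void main(String[] args) {
        byte[] wrapped = wrap(new byte[] {1, 2, 3});
        System.out.println(isMarked(wrapped));
        System.out.println(Arrays.toString(unwrap(wrapped)));
        byte[] text = "this is the value".getBytes(StandardCharsets.UTF_8);
        System.out.println(isMarked(text));
    }
}
```

Choosing prefix bytes that cannot appear in UTF-8 keeps ordinary text from being mistaken for a payload marker; any other unambiguous prefix would do as well.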

Anything wrong with that approach?

Christoph Hermann
Institut für Informatik
Tel: +49 761-203-8171 Fax: +49 761-203-8162
