lucene-java-user mailing list archives

From Christoph Hermann
Subject Writing an Analyzer for storing and retrieving a payload (was: Storing additional Metadata with Fields)
Date Fri, 15 Oct 2010 14:13:39 GMT
On Thursday, 14 October 2010, 14:43:41, Christoph Hermann wrote:


> It seems a payload gets added to
> every term in the index, so in my case I would store the x, y and page
> values for every word and grow the index far more than I need.
> Is there any approach for preventing this?
> And when searching, how can I access the payloads when displaying the
> results? I haven't found any information on that so far.

Is there any example of how to use payloads?
The questions above also still stand.
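
To make the size concern concrete, here is a rough sketch of how the x, y and page values could be packed into a small fixed-size byte array, which is the kind of raw data a payload carries. The class name PositionPayloadCodec and the 12-byte layout are assumptions made up for illustration; the sketch uses no Lucene classes and does not show how the bytes would be attached to a term.

```java
import java.nio.ByteBuffer;

// Hypothetical helper, not part of Lucene: packs x, y and page into 12 bytes.
class PositionPayloadCodec {
    // Assumed layout: three 4-byte big-endian ints in the order x, y, page.
    static final int SIZE = 3 * Integer.BYTES;

    // Encode the three values into a byte array of length SIZE.
    static byte[] encode(int x, int y, int page) {
        return ByteBuffer.allocate(SIZE).putInt(x).putInt(y).putInt(page).array();
    }

    // Decode a byte array produced by encode() back into {x, y, page}.
    static int[] decode(byte[] data) {
        if (data.length != SIZE) {
            throw new IllegalArgumentException("expected " + SIZE + " bytes, got " + data.length);
        }
        ByteBuffer buf = ByteBuffer.wrap(data);
        return new int[] { buf.getInt(), buf.getInt(), buf.getInt() };
    }

    public static void main(String[] args) {
        byte[] bytes = encode(120, 340, 7);
        int[] v = decode(bytes);
        System.out.println(bytes.length + " bytes: x=" + v[0] + ", y=" + v[1] + ", page=" + v[2]);
    }
}
```

With this layout every term that carries a payload would cost 12 extra bytes, which is why attaching one to every word looks expensive.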

My current problem is that I've written a ContentHandler that parses the 
extended HTML from Tika and sets boost values on the fields it creates, but it 
seems that I need to move all of this into the Analyzer, since using boosts on 
Fields with the same name has no real effect?
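doc.add(new Field("contents", "foo", Field.Store.NO, Field.Index.ANALYZED));
Field bar = new Field("contents", "bar", Field.Store.NO, Field.Index.ANALYZED);
bar.setBoost(1.5f);  // setBoost returns void, so it cannot be chained inside add()
doc.add(bar);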

=> Does this result in a single "contents" field with one common boost value?

If I'm correct, how would I proceed to achieve the desired effect?

Should I put all the HTML from the <body> (from Tika) into one content field 
and let the Analyzer do the work?

Is there an example of an Analyzer that uses payloads available somewhere?

Christoph Hermann
Institut für Informatik
Tel: +49 761-203-8171 Fax: +49 761-203-8162