lucene-java-user mailing list archives

From Christoph Hermann
Subject Storing additional Metadata with Fields
Date Thu, 14 Oct 2010 10:17:23 GMT

Is there a way to store additional metadata with fields?

My problem is as follows:
I'm extracting extended HTML with Tika. This extended HTML contains references 
to pages, x,y values of the text, etc. I want to be able to retrieve those 
values when a search matches the text.

So when creating the Document, I store a Field for every part of the text 
content of the document I'm currently indexing (let's call it "content").

I have the following content:
<span page="1" x="1" y="1">This is a very</span>
<span page="1" x="1" y="2">interesting text.</span>
<span page="2" x="1" y="1">This is boring text</span>

So I would store the following:

doc.add(new Field("content", "This is a very", Field.Store.YES, 
doc.add(new Field("content", "interesting text", Field.Store.YES, 
doc.add(new Field("content", "This is boring text", Field.Store.YES, 

Is there any way to include the page,x,y values in there?
I'd like to display the page when retrieving the results.

I thought about storing the same field twice, with the page,x,y values added 
at the beginning of the Field, and then extracting those values when 
retrieving the field, but maybe there's a better way?
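
To make the prefix idea concrete, here is a minimal sketch in plain Java with no Lucene calls. The class name SpanMetadataCodec, the "|" separator and the page|x|y|text layout are assumptions made up for illustration, not anything Lucene provides. The encoded string is what would go into the second, prefixed copy of the field, and decode() is what would run on the stored value after retrieval.

```java
public class SpanMetadataCodec {

    /** One decoded span: position metadata plus the original text. */
    public static final class Span {
        public final int page;
        public final int x;
        public final int y;
        public final String text;

        Span(int page, int x, int y, String text) {
            this.page = page;
            this.x = x;
            this.y = y;
            this.text = text;
        }
    }

    /** Builds the prefixed value, e.g. "1|1|2|interesting text.". */
    public static String encode(int page, int x, int y, String text) {
        return page + "|" + x + "|" + y + "|" + text;
    }

    /** Splits a prefixed value back into its metadata and text. */
    public static Span decode(String stored) {
        // Limit of 4 keeps any '|' that occurs inside the text itself.
        String[] parts = stored.split("\\|", 4);
        if (parts.length != 4) {
            throw new IllegalArgumentException("Not a prefixed value: " + stored);
        }
        return new Span(
                Integer.parseInt(parts[0]),
                Integer.parseInt(parts[1]),
                Integer.parseInt(parts[2]),
                parts[3]);
    }

    public static void main(String[] args) {
        // Hypothetical usage: this string would be the value of the prefixed field.
        String stored = encode(1, 1, 2, "interesting text.");
        Span span = decode(stored);
        System.out.println(stored);
        System.out.println("page=" + span.page + " x=" + span.x + " y=" + span.y);
    }
}
```

With this layout the prefixed copy would presumably be stored but not indexed, so that the numbers do not turn into searchable terms, while the plain copy stays the one that is searched.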

Christoph Hermann
Institut für Informatik
Tel: +49 761-203-8171 Fax: +49 761-203-8162
