lucene-java-user mailing list archives

From "Karsten F." <>
Subject Re: highlighter / fragmenter performance for large fields
Date Thu, 16 Oct 2008 10:02:42 GMT

Hi Brian,

I don't know the internals of highlighting ("explanation") in Lucene.
But I know that XTF can handle very large documents (above 100 MByte) with
very fast highlighting. The difference from your approach is that XTF divides
the document into small (overlapping) chunks and stores the original text
separately as XML, connected to the Lucene indexed fields via numbered XML
nodes.
For large texts (above 200 KByte), it is the best tool I know.
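
To make the chunking idea concrete, here is a minimal sketch in plain Java. It is not XTF's actual code and does not use any Lucene API; the class name OverlappingChunker, the Chunk type, and the chunkSize/overlap parameters are all made up for illustration. The assumption is that each chunk would be indexed as its own small unit, with the chunk id and start offset pointing back into the separately stored original text, so that highlighting only has to look at the chunks that matched.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of XTF or Lucene.
public class OverlappingChunker {

    /** One chunk of the original text plus the offset where it starts. */
    public static final class Chunk {
        public final int id;      // running chunk number (stands in for a node number)
        public final int start;   // character offset into the original text
        public final String text; // the chunk's own text

        Chunk(int id, int start, String text) {
            this.id = id;
            this.start = start;
            this.text = text;
        }
    }

    /**
     * Splits text into chunks of at most chunkSize characters, where each
     * chunk repeats the last `overlap` characters of the previous one so
     * that a phrase crossing a chunk boundary still appears whole somewhere.
     */
    public static List<Chunk> split(String text, int chunkSize, int overlap) {
        if (chunkSize <= 0 || overlap < 0 || overlap >= chunkSize) {
            throw new IllegalArgumentException("need 0 <= overlap < chunkSize");
        }
        List<Chunk> chunks = new ArrayList<>();
        int step = chunkSize - overlap; // how far each new chunk advances
        int id = 0;
        for (int start = 0; start < text.length(); start += step) {
            int end = Math.min(text.length(), start + chunkSize);
            chunks.add(new Chunk(id++, start, text.substring(start, end)));
            if (end == text.length()) {
                break; // the last chunk reached the end of the text
            }
        }
        return chunks;
    }
}
```

With something like this, a hit in one chunk means the highlighter only has to process a few kilobytes instead of the whole 10 MByte field. A real implementation would more likely split on word or sentence boundaries than on raw character counts.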

Best regards

Beard, Brian wrote:
> We index some documents which have an "all" field containing all of the
> data which can be searched on.
> One of the problems we're having is that when this field is, say, 10 MBytes,
> the highlighter takes about a second to calculate the best fragments. The
> search only takes 30 milliseconds. I've accommodated the load time for
> the text which is about 5-10X faster in general, so 0.1-0.2 seconds for
> loading text from the document, and the other 0.8-0.9 performing
> highlighting.
