lucene-java-user mailing list archives

From Jerry Stern <>
Subject Re: Highlight the searched word when full-text searching performed
Date Mon, 28 Nov 2005 07:45:22 GMT
  Yes, you're right. Highlighting all hits at once may cause problems. A method for paging
through the hits is needed to avoid this.
  Also, if we read the contents of the original file into a string and pass it to the highlighter
at the searching stage, this could also cause problems when the original file is large. I think
we could use random access to reduce the string size by locating the searched word,
and this may be a general problem that needs to be solved.
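
A minimal sketch of the random-access idea, using only the standard library. It assumes the byte offset of the searched word is already known, which this thread does not explain how to obtain, and it assumes a single-byte encoding so that byte offsets and character offsets coincide. The names `FileWindow`, `readWindow`, `offset`, and `radius` are hypothetical.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;

final class FileWindow {
    /**
     * Reads up to `radius` bytes on either side of `offset`, clamped to the
     * file bounds, and decodes them as ISO-8859-1 (an assumption; a
     * multi-byte encoding could be cut in the middle of a character).
     */
    static String readWindow(String path, long offset, int radius) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(path, "r")) {
            long start = Math.max(0, offset - radius);
            long end = Math.min(raf.length(), offset + radius);
            if (end <= start) {
                return ""; // offset lies outside the file
            }
            byte[] buf = new byte[(int) (end - start)];
            raf.seek(start);    // jump straight to the window, no full read
            raf.readFully(buf); // read exactly the window
            return new String(buf, StandardCharsets.ISO_8859_1);
        }
    }
}
```

The returned window could then be passed to the highlighter in place of the whole file, at the cost of only seeing matches near the chosen offset.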

Erik Hatcher <> wrote:
On 27 Nov 2005, at 00:24, Jerry Stern wrote:
> I wonder how to highlight the searched word when performing a 
> full-text search with Lucene.
> At the indexing stage, the contents of an original file are 
> treated as a FIELD of a Lucene document:
> private static void indexFile(File file, IndexWriter idxWriter)
>         throws IOException {
>     System.out.print("Indexing " + file.getCanonicalPath() + " ......");
>     Document doc = new Document();
>     doc.add(Field.Text("path", file.getAbsolutePath()));
>     doc.add(Field.Text("contents", new FileReader(file)));
>     idxWriter.addDocument(doc);
>     System.out.println("indexed.");
> }
> At the searching stage:
> Highlighter highlighter = new Highlighter(new QueryScorer(query));
> for (int i = 0; i < hits.length(); i++)
> {
>     String text = hits.doc(i).get("contents"); // the text = null.
>     TokenStream tokenStream = analyzer.tokenStream("contents",
>             new StringReader(text));
>     // Get 3 best fragments and separate with a "..."
>     String result = highlighter.getBestFragments(tokenStream,
>             text, 3, "...");
>     System.out.println(result);
> }
> The 'contents' field is not stored in the index, and it is not 
> reasonable to store it there. So the line marked "the text = null" 
> cannot get the 'contents' field from the index.
> I think that the 'text' parameter for the 
> Highlighter.getBestFragments(..) method must be the context string 
> of the searched word. So my question is how can I get the context 
> string of the searched word?

In your case, you'll need to get the "path" field (since that is 
being stored) and then load the file into a String to pass to the 
highlighter. The text to highlight must be stored somewhere, and in 
your case it is on the filesystem only.
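
A minimal sketch of that loading step, using only the standard library. `FileText` and `load` are hypothetical names. It uses `FileReader`, and therefore the platform default charset, only to match the `new FileReader(file)` call at indexing time.

```java
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;

final class FileText {
    /** Reads the whole file at `path` into a String (platform default charset). */
    static String load(String path) throws IOException {
        StringBuilder sb = new StringBuilder();
        try (BufferedReader reader = new BufferedReader(new FileReader(path))) {
            char[] buf = new char[8192];
            int n;
            // Copy the file in chunks until end of stream.
            while ((n = reader.read(buf)) != -1) {
                sb.append(buf, 0, n);
            }
        }
        return sb.toString();
    }
}
```

In the search loop quoted above, the null-returning line would then become something like `String text = FileText.load(hits.doc(i).get("path"));`, with the rest of the highlighting code unchanged.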

Be careful - you're highlighting all hits there, which will cause 
problems if you get a lot of hits.
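
One way to limit the work is to highlight only the hits on the page being displayed. The sketch below computes the index range for a page; `HitPage`, `pageIndex`, and `pageSize` are hypothetical names, not part of Lucene.

```java
final class HitPage {
    final int start; // index of the first hit on the page (inclusive)
    final int end;   // index just past the last hit on the page (exclusive)

    private HitPage(int start, int end) {
        this.start = start;
        this.end = end;
    }

    /** Returns the range of hit indexes for a zero-based page, clamped to totalHits. */
    static HitPage of(int totalHits, int pageIndex, int pageSize) {
        if (pageIndex < 0 || pageSize <= 0) {
            throw new IllegalArgumentException("pageIndex must be >= 0 and pageSize > 0");
        }
        int total = Math.max(0, totalHits);
        long first = (long) pageIndex * pageSize; // long avoids int overflow
        int start = (int) Math.min(first, total);
        int end = (int) Math.min(first + pageSize, total);
        return new HitPage(start, end);
    }
}
```

The loop in the quoted code would then run as `for (int i = page.start; i < page.end; i++)` with `page = HitPage.of(hits.length(), pageIndex, pageSize)`, so only one page of documents is loaded and highlighted per request.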

