Hi all,
I have the following two indexed values for the field title_t_en:
"\"War and Peace\" by \"Leo Tolstoy\""
"\"Three sisters\" by \"Anton Chekhov\""
I am searching with the query: +((title_t_en:war) (title_t_en:sister))
For each matching document's indexed value, the following code is called:
SimpleHTMLFormatter htmlFormatter = new SimpleHTMLFormatter();
QueryScorer queryScorer = new QueryScorer(luceneQuery);
Highlighter highlighter = new Highlighter(htmlFormatter, queryScorer);
SimpleSpanFragmenter fragmenter = new SimpleSpanFragmenter(queryScorer, 5);
String bestFragments = highlighter.getBestFragments(tokenStream, value, 3, FRAGMENT_DELIMITER);
The code produces the following bestFragments for found values:
"\"<B>War</B> and Peace\" by \"Leo Tolstoy\""
"\"Three <B>sisters</B>\" by \"Anton Chekhov\""
Question:
Why does bestFragments contain more than 5 bytes?
Should getBestFragments() return 3 fragments separated by the
delimiter, where each fragment does not exceed 5 bytes?
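To make my expectation concrete, here is a rough sketch in plain Java (no Lucene involved) of the shape of output I assumed I would get. The class name, the method naiveFragments, and the " ... " delimiter are made up for illustration, and the naive fixed-width splitting is only my assumption; it is not meant to describe how Highlighter actually selects or scores fragments.

```java
import java.util.ArrayList;
import java.util.List;

final class ExpectedFragments {
    // Hypothetical helper: cut the text into pieces of at most fragmentSize
    // characters, keep the first maxFragments pieces, and join them with the
    // delimiter. This ignores scoring and token boundaries entirely.
    static String naiveFragments(String text, int fragmentSize,
                                 int maxFragments, String delimiter) {
        List<String> pieces = new ArrayList<>();
        for (int start = 0;
             start < text.length() && pieces.size() < maxFragments;
             start += fragmentSize) {
            int end = Math.min(start + fragmentSize, text.length());
            pieces.add(text.substring(start, end));
        }
        return String.join(delimiter, pieces);
    }

    public static void main(String[] args) {
        String value = "\"War and Peace\" by \"Leo Tolstoy\"";
        // Same numbers as in my Lucene code: fragment size 5, 3 fragments.
        System.out.println(naiveFragments(value, 5, 3, " ... "));
    }
}
```

In other words, I expected three short pieces joined by the delimiter rather than the whole field value with the match highlighted.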
Regards,
Vlad