lucene-solr-user mailing list archives

From Markus Jelsma <markus.jel...@openindex.io>
Subject RE: PostingsHighlighter and analysis
Date Mon, 17 Jun 2013 11:18:28 GMT
Hi,

Any intelligent suggestions for this issue?

Thanks,
Markus 
 
-----Original message-----
> From:Trey Hyde <thyde@centraldesktop.com>
> Sent: Mon 11-Mar-2013 21:44
> To: solr-user@lucene.apache.org
> Subject: PostingsHighlighter and analysis
> 
> debug=timing has told me for a very long time that 99% of my query time for
> slow queries is in the highlighting component, so I've been eagerly awaiting
> the PostingsHighlighter for quite some time. Mean query times are 50ms or
> less, but certain queries can generate more than 30s worth of highlighting.
> Now that it's here I've been somewhat disappointed, because I can't use it:
> so many common analyzers emit tokens out of order, which apparently is not
> compatible with storeOffsetsWithPositions.
> 
> The only analyzer on the "bad" list in LUCENE-4641 that is really critical
> to our searches is the WordDelimiter filter.
> 
> My current index-time filter config (which I believe has been unchanged for
> me for 5+ years):
>  <filter class="solr.WordDelimiterFilterFactory" splitOnCaseChange="1" generateWordParts="1"
>          generateNumberParts="1" catenateWords="1" catenateNumbers="1"
>          catenateAll="0"/>
> 
> Does anyone have any suggestions for dealing with this? Perhaps limiting
> certain options will always produce tokens in order?
> 
> Thanks
> 
> Trey Hyde 
> Director of Engineering
> Email thyde@centraldesktop.com

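To make the ordering problem in the quoted message concrete, here is a minimal
Python sketch of a check for offsets that "go backwards" in a token stream.
Everything in it is an assumption for illustration: the Token type, the
first_backwards_offset helper, and the sample streams are hand-written and
hypothetical. They are not output captured from WordDelimiterFilter and not
part of any Lucene or Solr API.

```python
from typing import List, NamedTuple, Optional


class Token(NamedTuple):
    """Hypothetical stand-in for one analyzed token and its character offsets."""
    term: str
    start: int
    end: int


def first_backwards_offset(tokens: List[Token]) -> Optional[int]:
    """Return the index of the first token with an invalid or backwards offset.

    A token is flagged when its start is negative, its end precedes its
    start, or its start is smaller than the previous token's start.
    Returns None when every token is in order.
    """
    last_start = 0
    for i, tok in enumerate(tokens):
        if tok.start < 0 or tok.end < tok.start or tok.start < last_start:
            return i
        last_start = tok.start
    return None


# Hand-written example streams for the input "PowerShot500" (assumed, not
# produced by a real analyzer).
in_order = [Token("Power", 0, 5), Token("Shot", 5, 9), Token("500", 9, 12)]

# A catenated token appended after its parts restarts at offset 0, so the
# start offset moves backwards relative to the preceding token.
out_of_order = [
    Token("Power", 0, 5),
    Token("Shot", 5, 9),
    Token("PowerShot", 0, 9),
    Token("500", 9, 12),
]

if __name__ == "__main__":
    print(first_backwards_offset(in_order))
    print(first_backwards_offset(out_of_order))
```

Running a check like this over the tokens an analysis chain actually produces
(for example, as shown on the Solr admin analysis page) with individual
WordDelimiterFilterFactory options switched on and off is one way to see which
options are responsible for out-of-order offsets.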