lucene-solr-user mailing list archives

From Joe Calderon <calderon....@gmail.com>
Subject highlighter issue
Date Fri, 02 Apr 2010 20:13:00 GMT
hello *, i have a field that indexes the string "the
ex-girlfriend" as these tokens: [the, exgirlfriend, ex, girlfriend].
they are then passed to the edgengram filter, which lets me match
different user spellings and allows for partial highlighting. however,
a token like 'ex' gets generated twice, which should be fine,
except the highlighter seems to highlight that token twice even though
both copies have the same offsets (4,6)

is there a way to make the highlighter not highlight the same token
twice, or do i have to create a token filter that drops tokens
with equal text and offsets?
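
for reference, the dedup logic such a filter would need can be sketched in plain java. this is only a sketch of the idea, not the Lucene TokenFilter API: the Token class and the dedupe method are hypothetical stand-ins, and the offsets used with it are illustrative. it keeps the first token seen for each (start, end, text) triple and drops later copies:

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class DedupTokens {
    // Hypothetical stand-in for an analyzer token: term text plus character offsets.
    // This is NOT a Lucene class; it only models the fields the dedup needs.
    static final class Token {
        final String text;
        final int start;
        final int end;

        Token(String text, int start, int end) {
            this.text = text;
            this.start = start;
            this.end = end;
        }
    }

    // Returns the tokens in their original order, keeping only the first
    // occurrence of each (start, end, text) combination.
    static List<Token> dedupe(List<Token> tokens) {
        Set<String> seen = new HashSet<>();
        List<Token> out = new ArrayList<>();
        for (Token t : tokens) {
            // Offsets come first so the key stays unambiguous even if the text contains ':'.
            String key = t.start + ":" + t.end + ":" + t.text;
            if (seen.add(key)) {
                out.add(t);
            }
        }
        return out;
    }
}
```

a real filter would apply the same check per token inside the analysis chain, and would probably want to reset the seen set for each new input stream.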


basically what's happening now is if i search

'the e', i get:
'<em>Seinfeld</em>The <em>E</em><em>E</em>x-Girlfriend'

for 'the ex', i get:
'<em>Seinfeld</em>The <em>Ex</em><em>Ex</em>-Girlfriend'

and so on


thx much

--joe
