lucene-solr-user mailing list archives

From Lasitha Wattaladeniya <watt...@gmail.com>
Subject Re: Highlighting words with special characters
Date Wed, 19 Jul 2017 08:59:51 GMT
Update:

I changed UAX29URLEmailTokenizerFactory to StandardTokenizerFactory, and
highlighted text fragments now show up for the indexed email text.

But I don't understand this behavior. Can someone please shed some light on it?
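
For reference, here is a rough sketch of how the two tokenizers split the
example address differently. It is a simplified Python approximation, not
Lucene's actual implementation: the regular expressions and the function names
standard_like and url_email_like are assumptions made for illustration only,
and the sketch does not by itself explain the highlighting difference.

```python
import re

# Assumed, simplified patterns; the real tokenizers follow the UAX#29 rules.
EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)+")
WORD = re.compile(r"[A-Za-z0-9]+(?:[.'][A-Za-z0-9]+)*")


def _words(text, base=0):
    """Return (token, start, end) tuples, shifting offsets by `base`."""
    return [(m.group(), base + m.start(), base + m.end())
            for m in WORD.finditer(text)]


def standard_like(text):
    """Approximate StandardTokenizer: breaks at '@', keeps '.' between letters."""
    return _words(text)


def url_email_like(text):
    """Approximate UAX29URLEmailTokenizer: an email address stays one token."""
    tokens, pos = [], 0
    for m in EMAIL.finditer(text):
        tokens.extend(_words(text[pos:m.start()], pos))
        tokens.append((m.group(), m.start(), m.end()))
        pos = m.end()
    tokens.extend(_words(text[pos:], pos))
    return tokens


if __name__ == "__main__":
    address = "test.fsdg@ran.com"
    print(standard_like(address))   # two tokens: 'test.fsdg' and 'ran.com'
    print(url_email_like(address))  # one token spanning the whole address
```

With StandardTokenizerFactory the ngrams are built from two shorter tokens,
each with its own offsets; with UAX29URLEmailTokenizerFactory they are built
from a single token covering offsets 0 to 17.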

On 18 Jul 2017 14:18, "Lasitha Wattaladeniya" <wattale@gmail.com> wrote:

> Furthermore, the ngram field has the following tokenizer/filter chain at
> index and query time:
>
> UAX29URLEmailTokenizerFactory (only in index)
> StopFilterFactory
> LowerCaseFilterFactory
> ASCIIFoldingFilterFactory
> EnglishPossessiveFilterFactory
> StemmerOverrideFilterFactory (only in query)
> NGramTokenizerFactory (only in index)
>
> Regards,
> Lasitha
>
> On 18 Jul 2017 14:11, "Lasitha Wattaladeniya" <wattale@gmail.com> wrote:
>
>> Hi devs,
>>
>> I have set up Solr highlighting with the default configuration (I only
>> changed fragsize to 0 to match any field length). It worked fine, but
>> recently I discovered that it doesn't highlight words with special
>> characters in the middle.
>>
>> For example, let's say I have indexed the email address test.fsdg@ran.com
>> in an ngram field. When I search for the partial text fsdg, I get
>> results, but the match is not highlighted. It works as expected in all
>> other scenarios.
>>
>> The ngram field has termVectors, termPositions, termOffsets set to true.
>>
>> Can somebody please suggest what may be wrong here?
>>
>> (Sorry for the unstructured text; typed on a mobile phone.)
>>
>> Regards
>> Lasitha
>>
>
