lucene-solr-user mailing list archives

From Jérôme Etévé <jerome.et...@gmail.com>
Subject Re: Multifield query parser and phrase query behaviour from 1.3 to 1.4
Date Wed, 28 Oct 2009 15:52:43 GMT
Mea maxima culpa,

I had foolishly set the option omitTermFreqAndPositions="true" in an
attempt to save space.
It works when this is set to "false".

However, even when it's set to "true", the highlighting of a field
continues to work even if the phrase search doesn't.
Does the highlighter use a different strategy to match the query terms
in the fields?
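
To illustrate why a phrase search depends on positions being kept in the
index, here is a minimal Python sketch of a toy positional index. It is
not Lucene's or Solr's actual implementation; the names build_index,
phrase_match and term_match, and the sample documents, are hypothetical
and exist only for this illustration.

```python
from collections import defaultdict


def build_index(docs, keep_positions=True):
    """Map term -> {doc_id: [positions]}.

    When keep_positions is False the postings still record which
    documents contain a term, but the position lists stay empty
    (a toy stand-in for an index built without positions).
    """
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        for pos, term in enumerate(text.split()):
            postings = index[term].setdefault(doc_id, [])
            if keep_positions:
                postings.append(pos)
    return index


def term_match(index, term):
    """Single-term lookup: needs only the document ids."""
    return sorted(index.get(term, {}))


def phrase_match(index, phrase):
    """Return ids of documents where the terms occur at consecutive positions."""
    terms = phrase.split()
    if not terms or any(t not in index for t in terms):
        return []
    # Documents that contain every term of the phrase.
    candidates = set(index[terms[0]])
    for t in terms[1:]:
        candidates &= set(index[t])
    hits = []
    for doc_id in sorted(candidates):
        starts = index[terms[0]][doc_id]
        # A hit needs term i at position start + i for some start.
        if any(
            all(start + i in index[t][doc_id] for i, t in enumerate(terms))
            for start in starts
        ):
            hits.append(doc_id)
    return hits


# Hypothetical sample documents, already tokenized by whitespace.
docs = {1: "ingenieur d affaire senior", 2: "affaire d ingenieur"}
with_positions = build_index(docs)
without_positions = build_index(docs, keep_positions=False)
```

In this toy model a single-term lookup behaves the same on both indexes,
while phrase_match can only succeed on the index that kept positions.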

Cheers!

Jerome.

2009/10/27 Jérôme Etévé <jerome.eteve@gmail.com>:
> Actually, here is the difference between the textgen analysis pipeline and ours:
>
> For the phrase "ingenieur d'affaire senior" ,
> Our pipeline gives right after our tokenizer:
>
> term position     1          2      3        4
> term text         ingenieur  d      affaire  senior
>
> 'd' and 'affaire' are separated as different tokens straight away. Our
> filters have no later effect for this phrase.
>
> * The textgen pipeline uses a whitespace tokenizer, so it gives first:
> term position     1          2          3
> term text         ingenieur  d'affaire  senior
> term type         word       word       word
> source start,end  0,9        10,19      20,26
>
> * Then a word delimiter filter splits the token "d'affaire" (and
> generates the concatenation "daffaire" as an extra, stacked token):
> term position     1          2      3        4
> term text         ingenieur  d      affaire  senior
> term type         word       word   word     word
> source start,end  0,9        10,11  12,19    20,26
> stacked token     daffaire (type word, source start,end 10,19)
>
>
> Could you see a reason why title:"d affaire" works with textgen but
> not with our type?
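
The two-step textgen analysis described in the quoted message can be
sketched roughly as follows. This is a simplified Python imitation, not
Solr's WhitespaceTokenizer or WordDelimiterFilter: the function names are
hypothetical, only ASCII letters and digits are treated as word
characters, and stacking the concatenated token on the position of the
last sub-word is an assumption made for the example.

```python
import re


def whitespace_tokenize(text):
    """Split on whitespace; return (position, term) pairs, positions from 1."""
    return [(i, tok) for i, tok in enumerate(text.split(), start=1)]


def word_delimiter(tokens, catenate=True):
    """Split each term on non-alphanumeric characters.

    Every sub-word gets its own position. If a term was split and
    catenate is True, the joined sub-words are emitted as an extra
    token. Assumption: the extra token shares the position of the
    last sub-word.
    """
    out = []
    pos = 0
    for _, term in tokens:
        parts = [p for p in re.split(r"[^0-9A-Za-z]+", term) if p]
        for part in parts:
            pos += 1
            out.append((pos, part))
        if catenate and len(parts) > 1:
            out.append((pos, "".join(parts)))
    return out


phrase = "ingenieur d'affaire senior"
after_tokenizer = whitespace_tokenize(phrase)
after_delimiter = word_delimiter(after_tokenizer)
```

Under these assumptions, the sub-words "d" and "affaire" end up at
consecutive positions, just as they do in the pipeline that splits them
in the tokenizer.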
