lucene-solr-user mailing list archives

From "Tantius, Richard" <R.Tant...@binserv.de>
Subject AW: Edismax query parser and phrase queries
Date Mon, 03 Dec 2012 10:03:14 GMT
Hi,
the use case we have in mind is exact matching for explicit phrases. Our users expect an
explicit phrase to respect not only the order of the terms but also their exact wording.
Therefore, if we search on fields whose data type is not meant for exact matching, we need
to change that for explicit phrases. In a usual query, our default qf fields use advanced
tokenization (for both query processing and indexing), for example stemming via
SnowballPorterFilterFactory. So our idea was to switch the default search fields for explicit
phrases to fields with a simple data type, for example "string" (StrField, without advanced
options), in order to achieve exact matches.

Extending our example from the last mail: 

qf="title text"

Data type of title and text, something like “text_advanced”:

<fieldtype ...
 <analyzer type="index"> <!--(and also <analyzer type="query"> )-->
  <filter class="solr.WordDelimiterFilterFactory" ...>
  <filter class="solr.LowerCaseFilterFactory" />
  <filter class="solr.SnowballPorterFilterFactory" language="German2" />
...

Data type of the additional fields titleExact, textExact:
<fieldType name="string" class="solr.StrField" sortMissingLast="true" omitNorms="true"/>

q="ran away from home" Cat Dog 

-transformTo->

q=( titleExact:"ran away from home" OR textExact:"ran away from home" ) Cat Dog.
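
The transformation above could be sketched as a small pre-processing step. This is only an illustration under stated assumptions, not our production code: the helper name expand_phrases is made up, the field names are the ones from the example, and the pattern assumes well-formed, non-empty phrases without escaped quotes, field prefixes, or +/- prefixes.

```python
import re

# Exact-match fields from the example above; adjust to your schema.
EXACT_FIELDS = ("titleExact", "textExact")

# Any double-quoted phrase. Escaped quotes (\") are not handled.
PHRASE = re.compile(r'"([^"]+)"')


def expand_phrases(query, fields=EXACT_FIELDS):
    """Replace each quoted phrase with an OR group over the exact-match fields."""
    def to_group(match):
        phrase = match.group(1)
        clauses = " OR ".join('{}:"{}"'.format(f, phrase) for f in fields)
        return "( " + clauses + " )"
    return PHRASE.sub(to_group, query)


print(expand_phrases('"ran away from home" Cat Dog'))
# ( titleExact:"ran away from home" OR textExact:"ran away from home" ) Cat Dog
```

The remaining terms (Cat Dog) are left untouched, so they would still be searched against the qf fields by edismax.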

Regards,
Richard.

BINSERV
Gesellschaft für interaktive Konzepte und neue Medien mbH
Software Engineer

Gotenstr. 7-9
53175 Bonn
Tel.:     +49 (0)228 / 4 22 86 - 38 
Fax.:     +49 (0)228 / 4 22 86 - 538
E-Mail:   r.tantius@binserv.de  
Web:      www.binserv.de
      	  www.binforcepro.de

Managing Director: Rüdiger Jakob
District Court: Siegburg HRB 6765
Registered office: Pfarrer-Wichert-Str. 35, 53639 Königswinter


----- Original message -----
From: Jack Krupansky [mailto:jack@basetechnology.com]
Sent: Friday, 30 November 2012 23:04
To: solr-user@lucene.apache.org
Subject: Re: Edismax query parser and phrase queries

I don’t have a simple answer for your stated issue, but maybe part of that is because I’m
not so sure what the exact problem/goal is. I mean, what’s so special about phrase queries
for your app that they need distinct processing from individual terms?

And, ultimately, what goal are you trying to achieve? For instance, how will the outcome of
the query affect what users see and do?

-- Jack Krupansky

From: Tantius, Richard
Sent: Friday, November 30, 2012 8:44 AM
To: solr-user@lucene.apache.org
Subject: Edismax query parser and phrase queries

Hi,

we are using the edismax query parser and execute queries on specific fields by using the
qf option. Like others, we are facing the problem that we do not want explicit phrase queries
to be performed on some of the qf fields, and that we also require additional search fields
for that kind of query.

We tried to expand explicit phrases in a query by implementing some pre-processing logic,
which did not seem very convenient.

So, for example (let's assume qf="title text", and we want phrase queries to be performed on
the additional fields "titleAlt textAlt"): q="ran away from home" Cat Dog -transformTo-> q=(
titleAlt:"ran away from home" OR textAlt:"ran away from home" ) Cat Dog. Unfortunately, this
gets rather complicated if logical operators are involved in the query. Is there some kind
of best practice? Should we, for example, extend the query parser, or stick to our
pre-processing approach?
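
To make the operator problem a bit more concrete, here is a rough sketch of a pre-processing step that tries to cope with some of it. It is a hypothetical illustration, not the logic we actually run: the function name rewrite_phrases is invented, the field names are those from the example, and it only covers a leading + or - and phrases that already carry a field prefix. Escaped quotes and slop or boost suffixes (~2, ^3) are not handled.

```python
import re

# Alternative phrase fields from the example above; adjust to your schema.
ALT_FIELDS = ("titleAlt", "textAlt")

# A quoted phrase with an optional +/- prefix and an optional "field:" prefix.
PHRASE = re.compile(r'(?P<prefix>[+-]?)(?P<field>\w+:)?"(?P<phrase>[^"]*)"')


def rewrite_phrases(query, fields=ALT_FIELDS):
    """Expand unqualified phrases; keep operators, parentheses and fielded phrases."""
    def to_group(match):
        phrase = match.group("phrase")
        # Leave field-qualified or empty phrases exactly as the user wrote them.
        if match.group("field") or not phrase.strip():
            return match.group(0)
        clauses = " OR ".join('{}:"{}"'.format(f, phrase) for f in fields)
        # Re-attach the +/- prefix to the whole group, not to one clause.
        return "{}({})".format(match.group("prefix"), clauses)
    return PHRASE.sub(to_group, query)


print(rewrite_phrases('Cat AND -"ran away" OR title:"big dog"'))
# Cat AND -(titleAlt:"ran away" OR textAlt:"ran away") OR title:"big dog"
```

Because only the phrase itself is substituted, AND/OR/NOT and parentheses around it pass through unchanged. Every further piece of query syntax needs its own special case, though, which is why the approach gets complicated.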


Regards,
Richard.

