lucene-dev mailing list archives

From "Ahmet Arslan (JIRA)" <>
Subject [jira] [Commented] (SOLR-1604) Wildcards, ORs etc inside Phrase Queries
Date Mon, 17 Mar 2014 17:35:45 GMT


Ahmet Arslan commented on SOLR-1604:

Thanks [~erickerickson] for committing this! Here is a summary compiled from the README.txt in
the zip distribution.

After indexing the example documents under example/exampledocs via {noformat}java -jar post.jar *.xml{noformat},
the query string

{noformat}q=manu:"a* c*"&defType=complexphrase{noformat} or {noformat}q={!complexphrase
inOrder=true}manu:"a* c*"{noformat} will return:


  <str name="manu">Apple Computer Inc.</str>
  <str name="manu">ASUS Computer Inc.</str>
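
When these queries are sent over HTTP, the quotes, asterisks and braces need URL-encoding. Below is a minimal sketch of building both equivalent request URLs. The base URL (host, port, core name) and the helper names are assumptions for illustration and do not come from the README.

```python
from urllib.parse import urlencode

# Assumed endpoint: host, port and core name are illustrative only.
SOLR_SELECT = "http://localhost:8983/solr/collection1/select"


def with_deftype(query: str) -> str:
    """Select the parser through the defType request parameter."""
    return SOLR_SELECT + "?" + urlencode({"q": query, "defType": "complexphrase"})


def with_local_params(query: str, in_order: bool = True) -> str:
    """Select the parser through a {!complexphrase ...} local-params prefix."""
    flag = "true" if in_order else "false"
    q = "{!complexphrase inOrder=" + flag + "}" + query
    return SOLR_SELECT + "?" + urlencode({"q": q})


url_a = with_deftype('manu:"a* c*"')
url_b = with_local_params('manu:"a* c*"')
print(url_a)
print(url_b)
```

Both URLs carry the same phrase; only the way the parser is selected differs.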

The *inOrder* parameter can be set in two ways.

1) Its default value is true. To set it to false permanently, register the query parser
under a different name in solrconfig.xml:
 <!-- Un-ordered complex phrase query parser -->
  <queryParser name="unorderedcomplexphrase" class="">
    <bool name="inOrder">false</bool>
  </queryParser>

2) At query time via local params. {noformat}q={!complexphrase inOrder=false df=name}"bla*

To mix ordered and unordered clauses in the same query, wrap each clause in a nested query:

{noformat}+_query_:"{!complexphrase inOrder=true}manu:\"a* c*\"" +_query_:"{!complexphrase inOrder=false df=name}\"bla* pla*\""{noformat}
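
The escaping in the nested form is easy to get wrong by hand. The sketch below assembles the same mixed query programmatically; the `nested` helper is hypothetical (it is not part of Solr or any client library) and only escapes backslashes and double quotes in the inner query.

```python
def nested(local_params: str, query: str) -> str:
    """Wrap a query in the _query_:"..." nested-query syntax.

    Backslashes and double quotes of the inner query are escaped so the
    outer quoted string stays intact.
    """
    inner = "{!" + local_params + "}" + query
    escaped = inner.replace("\\", "\\\\").replace('"', '\\"')
    return '_query_:"' + escaped + '"'


ordered = nested("complexphrase inOrder=true", 'manu:"a* c*"')
unordered = nested("complexphrase inOrder=false df=name", '"bla* pla*"')

# Both clauses are mandatory, hence the leading '+' on each.
combined = "+" + ordered + " +" + unordered
print(combined)
```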

h4. Limitations

h5. maxBooleanClauses

You may need to increase {code:xml}<maxBooleanClauses>1024</maxBooleanClauses>{code}
in solrconfig.xml as the index grows, because {noformat}"a* c*"{noformat} is expanded
into a SpanNearQuery with one clause for every indexed term that matches a wildcard:
{noformat}spanNear([spanOr([manu:a, manu:america, manu:apache, manu:apple, manu:asus, manu:ati]), spanOr([manu:canon,
manu:co, manu:computer, manu:corp, manu:corsair])], 0, false){noformat}
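
To see why the clause count tracks the index vocabulary, here is a toy model of the expansion shown above. It is not Lucene's implementation, and how Lucene counts clauses against the limit is not modelled; the term list is simply the eleven terms from that expansion, and the function names are hypothetical.

```python
# Toy vocabulary for the manu field, taken from the expansion shown above.
MANU_TERMS = ["a", "america", "apache", "apple", "asus", "ati",
              "canon", "co", "computer", "corp", "corsair"]


def expand_prefix(field, prefix, terms):
    """Return field:term for every indexed term starting with prefix."""
    return [f"{field}:{t}" for t in sorted(terms) if t.startswith(prefix)]


def expand_phrase(field, phrase, terms, slop=0, in_order=False):
    """Expand each 'xyz*' word into a spanOr group and count the clauses."""
    groups = [expand_prefix(field, w.rstrip("*"), terms) for w in phrase.split()]
    rendered = ", ".join("spanOr([" + ", ".join(g) + "])" for g in groups)
    text = f"spanNear([{rendered}], {slop}, {str(in_order).lower()})"
    return text, sum(len(g) for g in groups)


text, clause_count = expand_phrase("manu", "a* c*", MANU_TERMS)
print(text)
print(clause_count)
```

Every additional indexed term that starts with "a" or "c" adds one more clause, so a larger index can push a short wildcard phrase past the configured limit.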

h5. Stopwords

Let's say we add *the*, *up*, and *to* to the collection1/conf/stopwords.txt file and re-index
the example documents. While {noformat}q=features:"Stores up to 15,000"{noformat} returns
_"Stores up to 15,000 songs, 25,000 photos, or 150 hours of video"_,
{noformat}q=features:"sto* up to 15*"&defType=complexphrase{noformat} does not return
that document, because SpanNearQuery has no good way to handle stopwords in a way analogous
to PhraseQuery. It is recommended not to use stopword elimination with this query parser.

> Wildcards, ORs etc inside Phrase Queries
> ----------------------------------------
>                 Key: SOLR-1604
>                 URL:
>             Project: Solr
>          Issue Type: Improvement
>          Components: query parsers, search
>    Affects Versions: 1.4
>            Reporter: Ahmet Arslan
>            Assignee: Erick Erickson
>            Priority: Minor
>             Fix For: 4.8, 5.0
>         Attachments:,,,,,,,,,,,
SOLR-1604-alternative.patch, SOLR-1604.patch, SOLR-1604.patch, SOLR-1604.patch, SOLR-1604.patch,
SOLR-1604.patch, SOLR-1604.patch, SOLR1604.patch
> Solr Plugin for ComplexPhraseQueryParser (LUCENE-1486) which supports wildcards, ORs,
ranges, fuzzies inside phrase queries.
