lucene-dev mailing list archives

From "Ahmet Arslan (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (SOLR-1604) Wildcards, ORs etc inside Phrase Queries
Date Mon, 17 Mar 2014 17:35:45 GMT

    [ https://issues.apache.org/jira/browse/SOLR-1604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13938085#comment-13938085 ]

Ahmet Arslan commented on SOLR-1604:
------------------------------------

Thanks [~erickerickson] for committing this! Here is what I compiled from the README.txt in
the zip distro.

The examples below assume that the example documents under example/exampledocs have been
indexed with the 'java -jar post.jar *.xml' utility.

The query string

{noformat}q=manu:"a* c*"&defType=complexphrase{noformat} or {noformat}q={!complexphrase inOrder=true}manu:"a* c*"{noformat}
returns the documents below. Example request:

http://localhost:8983/solr/collection1/select?q=manu:%22a*%20c*%22&defType=complexphrase&fl=manu

{code:xml}
<doc>
  <str name="manu">Apple Computer Inc.</str>
</doc>
<doc>
  <str name="manu">ASUS Computer Inc.</str>
</doc>
{code}
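
As a rough client-side sketch (not part of the README), the request URL above can be built with Python's standard library. The helper name complexphrase_url is hypothetical, and the host, core name and field are simply the ones from the example; leaving ':' and '*' unescaped is a choice made here only so the result matches the URL shown above.

```python
from urllib.parse import quote, urlencode


def complexphrase_url(base, query, fields):
    """Build a select URL that routes `query` through the complexphrase parser.

    `complexphrase_url` is a hypothetical helper for illustration only.
    """
    params = {"q": query, "defType": "complexphrase", "fl": fields}
    # quote_via=quote encodes spaces as %20 (not '+'); ':' and '*' are kept
    # literal purely to reproduce the URL from the example.
    return base + "?" + urlencode(params, safe=":*", quote_via=quote)


url = complexphrase_url(
    "http://localhost:8983/solr/collection1/select",  # example host and core
    'manu:"a* c*"',
    "manu",
)
print(url)
```

The double quotes and the space inside the phrase are percent-encoded, which is the part that is easy to get wrong when typing such URLs by hand.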


The *inOrder* parameter can be set in two ways.

1) Its default value is true. To set it to false permanently, register the query parser
under a different name in solrconfig.xml:
{code:xml}
 <!-- Un-ordered complex phrase query parser -->
  <queryParser name="unorderedcomplexphrase" class="org.apache.solr.search.ComplexPhraseQParserPlugin">
    <bool name="inOrder">false</bool>
  </queryParser>
{code}

2) At query time, via local params: {noformat}q={!complexphrase inOrder=false df=name}"bla* pla*"{noformat}

To mix ordered and unordered clauses in the same query, combine them as nested queries:

{noformat}
+_query_:"{!complexphrase inOrder=true}manu:\"a* c*\""  +_query_:"{!complexphrase inOrder=false df=name}\"bla* pla*\""
{noformat}
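
The nested form above needs the inner double quotes escaped, which is error-prone by hand. A minimal sketch of composing it programmatically follows; the helper name nested_complexphrase is hypothetical, and it only handles backslash and double-quote escaping, nothing else.

```python
def nested_complexphrase(local_params, body):
    """Wrap `body` as a required _query_ clause parsed by complexphrase.

    Hypothetical helper: escapes backslashes first, then double quotes,
    so the body survives inside the outer _query_:"..." string.
    """
    escaped = body.replace("\\", "\\\\").replace('"', '\\"')
    return '+_query_:"{!complexphrase ' + local_params + "}" + escaped + '"'


q = " ".join(
    [
        nested_complexphrase("inOrder=true", 'manu:"a* c*"'),  # ordered clause
        nested_complexphrase("inOrder=false df=name", '"bla* pla*"'),  # unordered clause
    ]
)
print(q)
```

The resulting string still has to be URL-encoded before it is sent as the q parameter.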

h4. Limitations

h5. maxBooleanClauses

Depending on index size, you may need to increase {code:xml}<maxBooleanClauses>1024</maxBooleanClauses>{code}
in solrconfig.xml, because {noformat}"a* c*"{noformat} is expanded into a SpanNearQuery
that lists every indexed term matching each wildcard:
{noformat}
spanNear([spanOr([manu:a, manu:america, manu:apache, manu:apple, manu:asus, manu:ati]), spanOr([manu:canon, manu:co, manu:computer, manu:corp, manu:corsair])], 0, false)
{noformat}
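
To see why the clause count grows with the index, here is a toy model of the expansion. It is not Lucene's implementation: the function names are hypothetical, the vocabulary is a hand-picked subset of manu terms from the example docs, and only trailing-'*' prefix wildcards are handled.

```python
def expand_prefix(prefix, vocabulary):
    """Return every term in the (toy) vocabulary that starts with `prefix`."""
    return sorted(t for t in vocabulary if t.startswith(prefix))


def span_near_sketch(field, phrase, vocabulary, slop=0, in_order=False):
    """Render a phrase of prefix wildcards the way the expanded query prints.

    Toy model only: each 'xyz*' token becomes a spanOr over matching terms.
    """
    clauses = []
    for token in phrase.split():
        terms = expand_prefix(token.rstrip("*"), vocabulary)
        clauses.append("spanOr([" + ", ".join(f"{field}:{t}" for t in terms) + "])")
    return f"spanNear([{', '.join(clauses)}], {slop}, {str(in_order).lower()})"


# Assumed vocabulary for illustration; a real index holds many more terms.
vocab = {"a", "america", "apache", "apple", "asus", "ati", "belkin",
         "canon", "co", "computer", "corp", "corsair", "inc"}

expanded = span_near_sketch("manu", "a* c*", vocab)
term_count = sum(len(expand_prefix(p, vocab)) for p in ("a", "c"))
print(expanded)
print(term_count)
```

With this small vocabulary the two wildcards already expand to 11 terms; on a large index a short prefix can match far more terms than the configured limit allows.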

h5. Stopwords

Let's say we add *the*, *up*, *to* to the collection1/conf/stopwords.txt file and re-index the
example docs.
While {noformat}q=features:"Stores up to 15,000"{noformat} returns
_"Stores up to 15,000 songs, 25,000 photos, or 150 hours of video"_,
{noformat}q=features:"sto* up to 15*"&defType=complexphrase{noformat} does not return
that document, because SpanNearQuery has no good way to handle stopwords in a way analogous
to PhraseQuery. It is recommended not to use stopword elimination with this query parser.

> Wildcards, ORs etc inside Phrase Queries
> ----------------------------------------
>
>                 Key: SOLR-1604
>                 URL: https://issues.apache.org/jira/browse/SOLR-1604
>             Project: Solr
>          Issue Type: Improvement
>          Components: query parsers, search
>    Affects Versions: 1.4
>            Reporter: Ahmet Arslan
>            Assignee: Erick Erickson
>            Priority: Minor
>             Fix For: 4.8, 5.0
>
>         Attachments: ASF.LICENSE.NOT.GRANTED--ComplexPhrase.zip, ComplexPhrase-4.2.1.zip, ComplexPhrase-4.7.zip, ComplexPhrase.zip, ComplexPhrase.zip, ComplexPhrase.zip, ComplexPhrase.zip, ComplexPhrase.zip, ComplexPhrase.zip, ComplexPhraseQueryParser.java, ComplexPhrase_solr_3.4.zip, SOLR-1604-alternative.patch, SOLR-1604.patch, SOLR-1604.patch, SOLR-1604.patch, SOLR-1604.patch, SOLR-1604.patch, SOLR-1604.patch, SOLR1604.patch
>
>
> Solr Plugin for ComplexPhraseQueryParser (LUCENE-1486) which supports wildcards, ORs, ranges, fuzzies inside phrase queries.



--
This message was sent by Atlassian JIRA
(v6.2#6252)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org

