lucene-solr-user mailing list archives

From Craig Smiles <smile...@gmail.com>
Subject Edismax, stopword query and non-existent fields
Date Fri, 02 Feb 2018 16:18:44 GMT
I recently received a requirement from a user to allow queries that consist
of stopwords alone. I discovered that edismax already does this for me; all
I had to do was remove the StopFilterFactory from our index analyzer so
that the stopwords actually exist in our index.

So suppose I have a collection with 2 fields: an id with type string and a
field called "test" with type text_en. I remove the StopFilterFactory from
the index analyzer but add it to the query analyzer. I then create 2
documents as below:

{ "id": 1, "test": "and" }
{ "id": 2, "test": "foo" }

"and" is a stopword. The below query only returns the document with id: 2
as expected:

http://localhost:8983/solr/test/select?defType=edismax&qf=test&q=and+foo

The below query returns only our document with id: 1 as expected:

http://localhost:8983/solr/test/select?defType=edismax&qf=test&q=and

So far so good.

We also have a requirement to allow partial searching on only some fields,
so some of our fields have an ngram equivalent, which we suffix with
"_ngram". We'd sometimes end up with two fields: fieldname and
fieldname_ngram. When building the query we add the "_ngram" name for all
our fields, even the ones that don't have an ngram equivalent, so the query
will sometimes contain a qf with a non-existent field:

http://localhost:8983/solr/test/select?defType=edismax&qf=test&qf=test_ngram&q=foo

This returns our id: 2 document, great! Notice that the non-existent
test_ngram field is passed in a qf parameter. However, if I run a query
with a stopword:

http://localhost:8983/solr/test/select?defType=edismax&qf=test&qf=test_ngram&q=and

Then no documents are returned. This is inconvenient for us. It would be
better if edismax were more lenient with non-existent fields. It also seems
inconsistent with the below query, which returns the document with id: 1:

http://localhost:8983/solr/test/select?defType=edismax&qf=test&qf=test_ngram&q=and&stopwords=false
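
For reference, the qf lists in the URLs above are built roughly as in the
sketch below. This is a minimal Python illustration, not our actual code:
the names build_qf and select_url are made up for the example, and it
assumes every field simply gets an "_ngram" twin added to the query whether
or not that twin exists in the schema.

```python
from urllib.parse import urlencode

# Same endpoint as the example URLs in this message.
SOLR_SELECT = "http://localhost:8983/solr/test/select"


def build_qf(fields):
    """Return each field followed by its "_ngram" twin (hypothetical helper).

    No check is made that the twin exists in the schema, which mirrors the
    behaviour described in this message.
    """
    qf = []
    for name in fields:
        qf.append(name)
        qf.append(name + "_ngram")
    return qf


def select_url(query, fields):
    """Build an edismax select URL with one qf parameter per field."""
    params = [("defType", "edismax")]
    params += [("qf", name) for name in build_qf(fields)]
    params.append(("q", query))
    return SOLR_SELECT + "?" + urlencode(params)


print(select_url("foo", ["test"]))
```

Calling select_url("and", ["test"]) produces the stopword-only query that
returns no documents.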

Would it be possible to ignore non-existent fields when a query only
contains stopwords? Or is there a good reason why this can't be implemented?
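
A client-side workaround would presumably be to drop the non-existent
fields before building the query. Below is a minimal Python sketch of that
idea. It assumes the field list has already been fetched from the Schema
API (GET /solr/test/schema/fields) and parsed from JSON; the function name
existing_qf and the sample payload are hypothetical, and dynamic fields are
not taken into account.

```python
def existing_qf(candidate_fields, schema_fields_response):
    """Keep only the candidate qf fields that the schema actually defines.

    schema_fields_response is assumed to be the parsed JSON body of
    GET /solr/<collection>/schema/fields, i.e. a dict with a "fields" list
    whose entries each carry a "name". Dynamic fields are ignored here.
    """
    known = {f["name"] for f in schema_fields_response.get("fields", [])}
    return [name for name in candidate_fields if name in known]


# Hypothetical payload matching the two-field collection described above.
sample_response = {
    "fields": [
        {"name": "id", "type": "string"},
        {"name": "test", "type": "text_en"},
    ]
}

print(existing_qf(["test", "test_ngram"], sample_response))
```

This only hides the problem on our side, though, which is why I'm asking
whether edismax itself could handle it.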

Regards,
Craig
