lucene-solr-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Steve Rowe <sar...@gmail.com>
Subject Re: Solr Shingle is not working properly in solr 6.5.0
Date Tue, 04 Apr 2017 21:32:14 GMT
Hi Aman,

I’ve created <https://issues.apache.org/jira/browse/SOLR-10423> for this problem.

--
Steve
www.lucidworks.com

> On Mar 31, 2017, at 7:34 AM, Aman Deep Singh <amandeep.cool99@gmail.com> wrote:
> 
> Hi Rich,
> Query creation is correct only thing what causing the problem is that
> Boolean + query while building the lucene query which causing all tokens to
> be matched in the document (equivalent of mm=100%) even though I use mm=1
> it was using BOOLEAN + query as
> normal query one plus one abc
> Lucene query -
> +(((+nameShingle:one plus +nameShingle:plus one +nameShingle:one abc))
> ((+nameShingle:one plus +nameShingle:plus one abc)) ((+nameShingle:one plus
> one +nameShingle:one abc)) (nameShingle:one plus one abc))
> 
> Now since my doc contains only one plus one thus --
> one plus ,plus one, one plus one
> thus due to Boolean + it was not matching.
> Thanks,
> Aman Deep Singh
> 
> On Fri, Mar 31, 2017 at 4:41 PM Rick Leir <rleir@leirtech.com> wrote:
> 
>> Hi Aman
>> Did you try the Admin Analysis tool? It will show you which filters are
>> effective at index and query time. It will help you understand why you are
>> not getting a mach.
>> Cheers -- Rick
>> 
>> On March 31, 2017 2:36:33 AM EDT, Aman Deep Singh <
>> amandeep.cool99@gmail.com> wrote:
>>> Hi,
>>> I was trying to use the shingle filter but it was not creating the
>>> query as
>>> desirable.
>>> 
>>> my schema is
>>> <fieldType name="cust_shingle" class="solr.TextField"
>>> positionIncrementGap=
>>> "100"> <analyzer> <tokenizer class="solr.StandardTokenizerFactory"/>
>>> <filter
>>> class="solr.ShingleFilterFactory" outputUnigrams="false"
>>> maxShingleSize="4"
>>> /> <filter class="solr.LowerCaseFilterFactory"/> </analyzer>
>>> </fieldType>
>>> <field name="nameShingle" type="cust_shingle" indexed="true"
>>> stored="true"/>
>>> 
>>> my solr query is
>>> 
>> http://localhost:8983/solr/productCollection/select?defType=edismax&debugQuery=true&q=one%20plus%20one%20four&qf=nameShingle&
>>> *sow=false*&wt=xml
>>> 
>>> and it was creating the parsed query as
>>> <str name="parsedquery">
>>> (+(DisjunctionMaxQuery(((+nameShingle:one plus +nameShingle:plus one
>>> +nameShingle:one four))) DisjunctionMaxQuery(((+nameShingle:one plus
>>> +nameShingle:plus one four))) DisjunctionMaxQuery(((+nameShingle:one
>>> plus
>>> one +nameShingle:one four))) DisjunctionMaxQuery((nameShingle:one plus
>>> one
>>> four)))~1)/no_coord
>>> </str>
>>> <str name="parsedquery_toString">
>>> *+((((+nameShingle:one plus +nameShingle:plus one +nameShingle:one
>>> four))
>>> ((+nameShingle:one plus +nameShingle:plus one four)) ((+nameShingle:one
>>> plus one +nameShingle:one four)) (nameShingle:one plus one four))~1)*
>>> </str>
>>> 
>>> 
>>> So ideally token creations is perfect but in the query it is using
>>> boolean + operator which is causing the problem as if i have a document
>>> with name as
>>> "one plus one" ,according to the shingles it has to matched as its
>>> token
>>> will be  ("one plus","one plus one","plus one") .
>>> I have tried using the q.op and played around the mm also but nothing
>>> is
>>> giving me the correct response.
>>> Any idea how i can fetch that document even if the document is missing
>>> any
>>> token.
>>> 
>>> My expected response will be getting the document
>>> "one plus one" even the user query has any additional term like "one
>>> plus
>>> one two" and so on.
>>> 
>>> 
>>> Thanks,
>>> Aman Deep Singh
>> 
>> --
>> Sent from my Android device with K-9 Mail. Please excuse my brevity.


Mime
View raw message