lucene-solr-user mailing list archives

From Andrea Gazzarini <a.gazzar...@sease.io>
Subject Re: Query with exact number of tokens
Date Fri, 21 Sep 2018 13:52:52 GMT
Hi Sergio,
assuming that you don't want to disable tokenisation (otherwise you could 
define the indexed field as a string and search it as a whole), the authors 
of "Relevant Search" describe a neat approach using so-called "Sentinel 
Tokens": symbolic tokens that mark the beginning and the end of a value 
(a field value or a query).

SENTINEL_BEGIN<value>SENTINEL_END

Those tokens can be injected at both index and query time, so that the 
returned matches are effectively "exact" matches. A matching document must 
contain the same tokens as the query (what counts as "the same" depends on 
the text analysis you apply), with the sentinels in the expected places 
(beginning and end).
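
To illustrate the idea outside Solr, here is a minimal Python sketch. It is 
not Solr code and not the implementation from the book: the analyze() 
function is a toy stand-in for a real analysis chain, the helper names are 
hypothetical, and the phrase check only simulates what a phrase query over 
the sentinel-wrapped tokens would do.

```python
import re

# Symbolic marker tokens; the literal names are illustrative assumptions.
SENTINEL_BEGIN = "SENTINEL_BEGIN"
SENTINEL_END = "SENTINEL_END"


def analyze(text):
    """Toy analysis chain: lowercase, then split on non-alphanumerics."""
    return [t for t in re.split(r"[^a-z0-9]+", text.lower()) if t]


def with_sentinels(text):
    """Wrap the analyzed tokens with begin/end sentinels."""
    return [SENTINEL_BEGIN] + analyze(text) + [SENTINEL_END]


def contains_phrase(doc_tokens, query_tokens):
    """Return True if query_tokens occur contiguously inside doc_tokens."""
    n = len(query_tokens)
    return any(
        doc_tokens[i:i + n] == query_tokens
        for i in range(len(doc_tokens) - n + 1)
    )


def exact_match(field_value, query):
    """Phrase match with sentinels applied on both the index and query side."""
    return contains_phrase(with_sentinels(field_value), with_sentinels(query))


query = "CENTURY BANCORP, INC."

# Without sentinels, the longer name still contains the query as a phrase.
plain_hit = contains_phrase(analyze("NEW CENTURY BANCORP, INC."), analyze(query))

# With sentinels, only the value with exactly the same tokens matches.
same = exact_match("CENTURY BANCORP, INC.", query)
longer = exact_match("NEW CENTURY BANCORP, INC.", query)
```

In this sketch plain_hit and same are true while longer is false, because 
the extra token "new" sits between SENTINEL_BEGIN and "century". Note that 
the phrase check is order-sensitive, so on its own it does not cover the 
"same tokens in any order" part of the question.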

Best,
Andrea

On 21/09/18 15:00, marotosg wrote:
> Hi,
>
> I have to search for company names where my first requirement is to find
> only exact matches on the company name.
>
> For instance if I search for "CENTURY BANCORP, INC." I shouldn't find "NEW
> CENTURY BANCORP, INC."
> because the result company has the extra keyword "NEW".
>
> I can't use exact match because the sequence of tokens may differ. Basically
> I need to find results where the tokens are the same, in any order, and the
> number of tokens matches.
>
> I have no idea whether it's possible to include the number of tokens in the
> query and have a Solr field that holds that info to match against.
>
> Thanks for your help
> Sergio
>
>
>
> --
> Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html

