lucene-java-user mailing list archives

From baris.ka...@oracle.com
Subject Re: FuzzyQuery
Date Mon, 10 Jun 2019 18:24:51 GMT
[+contentDFLT:"city nashua", +contentDFLT:"region new-hampshire", 
+contentDFLT:"country united states", contentDFLT:street contentDFLT:mains]

QueryParser chops the input into two pieces for
parser.parse("street=\"MAINS\"");

The index has a TextField named contentDFLT with the following data:
street="MAIN" city="NASHUA" municipality="HILLSBOROUGH" region="NEW HAMPSHIRE" country="UNITED STATES"


When I set street=\"MAINS~\" with the parser,
I get the following:
[+contentDFLT:"city nashua", +contentDFLT:"region new-hampshire", 
+contentDFLT:"country united states", contentDFLT:street contentDFLT:mains]

The double quotation marks (") are probably messing this up, as you were saying.
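
As an aside on why case matters for the fuzzy clause (see Atri's note further down: StandardAnalyzer lowercases terms at index time, while a hand-built query is not analyzed), here is a minimal sketch. It assumes plain Levenshtein distance as a stand-in for the edit distance FuzzyQuery uses; FuzzyQuery also counts transpositions by default, which this sketch ignores. The class and method names are hypothetical and not part of Lucene, and "main" is only the assumed indexed form of MAIN.

```java
// Hypothetical helper, not Lucene API: plain Levenshtein distance.
final class EditDistanceSketch {

    // Classic dynamic-programming Levenshtein distance using two rolling rows.
    static int distance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j; // turning "" into the first j chars of b takes j insertions
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i; // turning the first i chars of a into "" takes i deletions
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1),
                                   prev[j - 1] + cost);
            }
            int[] tmp = prev; // swap rows so prev always holds the finished row
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    public static void main(String[] args) {
        String indexed = "main"; // assumption: the lowercased term in the index
        System.out.println(distance("mains", indexed)); // 1
        System.out.println(distance("MAINS", indexed)); // 5
    }
}
```

With FuzzyQuery's default maximum of 2 edits, a lowercased "mains" would be within range of "main", while an unanalyzed uppercase "MAINS" would not.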
Best regards


On 6/10/19 12:48 PM, Tomoko Uchida wrote:
> Or, " (double quotation) in your query string may affect query parsing.
>
> When I parse this string with the classic query parser (Lucene 8.1),
> street="MAINS~"
> the parsed (raw) query is
> text:street text:mains
> (I set the default search field to "text", so text:xxxx appears here.)
>
> Query parsing is a complex process, so it would be good to check the
> parsed raw query string, especially when you have (reserved) special
> characters in your query...
>
> On Tue, Jun 11, 2019 at 1:10, Tomoko Uchida <tomoko.uchida.1111@gmail.com> wrote:
>> Hi,
>>
>> I noticed one small thing in your previous mail.
>>
>>> when i use q1 = parser.parse("street=\"MAIN\""); i get same results
>>> which is good.
>>
>> To specify a search field, ":" (colon) should be used instead of "=".
>> See the query parser documentation:
>> http://lucene.apache.org/core/8_1_0/queryparser/org/apache/lucene/queryparser/classic/package-summary.html#Fields
>>
>> I'm not sure this is related to your problem.
>>
>> On Tue, Jun 11, 2019 at 0:51, <baris.kazar@oracle.com> wrote:
>>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field, "city=\"NASHUA\""),
>>>         BooleanClause.Occur.MUST);
>>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field, "region=\"NEW HAMPSHIRE\""),
>>>         BooleanClause.Occur.MUST);
>>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field, "country=\"UNITED STATES\""),
>>>         BooleanClause.Occur.MUST);
>>>
>>> org.apache.lucene.queryparser.classic.QueryParser parser =
>>>         new org.apache.lucene.queryparser.classic.QueryParser(field, phraseAnalyzer);
>>> Query q1 = null;
>>> try {
>>>     q1 = parser.parse("MAIN");
>>> } catch (ParseException e) {
>>>     e.printStackTrace();
>>> }
>>> booleanQuery.add(q1, BooleanClause.Occur.SHOULD);
>>>
>>> testQuerySearch2 Time to compute: 0 seconds
>>> Number of results: 1775
>>> Name: Main St
>>> Score: 37.20959
>>> ID: 12681979
>>> Country Code: US
>>> Coordinates: 42.76416, -71.46681
>>> Search Key: street="MAIN" city="NASHUA" municipality="HILLSBOROUGH"
>>> region="NEW HAMPSHIRE" country="UNITED STATES"
>>>
>>> Name: Main St
>>> Score: 37.20959
>>> ID: 12681977
>>> Country Code: US
>>> Coordinates: 42.747, -71.45957
>>> Search Key: street="MAIN" city="NASHUA" municipality="HILLSBOROUGH"
>>> region="NEW HAMPSHIRE" country="UNITED STATES"
>>>
>>> Name: Main St
>>> Score: 37.20959
>>> ID: 12681978
>>> Country Code: US
>>> Coordinates: 42.73492, -71.44951
>>> Search Key: street="MAIN" city="NASHUA" municipality="HILLSBOROUGH"
>>> region="NEW HAMPSHIRE" country="UNITED STATES"
>>>
>>> When I use q1 = parser.parse("street=\"MAIN\""); I get the same results,
>>> which is good.
>>>
>>> But when I switch to MAINS~, the fuzzy query does not work.
>>>
>>>
>>> One thing to note about q1 on its own in the BooleanQuery:
>>> it tries to match MAIN against street, city, region, and country, which are
>>> all in a single TextField.
>>> I do not want this, which is why I need street="..." etc. when searching.
>>>
>>> Best regards
>>>
>>>
>>>
>>> On 6/10/19 11:31 AM, Tomoko Uchida wrote:
>>>> Hi,
>>>>
>>>> just for the basic verification, can you find the document without
>>>> fuzzy query? I mean, does this query work for you?
>>>>
>>>> Query query = parser.parse("MAIN");
>>>>
>>>> Tomoko
>>>>
>>>> On Tue, Jun 11, 2019 at 0:22, <baris.kazar@oracle.com> wrote:
>>>>> Why does the second set not work at all?
>>>>>
>>>>> It is indexed as a TextField, like street="..." city="..." etc.
>>>>>
>>>>> Best regards
>>>>>
>>>>>
>>>>>
>>>>> On 6/10/19 11:23 AM, baris.kazar@oracle.com wrote:
>>>>>> I don't know how to use FuzzyQuery with QueryParser, but you are
>>>>>> probably suggesting
>>>>>>
>>>>>> QueryParser parser = new QueryParser(field, analyzer);
>>>>>> Query query = parser.parse("MAINS~2");
>>>>>>
>>>>>> booleanQuery.add(query, BooleanClause.Occur.SHOULD);
>>>>>>
>>>>>> Am I right?
>>>>>> Best regards
>>>>>>
>>>>>>
>>>>>> On 6/10/19 10:47 AM, Atri Sharma wrote:
>>>>>>> I would suggest using a QueryParser for your fuzzy query before
>>>>>>> adding it to the Boolean query. This should weed out any case
>>>>>>> issues.
>>>>>>>
>>>>>>> On Mon, 10 Jun 2019 at 8:06 PM, <baris.kazar@oracle.com> wrote:
>>>>>>>
>>>>>>>       BooleanQuery.Builder booleanQuery = new BooleanQuery.Builder();
>>>>>>>
>>>>>>>       //First set
>>>>>>>
>>>>>>>       booleanQuery.add(new FuzzyQuery(new org.apache.lucene.index.Term(field, "MAINS")),
>>>>>>>               BooleanClause.Occur.SHOULD);
>>>>>>>       booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field, "NASHUA"),
>>>>>>>               BooleanClause.Occur.MUST);
>>>>>>>       booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field, "NEW HAMPSHIRE"),
>>>>>>>               BooleanClause.Occur.MUST);
>>>>>>>       booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field, "UNITED STATES"),
>>>>>>>               BooleanClause.Occur.MUST);
>>>>>>>
>>>>>>>       // Second set
>>>>>>>       //booleanQuery.add(new FuzzyQuery(new org.apache.lucene.index.Term(field, "street=\"MAINS\"")),
>>>>>>>       //        BooleanClause.Occur.SHOULD);
>>>>>>>       //booleanQuery.add(Utils.createPhraseQueryFullText(phraseAnalyzer, field, "city=\"NASHUA\""),
>>>>>>>       //        BooleanClause.Occur.MUST);
>>>>>>>       //booleanQuery.add(Utils.createPhraseQueryFullText(phraseAnalyzer, field, "region=\"NEW HAMPSHIRE\""),
>>>>>>>       //        BooleanClause.Occur.MUST);
>>>>>>>       //booleanQuery.add(Utils.createPhraseQueryFullText(phraseAnalyzer, field, "country=\"UNITED STATES\""),
>>>>>>>       //        BooleanClause.Occur.MUST);
>>>>>>>
>>>>>>>       The first set also returns streets named Nashua (NASHUA).
>>>>>>>
>>>>>>>       To prevent that, and since I also indexed with street="..."
>>>>>>>       city="...", I tried the second set, but it returns nothing.
>>>>>>>
>>>>>>>       createPhraseQuery builds a PhraseQuery with one term equal to the
>>>>>>>       string in the call.
>>>>>>>
>>>>>>>       Best regards
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>       On 6/10/19 10:47 AM, baris.kazar@oracle.com wrote:
>>>>>>>       > How do I check how it is indexed: lowercase or uppercase?
>>>>>>>       >
>>>>>>>       > The only way for now is by testing.
>>>>>>>       >
>>>>>>>       > I am using StandardAnalyzer.
>>>>>>>       >
>>>>>>>       > Best regards
>>>>>>>       >
>>>>>>>       >
>>>>>>>       > On 6/9/19 11:57 AM, Atri Sharma wrote:
>>>>>>>       >> On Sun, Jun 9, 2019 at 8:53 PM Tomoko Uchida
>>>>>>>       >> <tomoko.uchida.1111@gmail.com> wrote:
>>>>>>>       >>> Hi,
>>>>>>>       >>>
>>>>>>>       >>> What analyzer do you use for the text field? Is the term "Main"
>>>>>>>       >>> correctly indexed?
>>>>>>>       >> Agreed. Also, it would be good if you could post your actual
>>>>>>>       >> code.
>>>>>>>       >>
>>>>>>>       >> What analyzer are you using? If you are using StandardAnalyzer,
>>>>>>>       >> then all of your terms while indexing will be lowercased, AFAIK,
>>>>>>>       >> but your query will not be analyzed until you run a QueryParser
>>>>>>>       >> on it.
>>>>>>>       >>
>>>>>>>       >>
>>>>>>>       >> Atri
>>>>>>>       >>
>>>>>>>       >
>>>>>>>       >
>>>>>>>       >
>>>>>>>       > ---------------------------------------------------------------------
>>>>>>>       > To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>>>>>>>       > For additional commands, e-mail: java-user-help@lucene.apache.org
>>>>>>>       >
>>>>>>>