lucene-java-user mailing list archives

From baris.ka...@oracle.com
Subject Re: FuzzyQuery- why is it ignored?
Date Wed, 12 Jun 2019 17:03:27 GMT
Hi again,

This is really interesting, and I hope I am missing something. The index
lowercases all entries, so I think case sensitivity is not an issue.

Case #1:

org.apache.lucene.queryparser.classic.QueryParser parser = new 
org.apache.lucene.queryparser.classic.QueryParser(field, phraseAnalyzer) ;
         Query q1 = null;
         try {
             q1 = parser.parse("Main");
         } catch (ParseException e) {
             e.printStackTrace();
         }
         booleanQuery.add(q1, BooleanClause.Occur.MUST);
         booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field, 
"NASHUA"), BooleanClause.Occur.MUST);
         booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field, 
"NEW HAMPSHIRE"), BooleanClause.Occur.MUST);
         booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field, 
"UNITED STATES"), BooleanClause.Occur.MUST);


This produces the following query plan:

[+contentDFLT:main, +contentDFLT:"nashua",
+contentDFLT:"new-hampshire", +contentDFLT:"united states"]

testQuerySearch1 Time to compute: 0 seconds (copied answer after exec 
finished)

Number of results: 12
Name: Main Dunstable Rd
Score: 41.204945
ID: 12677400
Country Code: US
Coordinates: 42.72631, -71.50269
Search Key: MAIN DUNSTABLE NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES

Name: Main St
Score: 41.204945
ID: 12681980
Country Code: US
Coordinates: 42.76416, -71.46681
Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES

Name: Main St
Score: 41.204945
ID: 12681973
Country Code: US
Coordinates: 42.75045, -71.4607
Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES

Name: Main St
Score: 41.204945
ID: 12681974
Country Code: US
Coordinates: 42.76019, -71.465
Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES

Name: Main Dunstable Rd
Score: 41.204945
ID: 12677399
Country Code: US
Coordinates: 42.74641, -71.48943
Search Key: MAIN DUNSTABLE NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES

Name: S Main St
Score: 41.204945
ID: 11893215
Country Code: US
Coordinates: 42.73412, -71.44797
Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES

Name: Main St
Score: 41.204945
ID: 12681978
Country Code: US
Coordinates: 42.73492, -71.44951
Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES

Name: S Main St
Score: 41.204945
ID: 11893214
Country Code: US
Coordinates: 42.73958, -71.45895
Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES

Name: Main St
Score: 41.204945
ID: 12681979
Country Code: US
Coordinates: 42.76416, -71.46681
Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES

Name: Main St
Score: 41.204945
ID: 12681977
Country Code: US
Coordinates: 42.747, -71.45957
Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES



Case #2

Adding ~ to the word Main, to make it a fuzzy query, also worked:

org.apache.lucene.queryparser.classic.QueryParser parser = new 
org.apache.lucene.queryparser.classic.QueryParser(field, phraseAnalyzer) ;
         Query q1 = null;
         try {
             q1 = parser.parse("Main~");
         } catch (ParseException e) {
             e.printStackTrace();
         }
         booleanQuery.add(q1, BooleanClause.Occur.MUST);
         booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field, 
"NASHUA"), BooleanClause.Occur.MUST);
         booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field, 
"NEW HAMPSHIRE"), BooleanClause.Occur.MUST);
         booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field, 
"UNITED STATES"), BooleanClause.Occur.MUST);

query plan:

[+contentDFLT:main~2, +contentDFLT:"nashua",
+contentDFLT:"new-hampshire", +contentDFLT:"united states"]

testQuerySearch1 Time to compute: 24 seconds (due to debugging stops)
Number of results: 12
Name: Main Dunstable Rd
Score: 41.06405
ID: 12677400
Country Code: US
Coordinates: 42.72631, -71.50269
Search Key: MAIN DUNSTABLE NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES

Name: Main St
Score: 41.06405
ID: 12681980
Country Code: US
Coordinates: 42.76416, -71.46681
Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES

Name: Main St
Score: 41.06405
ID: 12681973
Country Code: US
Coordinates: 42.75045, -71.4607
Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES

Name: Main St
Score: 41.06405
ID: 12681974
Country Code: US
Coordinates: 42.76019, -71.465
Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES

Name: Main Dunstable Rd
Score: 41.06405
ID: 12677399
Country Code: US
Coordinates: 42.74641, -71.48943
Search Key: MAIN DUNSTABLE NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES

Name: S Main St
Score: 41.06405
ID: 11893215
Country Code: US
Coordinates: 42.73412, -71.44797
Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES

Name: Main St
Score: 41.06405
ID: 12681978
Country Code: US
Coordinates: 42.73492, -71.44951
Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES

Name: S Main St
Score: 41.06405
ID: 11893214
Country Code: US
Coordinates: 42.73958, -71.45895
Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES

Name: Main St
Score: 41.06405
ID: 12681979
Country Code: US
Coordinates: 42.76416, -71.46681
Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES

Name: Main St
Score: 41.06405
ID: 12681977
Country Code: US
Coordinates: 42.747, -71.45957
Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES




Case #3

But why does this not work in fuzzy mode when I misspell the word slightly
(1 edit away)? As you saw, the data is there with the spelling Main:

org.apache.lucene.queryparser.classic.QueryParser parser = new 
org.apache.lucene.queryparser.classic.QueryParser(field, phraseAnalyzer) ;

         Query q1 = null;
         try {
             q1 = parser.parse("Mains~");  // 1 edit away
         } catch (ParseException e) {
             e.printStackTrace();
         }
         booleanQuery.add(q1, BooleanClause.Occur.MUST);
         booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field, 
"NASHUA"), BooleanClause.Occur.MUST);
         booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field, 
"NEW HAMPSHIRE"), BooleanClause.Occur.MUST);
         booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field, 
"UNITED STATES"), BooleanClause.Occur.MUST);

query plan:

[+contentDFLT:mains~2, +contentDFLT:"nashua",
+contentDFLT:"new-hampshire", +contentDFLT:"united states"]

testQuerySearch1 Time to compute: 23 seconds (due to debugging stops)

Number of results: 0
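
As a sanity check on the "1 edit away" comment in Case #3, here is a minimal
sketch of plain Levenshtein distance using only the JDK. It is not Lucene's
implementation (FuzzyQuery matches terms with Levenshtein automata and, by
default, also counts a transposition as a single edit), and the class name
EditDistanceCheck is made up for this example. It compares the lowercased
term from the query plan, and the raw uppercase form, against the indexed
term main:

```java
final class EditDistanceCheck {

    /** Classic dynamic-programming Levenshtein distance (insert, delete, substitute). */
    static int levenshtein(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        // Distance from the empty prefix of a to each prefix of b.
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1), prev[j - 1] + cost);
            }
            // Reuse the two rows instead of allocating a full matrix.
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    public static void main(String[] args) {
        System.out.println(levenshtein("mains", "main")); // prints 1
        System.out.println(levenshtein("MAINS", "main")); // prints 5 (comparison is case-sensitive)
    }
}
```

So the parsed term mains is within the ~2 limit of main, whereas an
unanalyzed MAINS compared against a lowercased index would not be.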



Case #4

Then I changed q1 from MUST to SHOULD in the code above. I think the fuzzy
query is ignored here, since there is no MAIN in the first 468 results:

There is no boost for the Mains term here.

query plan:

[contentDFLT:mains~2, +contentDFLT:"nashua",
+contentDFLT:"new-hampshire", +contentDFLT:"united states"]

testQuerySearch1 Time to compute: 125 seconds (due to debugging stops)
Number of results: 1794
Name: Nashua Dr
Score: 34.186226
ID: 4974936
Country Code: US
Coordinates: 42.7636, -71.46063
Search Key: NASHUA NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES

Name: Nashua River Rail Trl
Score: 34.186226
ID: 4975508
Country Code: US
Coordinates: 42.7062, -71.53962
Search Key: NASHUA RIVER RAIL NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED 
STATES

Name: Nashua Rd
Score: 33.84896
ID: 4975388
Country Code: US
Coordinates: 42.78746, -71.92823
Search Key: NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES

Name: NASHUA
Score: 33.84896
ID: 21014865
Country Code: US
Coordinates: 42.75873, -71.46438
Search Key: NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES

Name: NASHUA
Score: 33.84896
ID: 21014865
Country Code: US
Coordinates: 42.75873, -71.46438
Search Key: NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES

Name: NASHUA
Score: 33.84896
ID: 21014865
Country Code: US
Coordinates: 42.75873, -71.46438
Search Key: NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES

Name: NASHUA
Score: 33.84896
ID: 21014865
Country Code: US
Coordinates: 42.75873, -71.46438
Search Key: NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES

Name: NASHUA
Score: 33.84896
ID: 21014865
Country Code: US
Coordinates: 42.75873, -71.46438
Search Key: NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES

Name: Nashua St
Score: 33.84896
ID: 4975671
Country Code: US
Coordinates: 42.88471, -70.81687
Search Key: NASHUA ROCKINGHAM NEW HAMPSHIRE UNITED STATES

Name: Nashua Rd
Score: 33.84896
ID: 4975400
Country Code: US
Coordinates: 42.79014, -71.92364
Search Key: NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES


Why is the fuzzy query ignored?
Even if I have separate fields for street, city, region, and country, this
fuzzy query issue will still come up for values with multiple words,
like "main dunstable", right?
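
This thread does not establish why mains~2 finds nothing. One thing that may
be worth ruling out is the cap on fuzzy term expansion: Lucene's FuzzyQuery
has a maxExpansions setting (50 by default), so in an index with many terms
close to "mains", the term "main" might not be among the terms that are
kept. The toy model below is not Lucene code and does not reproduce Lucene's
actual ranking. The class name ExpansionCapSketch, the sample terms, and the
alphabetical tie-break are all assumptions made for illustration. It only
shows how a cap can drop a term that is otherwise within the edit limit:

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;

final class ExpansionCapSketch {

    /**
     * Keeps at most maxExpansions terms whose edit distance is within maxEdits,
     * closest first. Ties are broken alphabetically (an arbitrary choice here).
     */
    static List<String> expand(Map<String, Integer> distanceByTerm, int maxEdits, int maxExpansions) {
        List<String> kept = new ArrayList<>();
        for (Map.Entry<String, Integer> entry : distanceByTerm.entrySet()) {
            if (entry.getValue() <= maxEdits) {
                kept.add(entry.getKey());
            }
        }
        kept.sort(Comparator.comparing((String term) -> distanceByTerm.get(term))
                .thenComparing(Comparator.naturalOrder()));
        return new ArrayList<>(kept.subList(0, Math.min(maxExpansions, kept.size())));
    }

    public static void main(String[] args) {
        // Hypothetical indexed terms with their edit distance from "mains".
        Map<String, Integer> distances = Map.of(
                "gains", 1, "maids", 1, "main", 1, "maine", 1,
                "mainz", 1, "pains", 1, "rains", 1, "nashua", 5);
        System.out.println(expand(distances, 2, 2));  // cap of 2: "main" is not kept
        System.out.println(expand(distances, 2, 50)); // cap of 50: "main" is kept
    }
}
```

If this turns out to be relevant, the cap can be raised by building the
query directly with the FuzzyQuery constructor that takes maxEdits,
prefixLength, maxExpansions, and transpositions arguments.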

Best regards

On 6/12/19 11:36 AM, baris.kazar@oracle.com wrote:
> Tomoko,-
>
>  Thank You for Your suggestions. i am trying to understand it and i 
> thought i did :)
>
> but it does not work with FuzzyQuery when i used with a *single* large 
> TextField like street=...value... city=...value... region=...value... 
> country=...value... (with or without quotes for the values)
>
> What I knew about Lucene fuzzy queries does not hold with this
> TextField form. That is why I suspected a bug.
>
> 1. Yes, i saw and have a solid proof on that now.
>
> 2. yes but FuzzyQuery takes quotes as they are as they are escaped and 
> it is not analyzed.
>
> Stuffing into one textfield vs having separate fields should only 
> affect probably the performance but not the outcome in my case.
> But, i have been thinking about this and maybe it is the way to go in 
> this case.
>
> My CONTENT field has street names in mixed case and city, region, and
> country names in UPPERCASE. Can this be a problem?
> I thought the index stored them in lowercase since I am using
> StandardAnalyzer.
>
> CONTENT field also has full textfield string with street=... city=... 
> region=... country=... (here all values are UPPERCASE).
>
> Why can't the index find the names via FuzzyQuery? I tried both
> FuzzyQuery and Query builder as I showed before.
>
> The last advice in Your previous email would nicely go outside the
> parentheses since it might be very critical :) :) :)
>
> Best regards
>
>
> On 6/12/19 12:17 AM, Tomoko Uchida wrote:
>> I'd suggest to correctly understand the way a software works before
>> suspecting its bug :-)
>>
>> I guess you may miss two points:
>>
>> 1. the standard analyzer (standard tokenizer) breaks words by double
>> quote (U+0022) so quotes are not indexed or searched at all if you are
>> using standard analyzer. (That is the reason you have same results
>> with or without quotes.)
>> See: 
>> https://urldefense.proofpoint.com/v2/url?u=https-3A__lucene.apache.org_core_8-5F1-5F0_core_org_apache_lucene_analysis_standard_StandardTokenizer.html&d=DwIFaQ&c=RoP1YumCXCgaWHvlZYR8PZh8Bv7qIrMUB65eapI_JnE&r=nlG5z5NcNdIbQAiX-BKNeyLlULCbaezrgocEvPhQkl4&m=1L6ZQKxmWmYxDX4uJHxzY5SAR_UCl6UUXCo916wzXCo&s=8E2lp1YIGM-3v3FspeieGl8z8rEBs6qioTudtFNzh8c&e=
>> and 
>> https://urldefense.proofpoint.com/v2/url?u=http-3A__unicode.org_reports_tr29_&d=DwIFaQ&c=RoP1YumCXCgaWHvlZYR8PZh8Bv7qIrMUB65eapI_JnE&r=nlG5z5NcNdIbQAiX-BKNeyLlULCbaezrgocEvPhQkl4&m=1L6ZQKxmWmYxDX4uJHxzY5SAR_UCl6UUXCo916wzXCo&s=riCZ_f25XW869CKbHPUqfbLiDU-AukE6la0xTLMw6u8&e=
>>
>> 2. double quote has special meaning (it's interpreted as phrase query)
>> with the built-in query parser so you need to escape it if you want to
>> search double quotes itself.
>> See: 
>> https://urldefense.proofpoint.com/v2/url?u=http-3A__lucene.apache.org_core_8-5F1-5F0_queryparser_org_apache_lucene_queryparser_classic_package-2Dsummary.html-23Terms&d=DwIFaQ&c=RoP1YumCXCgaWHvlZYR8PZh8Bv7qIrMUB65eapI_JnE&r=nlG5z5NcNdIbQAiX-BKNeyLlULCbaezrgocEvPhQkl4&m=1L6ZQKxmWmYxDX4uJHxzY5SAR_UCl6UUXCo916wzXCo&s=t8OYTgidvcwNpAVFuTsqGhDJK5BwUZVCxc0mPHzqCYU&e=
>>
>> (My advice would be to create separate fields for each key value pairs
>> instead of stuffing all pairs into one text field, if you need to
>> search them separately.)
>>
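
Point 1 above can be illustrated with a crude stand-in for the tokenizer,
using only the JDK. This is a simplified model, not StandardTokenizer
itself: it splits on any run of characters that are not letters or digits
and lowercases the pieces, whereas the real tokenizer follows the Unicode
word-break rules linked above and the lowercasing is done by the analyzer.
The class name CrudeTokenizerSketch is hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

final class CrudeTokenizerSketch {

    /** Splits on runs of non-letter, non-digit characters and lowercases each piece. */
    static List<String> tokens(String text) {
        List<String> out = new ArrayList<>();
        for (String piece : text.split("[^\\p{L}\\p{N}]+")) {
            // split() can yield an empty first piece when the text starts with punctuation.
            if (!piece.isEmpty()) {
                out.add(piece.toLowerCase(Locale.ROOT));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Neither '=' nor '"' survives as part of a token.
        System.out.println(tokens("street=\"MAIN\" city=\"NASHUA\"")); // [street, main, city, nashua]
    }
}
```

Under this model, street="MAIN" yields the two tokens street and main, which
is consistent with the parsed query text:street text:mains reported
elsewhere in this thread for street="MAINS~".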
>> On Wed, Jun 12, 2019 at 2:39, <baris.kazar@oracle.com> wrote:
>>> i can say that quotes is not the issue with index as it still 
>>> results in
>>> same results with quotes or without quotes.
>>>
>>> i am starting to feel that this might be a bug maybe??
>>>
>>> Best regards
>>>
>>>
>>> On 6/10/19 2:46 PM, baris.kazar@oracle.com wrote:
>>>> Somehow " is causing an issue as this should return street with MAIN:
>>>>
>>>> [contentDFLT:street="MAINS"~2, +contentDFLT:"city nashua",
>>>> +contentDFLT:"region new-hampshire", +contentDFLT:"country united
>>>> states"] -> this was with fuzzyquery on MAINS
>>>>
>>>> Best regards
>>>>
>>>>
>>>> On 6/10/19 2:24 PM, baris.kazar@oracle.com wrote:
>>>>> [+contentDFLT:"city nashua", +contentDFLT:"region new-hampshire",
>>>>> +contentDFLT:"country united states", contentDFLT:street
>>>>> contentDFLT:mains]
>>>>>
>>>>> QueryParser chops it into two pieces from
>>>>> parser.parse("street=\"MAINS\"");
>>>>>
>>>>> Index has a TextField named contentDFLT the following data :
>>>>> street="MAIN" city="NASHUA" municipality="HILLSBOROUGH" region="NEW
>>>>> HAMPSHIRE" country="UNITED STATES"
>>>>>
>>>>>
>>>>> When i set street=\"MAINS~\" with parser:
>>>>> i get the following
>>>>> [+contentDFLT:"city nashua", +contentDFLT:"region new-hampshire",
>>>>> +contentDFLT:"country united states", contentDFLT:street
>>>>> contentDFLT:mains]
>>>>>
>>>>> probably " quotations are messing this up as You were saying...
>>>>> Best regards
>>>>>
>>>>>
>>>>> On 6/10/19 12:48 PM, Tomoko Uchida wrote:
>>>>>> Or, " (double quotation) in your query string may affect query 
>>>>>> parsing.
>>>>>>
>>>>>> When I parse this string by classic query parser (lucene 8.1),
>>>>>> street="MAINS~"
>>>>>> parsed (raw) query is
>>>>>> text:street text:mains
>>>>>> (I set the default search field to "text", so text:xxxx is appeared
>>>>>> here.)
>>>>>>
>>>>>> Query parsing is a complex process, so it would be good to check
>>>>>> parsed raw query string especially when you have (reserved) special
>>>>>> characters in your query...
>>>>>>
>>>>>> On Tue, Jun 11, 2019 at 1:10, Tomoko Uchida <tomoko.uchida.1111@gmail.com> wrote:
>>>>>>> Hi,
>>>>>>>
>>>>>>> I noticed one small thing in your previous mail.
>>>>>>>
>>>>>>>> when i use q1 = parser.parse("street=\"MAIN\""); i get same
>>>>>>>> results which is good.
>>>>>>>
>>>>>>> To specify a search field, ":" (colon) should be used instead of
>>>>>>> "=".
>>>>>>> See the query parser documentation:
>>>>>>> https://urldefense.proofpoint.com/v2/url?u=http-3A__lucene.apache.org_core_8-5F1-5F0_queryparser_org_apache_lucene_queryparser_classic_package-2Dsummary.html-23Fields&d=DwIFaQ&c=RoP1YumCXCgaWHvlZYR8PZh8Bv7qIrMUB65eapI_JnE&r=nlG5z5NcNdIbQAiX-BKNeyLlULCbaezrgocEvPhQkl4&m=u4SeJqH4lePhOazCLwxLEr3WqcMkODtYLv4njiKZ4PM&s=WrNfUXO9gz1PqpczTJw1vD9sWqvr76WRv2Aeo9uWqa4&e=

>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> I'm not sure this is related to your problem.
>>>>>>>
>>>>>>> On Tue, Jun 11, 2019 at 0:51, <baris.kazar@oracle.com> wrote:
>>>>>>>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field,
>>>>>>>> "city=\"NASHUA\""), BooleanClause.Occur.MUST);
>>>>>>>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field,
>>>>>>>> "region=\"NEW HAMPSHIRE\""), BooleanClause.Occur.MUST);
>>>>>>>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field,
>>>>>>>> "country=\"UNITED STATES\""), BooleanClause.Occur.MUST);
>>>>>>>>
>>>>>>>> org.apache.lucene.queryparser.classic.QueryParser parser = new
>>>>>>>> org.apache.lucene.queryparser.classic.QueryParser(field,
>>>>>>>> phraseAnalyzer) ;
>>>>>>>>            Query q1 = null;
>>>>>>>>            try {
>>>>>>>>                q1 = parser.parse("MAIN");
>>>>>>>>            } catch (ParseException e) {
>>>>>>>>
>>>>>>>>                e.printStackTrace();
>>>>>>>>            }
>>>>>>>>            booleanQuery.add(q1, BooleanClause.Occur.SHOULD);
>>>>>>>>
>>>>>>>> testQuerySearch2 Time to compute: 0 seconds
>>>>>>>> Number of results: 1775
>>>>>>>> Name: Main St
>>>>>>>> Score: 37.20959
>>>>>>>> ID: 12681979
>>>>>>>> Country Code: US
>>>>>>>> Coordinates: 42.76416, -71.46681
>>>>>>>> Search Key: street="MAIN" city="NASHUA" 
>>>>>>>> municipality="HILLSBOROUGH"
>>>>>>>> region="NEW HAMPSHIRE" country="UNITED STATES"
>>>>>>>>
>>>>>>>> Name: Main St
>>>>>>>> Score: 37.20959
>>>>>>>> ID: 12681977
>>>>>>>> Country Code: US
>>>>>>>> Coordinates: 42.747, -71.45957
>>>>>>>> Search Key: street="MAIN" city="NASHUA" 
>>>>>>>> municipality="HILLSBOROUGH"
>>>>>>>> region="NEW HAMPSHIRE" country="UNITED STATES"
>>>>>>>>
>>>>>>>> Name: Main St
>>>>>>>> Score: 37.20959
>>>>>>>> ID: 12681978
>>>>>>>> Country Code: US
>>>>>>>> Coordinates: 42.73492, -71.44951
>>>>>>>> Search Key: street="MAIN" city="NASHUA" 
>>>>>>>> municipality="HILLSBOROUGH"
>>>>>>>> region="NEW HAMPSHIRE" country="UNITED STATES"
>>>>>>>>
>>>>>>>>     when i use q1 = parser.parse("street=\"MAIN\""); i get same
>>>>>>>> results
>>>>>>>> which is good.
>>>>>>>>
>>>>>>>> But when i switch to MAINS~ then fuzzy query does not work.
>>>>>>>>
>>>>>>>>
>>>>>>>> i need to say something with the q1 only in the booleanquery:
>>>>>>>> it tries to match the MAIN in street, city, region and country
>>>>>>>> which are
>>>>>>>> in a single TextField field.
>>>>>>>> But i dont want this. that is why i need to street="..." etc when
>>>>>>>> searching.
>>>>>>>>
>>>>>>>> Best regards
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On 6/10/19 11:31 AM, Tomoko Uchida wrote:
>>>>>>>>> Hi,
>>>>>>>>>
>>>>>>>>> just for the basic verification, can you find the document
>>>>>>>>> without
>>>>>>>>> fuzzy query? I mean, does this query work for you?
>>>>>>>>>
>>>>>>>>> Query query = parser.parse("MAIN");
>>>>>>>>>
>>>>>>>>> Tomoko
>>>>>>>>>
>>>>>>>>> On Tue, Jun 11, 2019 at 0:22, <baris.kazar@oracle.com> wrote:
>>>>>>>>>> why cant the second set not work at all?
>>>>>>>>>>
>>>>>>>>>> it is indexed as Textfield like street="..." city="..." etc.
>>>>>>>>>>
>>>>>>>>>> Best regards
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On 6/10/19 11:23 AM, baris.kazar@oracle.com wrote:
>>>>>>>>>>> i dont know how to use Fuzzyquery with queryparser but probably
>>>>>>>>>>> You
>>>>>>>>>>> are suggesting
>>>>>>>>>>>
>>>>>>>>>>> QueryParser parser = new QueryParser(field, analyzer) ;
>>>>>>>>>>> Query query = parser.parse("MAINS~2");
>>>>>>>>>>>
>>>>>>>>>>> booleanQuery.add(query, BooleanClause.Occur.SHOULD);
>>>>>>>>>>>
>>>>>>>>>>> am i right?
>>>>>>>>>>> Best regards
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On 6/10/19 10:47 AM, Atri Sharma wrote:
>>>>>>>>>>>> I would suggest using a QueryParser for your fuzzy query
>>>>>>>>>>>> before
>>>>>>>>>>>> adding it to the Boolean query. This should weed out any case
>>>>>>>>>>>> issues.
>>>>>>>>>>>>
>>>>>>>>>>>> On Mon, 10 Jun 2019 at 8:06 PM, <baris.kazar@oracle.com
>>>>>>>>>>>> <mailto:baris.kazar@oracle.com>> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>        BooleanQuery.Builder booleanQuery = new
>>>>>>>>>>>>        BooleanQuery.Builder();
>>>>>>>>>>>>
>>>>>>>>>>>>        //First set
>>>>>>>>>>>>
>>>>>>>>>>>>        booleanQuery.add(new FuzzyQuery(new
>>>>>>>>>>>>        org.apache.lucene.index.Term(field, "MAINS")),
>>>>>>>>>>>>        BooleanClause.Occur.SHOULD);
>>>>>>>>>>>>        booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field,
>>>>>>>>>>>>        "NASHUA"), BooleanClause.Occur.MUST);
>>>>>>>>>>>>        booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field,
>>>>>>>>>>>>        "NEW HAMPSHIRE"), BooleanClause.Occur.MUST);
>>>>>>>>>>>>        booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field,
>>>>>>>>>>>>        "UNITED STATES"), BooleanClause.Occur.MUST);
>>>>>>>>>>>>
>>>>>>>>>>>>        // Second set
>>>>>>>>>>>>        //booleanQuery.add(new FuzzyQuery(new
>>>>>>>>>>>>        //org.apache.lucene.index.Term(field, "street=\"MAINS\"")),
>>>>>>>>>>>>        //BooleanClause.Occur.SHOULD);
>>>>>>>>>>>>        //booleanQuery.add(Utils.createPhraseQueryFullText(phraseAnalyzer,
>>>>>>>>>>>>        //field, "city=\"NASHUA\""), BooleanClause.Occur.MUST);
>>>>>>>>>>>>        //booleanQuery.add(Utils.createPhraseQueryFullText(phraseAnalyzer,
>>>>>>>>>>>>        //field, "region=\"NEW HAMPSHIRE\""),
>>>>>>>>>>>>        //BooleanClause.Occur.MUST);
>>>>>>>>>>>>        //booleanQuery.add(Utils.createPhraseQueryFullText(phraseAnalyzer,
>>>>>>>>>>>>        //field, "country=\"UNITED STATES\""),
>>>>>>>>>>>>        //BooleanClause.Occur.MUST);
>>>>>>>>>>>>
>>>>>>>>>>>>        The first set brings also street with Nashua name.
>>>>>>>>>>>>        (NASHUA).
>>>>>>>>>>>>
>>>>>>>>>>>>        so, to prevent that and since i also indexed with
>>>>>>>>>>>>        street="..." city="..." i did the second set but it does
>>>>>>>>>>>>        not bring anything.
>>>>>>>>>>>>
>>>>>>>>>>>>        createPhraseQuery builds a Phrasequery with one term
>>>>>>>>>>>>        equal to the string in the call.
>>>>>>>>>>>>
>>>>>>>>>>>>        Best regards
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>        On 6/10/19 10:47 AM, baris.kazar@oracle.com
>>>>>>>>>>>>        <mailto:baris.kazar@oracle.com> wrote:
>>>>>>>>>>>>        > How do i check how it is indexed? lowercase or
>>>>>>>>>>>>        > uppercase?
>>>>>>>>>>>>        >
>>>>>>>>>>>>        > only way is now to by testing.
>>>>>>>>>>>>        >
>>>>>>>>>>>>        > i am using standardanalyzer.
>>>>>>>>>>>>        >
>>>>>>>>>>>>        > Best regards
>>>>>>>>>>>>        >
>>>>>>>>>>>>        >
>>>>>>>>>>>>        > On 6/9/19 11:57 AM, Atri Sharma wrote:
>>>>>>>>>>>>        >> On Sun, Jun 9, 2019 at 8:53 PM Tomoko Uchida
>>>>>>>>>>>>        >> <tomoko.uchida.1111@gmail.com
>>>>>>>>>>>>        >> <mailto:tomoko.uchida.1111@gmail.com>> wrote:
>>>>>>>>>>>>        >>> Hi,
>>>>>>>>>>>>        >>>
>>>>>>>>>>>>        >>> What analyzer do you use for the text field? Is the
>>>>>>>>>>>>        >>> term "Main" correctly indexed?
>>>>>>>>>>>>        >> Agreed. Also, it would be good if you could post your
>>>>>>>>>>>>        >> actual code.
>>>>>>>>>>>>        >>
>>>>>>>>>>>>        >> What analyzer are you using? If you are using
>>>>>>>>>>>>        >> StandardAnalyzer, then all of your terms while indexing
>>>>>>>>>>>>        >> will be lowercased, AFAIK, but your query will not be
>>>>>>>>>>>>        >> analyzed until you run a QueryParser on it.
>>>>>>>>>>>>        >>
>>>>>>>>>>>>        >>
>>>>>>>>>>>>        >> Atri
>>>>>>>>>>>>        >>
>>>>>>>>>>>>        >
>>>>>>>>>>>>        >
>>>>>>>>>>>>        >
>>>>>>>>>>>>        > ---------------------------------------------------------------------
>>>>>>>>>>>>        > To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>>>>>>>>>>>>        > For additional commands, e-mail: java-user-help@lucene.apache.org
>>>>>>>>>>>>        >
>>>>>>>>>>>>
>

