lucene-java-user mailing list archives

From baris.ka...@oracle.com
Subject Re: FuzzyQuery- why is it ignored?
Date Wed, 12 Jun 2019 21:32:37 GMT
OK, I think only the very specific term "mains" has an issue.

Everything else I knew about Lucene still holds :) Great...

I have one more question:

Which approach is recommended: building a FuzzyQuery directly, or using 
QueryParser with ~ appended to the search string?

The second one goes through the analyzer, which lowercases the search string.
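
To make that difference concrete, here is a minimal standalone sketch in plain Java (no Lucene calls). The class name and the levenshtein helper are hypothetical, and plain Levenshtein distance is only a stand-in for the automaton-based matching that FuzzyQuery really uses. It shows why an uppercase term passed straight to FuzzyQuery can fall outside the 2-edit budget seen in the query plans (main~2) when the index holds lowercased terms, while the lowercased form is only 1 edit from main:

```java
import java.util.Locale;

final class FuzzyCaseSketch {

    // Classic dynamic-programming Levenshtein distance (insert, delete, substitute).
    static int levenshtein(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j; // distance from the empty prefix of a
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1), prev[j - 1] + cost);
            }
            int[] tmp = prev; // reuse the two rows instead of allocating
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    public static void main(String[] args) {
        String indexed = "main"; // assumed form of the term after a lowercasing analyzer
        String raw = "MAINS";    // term as typed, never analyzed
        System.out.println(levenshtein(raw, indexed));                           // 5
        System.out.println(levenshtein(raw.toLowerCase(Locale.ROOT), indexed));  // 1
    }
}
```

So with a direct FuzzyQuery the term would need to be lowercased by hand first, whereas the parser route does it as part of analysis.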

Best regards


On 6/12/19 1:03 PM, baris.kazar@oracle.com wrote:
>
> Hi again,-
>
> this is really interesting and I hope I am missing something. The index 
> lowercases all entries, so case sensitivity should not be an issue, I think.
>
> Case #1:
>
> org.apache.lucene.queryparser.classic.QueryParser parser = new 
> org.apache.lucene.queryparser.classic.QueryParser(field, phraseAnalyzer) ;
>         Query q1 = null;
>         try {
>             q1 = parser.parse("Main");
>         } catch (ParseException e) {
>             e.printStackTrace();
>         }
>         booleanQuery.add(q1, BooleanClause.Occur.MUST);
>         booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, 
> field, "NASHUA"), BooleanClause.Occur.MUST);
>         booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, 
> field, "NEW HAMPSHIRE"), BooleanClause.Occur.MUST);
>         booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, 
> field, "UNITED STATES"), BooleanClause.Occur.MUST);
>
>
> This produces the following query plan:
>
> [+contentDFLT:main, +contentDFLT:"nashua", 
> +contentDFLT:"new-hampshire", +contentDFLT:"united states"]
>
> testQuerySearch1 Time to compute: 0 seconds (copied answer after exec 
> finished)
>
> Number of results: 12
> Name: Main Dunstable Rd
> Score: 41.204945
> ID: 12677400
> Country Code: US
> Coordinates: 42.72631, -71.50269
> Search Key: MAIN DUNSTABLE NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>
> Name: Main St
> Score: 41.204945
> ID: 12681980
> Country Code: US
> Coordinates: 42.76416, -71.46681
> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>
> Name: Main St
> Score: 41.204945
> ID: 12681973
> Country Code: US
> Coordinates: 42.75045, -71.4607
> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>
> Name: Main St
> Score: 41.204945
> ID: 12681974
> Country Code: US
> Coordinates: 42.76019, -71.465
> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>
> Name: Main Dunstable Rd
> Score: 41.204945
> ID: 12677399
> Country Code: US
> Coordinates: 42.74641, -71.48943
> Search Key: MAIN DUNSTABLE NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>
> Name: S Main St
> Score: 41.204945
> ID: 11893215
> Country Code: US
> Coordinates: 42.73412, -71.44797
> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>
> Name: Main St
> Score: 41.204945
> ID: 12681978
> Country Code: US
> Coordinates: 42.73492, -71.44951
> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>
> Name: S Main St
> Score: 41.204945
> ID: 11893214
> Country Code: US
> Coordinates: 42.73958, -71.45895
> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>
> Name: Main St
> Score: 41.204945
> ID: 12681979
> Country Code: US
> Coordinates: 42.76416, -71.46681
> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>
> Name: Main St
> Score: 41.204945
> ID: 12681977
> Country Code: US
> Coordinates: 42.747, -71.45957
> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>
>
>
> Case #2
>
> Adding ~ to the word Main to make it a fuzzy query also worked:
>
> org.apache.lucene.queryparser.classic.QueryParser parser = new 
> org.apache.lucene.queryparser.classic.QueryParser(field, phraseAnalyzer) ;
>         Query q1 = null;
>         try {
>             q1 = parser.parse("Main~");
>         } catch (ParseException e) {
>             e.printStackTrace();
>         }
>         booleanQuery.add(q1, BooleanClause.Occur.MUST);
>         booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, 
> field, "NASHUA"), BooleanClause.Occur.MUST);
>         booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, 
> field, "NEW HAMPSHIRE"), BooleanClause.Occur.MUST);
>         booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, 
> field, "UNITED STATES"), BooleanClause.Occur.MUST);
>
> query plan:
>
> [+contentDFLT:main~2, +contentDFLT:"nashua", 
> +contentDFLT:"new-hampshire", +contentDFLT:"united states"]
>
> testQuerySearch1 Time to compute: 24 seconds (due to debugging stops)
> Number of results: 12
> Name: Main Dunstable Rd
> Score: 41.06405
> ID: 12677400
> Country Code: US
> Coordinates: 42.72631, -71.50269
> Search Key: MAIN DUNSTABLE NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>
> Name: Main St
> Score: 41.06405
> ID: 12681980
> Country Code: US
> Coordinates: 42.76416, -71.46681
> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>
> Name: Main St
> Score: 41.06405
> ID: 12681973
> Country Code: US
> Coordinates: 42.75045, -71.4607
> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>
> Name: Main St
> Score: 41.06405
> ID: 12681974
> Country Code: US
> Coordinates: 42.76019, -71.465
> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>
> Name: Main Dunstable Rd
> Score: 41.06405
> ID: 12677399
> Country Code: US
> Coordinates: 42.74641, -71.48943
> Search Key: MAIN DUNSTABLE NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>
> Name: S Main St
> Score: 41.06405
> ID: 11893215
> Country Code: US
> Coordinates: 42.73412, -71.44797
> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>
> Name: Main St
> Score: 41.06405
> ID: 12681978
> Country Code: US
> Coordinates: 42.73492, -71.44951
> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>
> Name: S Main St
> Score: 41.06405
> ID: 11893214
> Country Code: US
> Coordinates: 42.73958, -71.45895
> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>
> Name: Main St
> Score: 41.06405
> ID: 12681979
> Country Code: US
> Coordinates: 42.76416, -71.46681
> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>
> Name: Main St
> Score: 41.06405
> ID: 12681977
> Country Code: US
> Coordinates: 42.747, -71.45957
> Search Key: MAIN NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>
>
>
>
> Case #3
>
> But why does this not work in fuzzy mode when I misspell the word slightly (1 
> edit away)? As You saw, the data is there with the spelling Main:
>
> org.apache.lucene.queryparser.classic.QueryParser parser = new 
> org.apache.lucene.queryparser.classic.QueryParser(field, phraseAnalyzer) ;
>
>         Query q1 = null;
>         try {
>             q1 = parser.parse("Mains~");  // 1 edit away
>         } catch (ParseException e) {
>             e.printStackTrace();
>         }
>         booleanQuery.add(q1, BooleanClause.Occur.MUST);
>         booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, 
> field, "NASHUA"), BooleanClause.Occur.MUST);
>         booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, 
> field, "NEW HAMPSHIRE"), BooleanClause.Occur.MUST);
>         booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, 
> field, "UNITED STATES"), BooleanClause.Occur.MUST);
>
> query plan:
>
> [+contentDFLT:mains~2, +contentDFLT:"nashua", 
> +contentDFLT:"new-hampshire", +contentDFLT:"united states"]
>
> testQuerySearch1 Time to compute: 23 seconds (due to debugging stops)
>
> Number of results: 0
>
>
>
> Case #4
>
> Then I changed q1 from MUST to SHOULD above, and I think the fuzzy query 
> is ignored here since there is no MAIN in the first 468 results.
>
> There is no boost for the Mains term here.
>
> query plan:
>
> [contentDFLT:mains~2, +contentDFLT:"nashua", 
> +contentDFLT:"new-hampshire", +contentDFLT:"united states"]
>
> testQuerySearch1 Time to compute: 125 seconds (due to debugging stops)
> Number of results: 1794
> Name: Nashua Dr
> Score: 34.186226
> ID: 4974936
> Country Code: US
> Coordinates: 42.7636, -71.46063
> Search Key: NASHUA NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>
> Name: Nashua River Rail Trl
> Score: 34.186226
> ID: 4975508
> Country Code: US
> Coordinates: 42.7062, -71.53962
> Search Key: NASHUA RIVER RAIL NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED 
> STATES
>
> Name: Nashua Rd
> Score: 33.84896
> ID: 4975388
> Country Code: US
> Coordinates: 42.78746, -71.92823
> Search Key: NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>
> Name: NASHUA
> Score: 33.84896
> ID: 21014865
> Country Code: US
> Coordinates: 42.75873, -71.46438
> Search Key: NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>
> Name: NASHUA
> Score: 33.84896
> ID: 21014865
> Country Code: US
> Coordinates: 42.75873, -71.46438
> Search Key: NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>
> Name: NASHUA
> Score: 33.84896
> ID: 21014865
> Country Code: US
> Coordinates: 42.75873, -71.46438
> Search Key: NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>
> Name: NASHUA
> Score: 33.84896
> ID: 21014865
> Country Code: US
> Coordinates: 42.75873, -71.46438
> Search Key: NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>
> Name: NASHUA
> Score: 33.84896
> ID: 21014865
> Country Code: US
> Coordinates: 42.75873, -71.46438
> Search Key: NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>
> Name: Nashua St
> Score: 33.84896
> ID: 4975671
> Country Code: US
> Coordinates: 42.88471, -70.81687
> Search Key: NASHUA ROCKINGHAM NEW HAMPSHIRE UNITED STATES
>
> Name: Nashua Rd
> Score: 33.84896
> ID: 4975400
> Country Code: US
> Coordinates: 42.79014, -71.92364
> Search Key: NASHUA HILLSBOROUGH NEW HAMPSHIRE UNITED STATES
>
>
> Why is the fuzzy query ignored?
> Even if I have separate fields for street, city, region, country, this 
> fuzzy query issue will still come up for names with multiple parts 
> like main dunstable etc., right?
>
> Best regards
>
> On 6/12/19 11:36 AM, baris.kazar@oracle.com wrote:
>> Tomoko,-
>>
>>  Thank You for Your suggestions. i am trying to understand it and i 
>> thought i did :)
>>
>> but it does not work with FuzzyQuery when i used with a *single* 
>> large TextField like street=...value... city=...value... 
>> region=...value... country=...value... (with or without quotes for 
>> the values)
>>
>> What I knew about Lucene fuzzy queries does not hold with this 
>> TextField layout. That is why I suspected a bug.
>>
>> 1. Yes, i saw and have a solid proof on that now.
>>
>> 2. Yes, but FuzzyQuery takes the quotes as they are, since they are escaped 
>> and the term is not analyzed.
>>
>> Stuffing into one textfield vs having separate fields should only 
>> affect probably the performance but not the outcome in my case.
>> But, i have been thinking about this and maybe it is the way to go in 
>> this case.
>>
>> My CONTENT field has street names in mixed case and city, region, and 
>> country names in UPPERCASE. Can this be a problem?
>> I thought the index stored them in lowercase since I am using 
>> StandardAnalyzer.
>>
>> CONTENT field also has full textfield string with street=... city=... 
>> region=... country=... (here all values are UPPERCASE).
>>
>> Why can't the index find the names via FuzzyQuery? I tried both 
>> FuzzyQuery and Query builder, as I showed before.
>>
>> The last piece of advice in Your previous email would fit nicely outside the 
>> parentheses, since it might be very critical :) :) :)
>>
>> Best regards
>>
>>
>> On 6/12/19 12:17 AM, Tomoko Uchida wrote:
>>> I'd suggest making sure you understand how the software works before
>>> suspecting a bug :-)
>>>
>>> I guess you may be missing two points:
>>>
>>> 1. the standard analyzer (standard tokenizer) breaks words by double
>>> quote (U+0022) so quotes are not indexed or searched at all if you are
>>> using standard analyzer. (That is the reason you have same results
>>> with or without quotes.)
>>> See: 
>>> https://lucene.apache.org/core/8_1_0/core/org/apache/lucene/analysis/standard/StandardTokenizer.html
>>> and 
>>> http://unicode.org/reports/tr29/
>>>
>>> 2. The double quote has a special meaning (it's interpreted as a phrase query)
>>> in the built-in query parser, so you need to escape it if you want to
>>> search for double quotes themselves.
>>> See: 
>>> http://lucene.apache.org/core/8_1_0/queryparser/org/apache/lucene/queryparser/classic/package-summary.html#Terms
>>>
>>> (My advice would be to create a separate field for each key-value pair
>>> instead of stuffing all pairs into one text field, if you need to
>>> search them separately.)
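
As a rough illustration of point 1, the following plain-Java sketch (no Lucene calls) splits the stored text on anything that is not a letter or digit and lowercases the pieces. This is a deliberate simplification, not the UAX #29 rules that StandardTokenizer implements, and the class and method names are hypothetical. It only shows why = and " never reach the index as part of a term:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

final class SimpleTokenizerSketch {

    // Splits on runs of non-alphanumeric characters and lowercases each token.
    // Simplified stand-in for an analyzer; real word-break rules are richer.
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String piece : text.split("[^\\p{L}\\p{N}]+")) {
            if (!piece.isEmpty()) {
                tokens.add(piece.toLowerCase(Locale.ROOT));
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        // Same shape as the single-field content discussed in this thread.
        System.out.println(tokenize("street=\"MAIN\" city=\"NASHUA\""));
        // prints [street, main, city, nashua]
    }
}
```

Under this simplified model the key names (street, city) become ordinary terms next to the values, which is one reason separate fields are easier to search.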
>>>
>>> On Wed, Jun 12, 2019 at 2:39, <baris.kazar@oracle.com> wrote:
>>>> I can say that quotes are not the issue with the index, as I still
>>>> get the same results with or without quotes.
>>>>
>>>> i am starting to feel that this might be a bug maybe??
>>>>
>>>> Best regards
>>>>
>>>>
>>>> On 6/10/19 2:46 PM, baris.kazar@oracle.com wrote:
>>>>> Somehow " is causing an issue as this should return street with MAIN:
>>>>>
>>>>> [contentDFLT:street="MAINS"~2, +contentDFLT:"city nashua",
>>>>> +contentDFLT:"region new-hampshire", +contentDFLT:"country united
>>>>> states"] -> this was with fuzzyquery on MAINS
>>>>>
>>>>> Best regards
>>>>>
>>>>>
>>>>> On 6/10/19 2:24 PM, baris.kazar@oracle.com wrote:
>>>>>> [+contentDFLT:"city nashua", +contentDFLT:"region new-hampshire",
>>>>>> +contentDFLT:"country united states", contentDFLT:street
>>>>>> contentDFLT:mains]
>>>>>>
>>>>>> QueryParser chops it into two pieces from
>>>>>> parser.parse("street=\"MAINS\"");
>>>>>>
>>>>>> The index has a TextField named contentDFLT with the following data:
>>>>>> street="MAIN" city="NASHUA" municipality="HILLSBOROUGH" region="NEW
>>>>>> HAMPSHIRE" country="UNITED STATES"
>>>>>>
>>>>>>
>>>>>> When i set street=\"MAINS~\" with parser:
>>>>>> i get the following
>>>>>> [+contentDFLT:"city nashua", +contentDFLT:"region new-hampshire",
>>>>>> +contentDFLT:"country united states", contentDFLT:street
>>>>>> contentDFLT:mains]
>>>>>>
>>>>>> probably " quotations are messing this up as You were saying...
>>>>>> Best regards
>>>>>>
>>>>>>
>>>>>> On 6/10/19 12:48 PM, Tomoko Uchida wrote:
>>>>>>> Or, " (double quotation) in your query string may affect query
>>>>>>> parsing.
>>>>>>>
>>>>>>> When I parse this string by classic query parser (lucene 8.1),
>>>>>>> street="MAINS~"
>>>>>>> parsed (raw) query is
>>>>>>> text:street text:mains
>>>>>>> (I set the default search field to "text", so text:xxxx appears
>>>>>>> here.)
>>>>>>>
>>>>>>> Query parsing is a complex process, so it would be good to check
>>>>>>> parsed raw query string especially when you have (reserved) special
>>>>>>> characters in your query...
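
One way to keep reserved characters from being read as syntax is to backslash-escape them before parsing. The sketch below is plain Java with hypothetical names; the character set follows the special characters listed in the classic query parser documentation. Lucene ships its own QueryParser.escape(String) helper, which is the safer choice in real code; this version exists only to show what the escaping does to a string like the one above:

```java
final class QueryEscapeSketch {

    // Characters with special meaning in the classic query syntax
    // (as listed in the query parser documentation).
    private static final String SPECIAL = "\\+-!():^[]\"{}~*?|&/";

    // Prefixes every special character with a backslash.
    static String escape(String input) {
        StringBuilder out = new StringBuilder(input.length() + 8);
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            if (SPECIAL.indexOf(c) >= 0) {
                out.append('\\');
            }
            out.append(c);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // The quotes and the tilde are escaped; '=' is not a reserved character.
        System.out.println(escape("street=\"MAINS~\""));
    }
}
```

Note that escaping the whole string also neutralizes ~ as the fuzzy operator, so for a fuzzy search one would escape only the user-supplied text and append ~ afterwards.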
>>>>>>>
>>>>>>> On Tue, Jun 11, 2019 at 1:10, Tomoko Uchida <tomoko.uchida.1111@gmail.com> wrote:
>>>>>>>
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> I noticed one small thing in your previous mail.
>>>>>>>>
>>>>>>>>> when i use q1 = parser.parse("street=\"MAIN\""); i get same
>>>>>>>>> results which is good.
>>>>>>>>
>>>>>>>> To specify a search field, ":" (colon) should be used instead
>>>>>>>> of "=".
>>>>>>>> See the query parser documentation:
>>>>>>>> http://lucene.apache.org/core/8_1_0/queryparser/org/apache/lucene/queryparser/classic/package-summary.html#Fields
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> I'm not sure this is related to your problem.
>>>>>>>>
>>>>>>>> On Tue, Jun 11, 2019 at 0:51, <baris.kazar@oracle.com> wrote:
>>>>>>>>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field,
>>>>>>>>> "city=\"NASHUA\""), BooleanClause.Occur.MUST);
>>>>>>>>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field,
>>>>>>>>> "region=\"NEW HAMPSHIRE\""), BooleanClause.Occur.MUST);
>>>>>>>>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field,
>>>>>>>>> "country=\"UNITED STATES\""), BooleanClause.Occur.MUST);
>>>>>>>>>
>>>>>>>>> org.apache.lucene.queryparser.classic.QueryParser parser = new
>>>>>>>>> org.apache.lucene.queryparser.classic.QueryParser(field,
>>>>>>>>> phraseAnalyzer) ;
>>>>>>>>>            Query q1 = null;
>>>>>>>>>            try {
>>>>>>>>>                q1 = parser.parse("MAIN");
>>>>>>>>>            } catch (ParseException e) {
>>>>>>>>>
>>>>>>>>>                e.printStackTrace();
>>>>>>>>>            }
>>>>>>>>>            booleanQuery.add(q1, BooleanClause.Occur.SHOULD);
>>>>>>>>>
>>>>>>>>> testQuerySearch2 Time to compute: 0 seconds
>>>>>>>>> Number of results: 1775
>>>>>>>>> Name: Main St
>>>>>>>>> Score: 37.20959
>>>>>>>>> ID: 12681979
>>>>>>>>> Country Code: US
>>>>>>>>> Coordinates: 42.76416, -71.46681
>>>>>>>>> Search Key: street="MAIN" city="NASHUA" 
>>>>>>>>> municipality="HILLSBOROUGH"
>>>>>>>>> region="NEW HAMPSHIRE" country="UNITED STATES"
>>>>>>>>>
>>>>>>>>> Name: Main St
>>>>>>>>> Score: 37.20959
>>>>>>>>> ID: 12681977
>>>>>>>>> Country Code: US
>>>>>>>>> Coordinates: 42.747, -71.45957
>>>>>>>>> Search Key: street="MAIN" city="NASHUA" 
>>>>>>>>> municipality="HILLSBOROUGH"
>>>>>>>>> region="NEW HAMPSHIRE" country="UNITED STATES"
>>>>>>>>>
>>>>>>>>> Name: Main St
>>>>>>>>> Score: 37.20959
>>>>>>>>> ID: 12681978
>>>>>>>>> Country Code: US
>>>>>>>>> Coordinates: 42.73492, -71.44951
>>>>>>>>> Search Key: street="MAIN" city="NASHUA" 
>>>>>>>>> municipality="HILLSBOROUGH"
>>>>>>>>> region="NEW HAMPSHIRE" country="UNITED STATES"
>>>>>>>>>
>>>>>>>>> when i use q1 = parser.parse("street=\"MAIN\""); i get same
>>>>>>>>> results which is good.
>>>>>>>>>
>>>>>>>>> But when i switch to MAINS~ then fuzzy query does not work.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> I need to point out something about q1 alone in the BooleanQuery:
>>>>>>>>> it tries to match MAIN in street, city, region and country, which
>>>>>>>>> are all in a single TextField field.
>>>>>>>>> But I don't want this. That is why I need street="..." etc. when
>>>>>>>>> searching.
>>>>>>>>>
>>>>>>>>> Best regards
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On 6/10/19 11:31 AM, Tomoko Uchida wrote:
>>>>>>>>>> Hi,
>>>>>>>>>>
>>>>>>>>>> just for the basic verification, can you find the document without
>>>>>>>>>> fuzzy query? I mean, does this query work for you?
>>>>>>>>>>
>>>>>>>>>> Query query = parser.parse("MAIN");
>>>>>>>>>>
>>>>>>>>>> Tomoko
>>>>>>>>>>
>>>>>>>>>> On Tue, Jun 11, 2019 at 0:22, <baris.kazar@oracle.com> wrote:
>>>>>>>>>>> Why does the second set not work at all?
>>>>>>>>>>>
>>>>>>>>>>> It is indexed as a TextField like street="..." city="..." etc.
>>>>>>>>>>>
>>>>>>>>>>> Best regards
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On 6/10/19 11:23 AM, baris.kazar@oracle.com wrote:
>>>>>>>>>>>> i dont know how to use Fuzzyquery with queryparser but probably
>>>>>>>>>>>> You are suggesting
>>>>>>>>>>>>
>>>>>>>>>>>> QueryParser parser = new QueryParser(field, analyzer) ;
>>>>>>>>>>>> Query query = parser.parse("MAINS~2");
>>>>>>>>>>>>
>>>>>>>>>>>> booleanQuery.add(query, BooleanClause.Occur.SHOULD);
>>>>>>>>>>>>
>>>>>>>>>>>> am i right?
>>>>>>>>>>>> Best regards
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On 6/10/19 10:47 AM, Atri Sharma wrote:
>>>>>>>>>>>>> I would suggest using a QueryParser for your fuzzy query before
>>>>>>>>>>>>> adding it to the Boolean query. This should weed out any case
>>>>>>>>>>>>> issues.
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Mon, 10 Jun 2019 at 8:06 PM, <baris.kazar@oracle.com> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>        BooleanQuery.Builder booleanQuery = new BooleanQuery.Builder();
>>>>>>>>>>>>>
>>>>>>>>>>>>>        // First set
>>>>>>>>>>>>>        booleanQuery.add(new FuzzyQuery(new
>>>>>>>>>>>>>            org.apache.lucene.index.Term(field, "MAINS")),
>>>>>>>>>>>>>            BooleanClause.Occur.SHOULD);
>>>>>>>>>>>>>        booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field,
>>>>>>>>>>>>>            "NASHUA"), BooleanClause.Occur.MUST);
>>>>>>>>>>>>>        booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field,
>>>>>>>>>>>>>            "NEW HAMPSHIRE"), BooleanClause.Occur.MUST);
>>>>>>>>>>>>>        booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field,
>>>>>>>>>>>>>            "UNITED STATES"), BooleanClause.Occur.MUST);
>>>>>>>>>>>>>
>>>>>>>>>>>>>        // Second set
>>>>>>>>>>>>>        //booleanQuery.add(new FuzzyQuery(new
>>>>>>>>>>>>>        //    org.apache.lucene.index.Term(field, "street=\"MAINS\"")),
>>>>>>>>>>>>>        //    BooleanClause.Occur.SHOULD);
>>>>>>>>>>>>>        //booleanQuery.add(Utils.createPhraseQueryFullText(phraseAnalyzer,
>>>>>>>>>>>>>        //    field, "city=\"NASHUA\""), BooleanClause.Occur.MUST);
>>>>>>>>>>>>>        //booleanQuery.add(Utils.createPhraseQueryFullText(phraseAnalyzer,
>>>>>>>>>>>>>        //    field, "region=\"NEW HAMPSHIRE\""), BooleanClause.Occur.MUST);
>>>>>>>>>>>>>        //booleanQuery.add(Utils.createPhraseQueryFullText(phraseAnalyzer,
>>>>>>>>>>>>>        //    field, "country=\"UNITED STATES\""), BooleanClause.Occur.MUST);
>>>>>>>>>>>>>
>>>>>>>>>>>>>        The first set also returns streets named Nashua (NASHUA).
>>>>>>>>>>>>>
>>>>>>>>>>>>>        So, to prevent that, and since I also indexed with street="..."
>>>>>>>>>>>>>        city="...", I tried the second set, but it does not return anything.
>>>>>>>>>>>>>
>>>>>>>>>>>>>        createPhraseQuery builds a PhraseQuery with one term equal to the
>>>>>>>>>>>>>        string in the call.
>>>>>>>>>>>>>
>>>>>>>>>>>>>        Best regards
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>        On 6/10/19 10:47 AM, baris.kazar@oracle.com wrote:
>>>>>>>>>>>>>        > How do I check how it is indexed? Lowercase or uppercase?
>>>>>>>>>>>>>        >
>>>>>>>>>>>>>        > The only way now is by testing.
>>>>>>>>>>>>>        >
>>>>>>>>>>>>>        > I am using StandardAnalyzer.
>>>>>>>>>>>>>        >
>>>>>>>>>>>>>        > Best regards
>>>>>>>>>>>>>        >
>>>>>>>>>>>>>        >
>>>>>>>>>>>>>        > On 6/9/19 11:57 AM, Atri Sharma wrote:
>>>>>>>>>>>>>        >> On Sun, Jun 9, 2019 at 8:53 PM Tomoko Uchida
>>>>>>>>>>>>>        >> <tomoko.uchida.1111@gmail.com> wrote:
>>>>>>>>>>>>>        >>> Hi,
>>>>>>>>>>>>>        >>>
>>>>>>>>>>>>>        >>> What analyzer do you use for the text field? Is the term "Main"
>>>>>>>>>>>>>        >>> correctly indexed?
>>>>>>>>>>>>>        >> Agreed. Also, it would be good if you could post your actual
>>>>>>>>>>>>>        >> code.
>>>>>>>>>>>>>        >>
>>>>>>>>>>>>>        >> What analyzer are you using? If you are using StandardAnalyzer,
>>>>>>>>>>>>>        >> then all of your terms while indexing will be lowercased, AFAIK,
>>>>>>>>>>>>>        >> but your query will not be analyzed until you run a
>>>>>>>>>>>>>        >> QueryParser on it.
>>>>>>>>>>>>>        >>
>>>>>>>>>>>>>        >>
>>>>>>>>>>>>>        >> Atri
>>>>>>>>>>>>>        >>
>>>>>>>>>>>>>        >
>>>>>>>>>>>>>        >
>>>>>>>>>>>>>        > ---------------------------------------------------------------------
>>>>>>>>>>>>>        > To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>>>>>>>>>>>>>        > For additional commands, e-mail: java-user-help@lucene.apache.org
>>>>>>>>>>>>>        >
>>>>>>>>>>>>>
>>
>

