lucene-java-user mailing list archives

From baris.ka...@oracle.com
Subject Re: FuzzyQuery
Date Mon, 10 Jun 2019 18:46:41 GMT
Somehow the double quote (") is causing an issue, as this query should return streets named MAIN:

[contentDFLT:street="MAINS"~2, +contentDFLT:"city nashua", 
+contentDFLT:"region new-hampshire", +contentDFLT:"country united states"]
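
One possible reading of the output above, offered as an assumption rather than a confirmed diagnosis: the first clause looks like a fuzzy query on the single term street="MAINS", while the parsed query text:street text:mains quoted further down suggests the analyzer produces street and main as separate lowercase tokens. A minimal stdlib-only sketch of that mismatch follows. The class FuzzyTermSketch and the methods roughTokens and editDistance are hypothetical helpers, not Lucene APIs. roughTokens is only a crude stand-in for StandardAnalyzer, and editDistance is plain Levenshtein, whereas Lucene's FuzzyQuery by default also counts a transposition as one edit.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical illustration class; nothing here calls Lucene.
final class FuzzyTermSketch {

    // Crude approximation of the analyzer: split on anything that is not
    // a letter or digit, then lowercase each piece.
    static List<String> roughTokens(String text) {
        List<String> tokens = new ArrayList<>();
        for (String piece : text.split("[^\\p{L}\\p{N}]+")) {
            if (!piece.isEmpty()) {
                tokens.add(piece.toLowerCase(Locale.ROOT));
            }
        }
        return tokens;
    }

    // Plain Levenshtein distance (insert, delete, substitute), two-row version.
    static int editDistance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1),
                                   prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    public static void main(String[] args) {
        String indexed = "street=\"MAIN\" city=\"NASHUA\"";
        System.out.println("rough tokens: " + roughTokens(indexed));
        // Compare three candidate query terms against the token "main".
        for (String queryTerm : new String[] {"mains", "MAINS", "street=\"MAINS\""}) {
            System.out.println(queryTerm + " vs main: "
                    + editDistance(queryTerm, "main") + " edits");
        }
    }
}
```

Under these assumptions only the lowercase term mains is within two edits of the token main. The uppercase term and the whole string street="MAINS" are both farther away, which would be consistent with the fuzzy clause matching nothing.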

Best regards


On 6/10/19 2:24 PM, baris.kazar@oracle.com wrote:
> [+contentDFLT:"city nashua", +contentDFLT:"region new-hampshire", 
> +contentDFLT:"country united states", contentDFLT:street 
> contentDFLT:mains]
>
> QueryParser chops it into two pieces from 
> parser.parse("street=\"MAINS\"");
>
> The index has a TextField named contentDFLT with the following data:
> street="MAIN" city="NASHUA" municipality="HILLSBOROUGH" region="NEW 
> HAMPSHIRE" country="UNITED STATES"
>
>
> When I set street=\"MAINS~\" with the parser,
> I get the following:
> [+contentDFLT:"city nashua", +contentDFLT:"region new-hampshire", 
> +contentDFLT:"country united states", contentDFLT:street 
> contentDFLT:mains]
>
> Probably the double quotes (") are messing this up, as you were saying...
> Best regards
>
>
> On 6/10/19 12:48 PM, Tomoko Uchida wrote:
>> Or, " (double quotation) in your query string may affect query parsing.
>>
>> When I parse this string with the classic query parser (Lucene 8.1),
>> street="MAINS~"
>> parsed (raw) query is
>> text:street text:mains
>> (I set the default search field to "text", so text:xxxx appears here.)
>>
>> Query parsing is a complex process, so it would be good to check the
>> parsed raw query string, especially when you have (reserved) special
>> characters in your query...
>>
>> Tue, Jun 11, 2019, 1:10 Tomoko Uchida <tomoko.uchida.1111@gmail.com>:
>>> Hi,
>>>
>>> I noticed one small thing in your previous mail.
>>>
>>>> when i use q1 = parser.parse("street=\"MAIN\""); i get same results
>>>> which is good.
>>>
>>> To specify a search field, ":" (colon) should be used instead of "=".
>>> See the query parser documentation:
>>> http://lucene.apache.org/core/8_1_0/queryparser/org/apache/lucene/queryparser/classic/package-summary.html#Fields
>>>
>>>
>>> I'm not sure this is related to your problem.
>>>
>>> Tue, Jun 11, 2019, 0:51 <baris.kazar@oracle.com>:
>>>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field,
>>>> "city=\"NASHUA\""), BooleanClause.Occur.MUST);
>>>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field,
>>>> "region=\"NEW HAMPSHIRE\""), BooleanClause.Occur.MUST);
>>>> booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field,
>>>> "country=\"UNITED STATES\""), BooleanClause.Occur.MUST);
>>>>
>>>>           org.apache.lucene.queryparser.classic.QueryParser parser =
>>>>               new org.apache.lucene.queryparser.classic.QueryParser(field, phraseAnalyzer);
>>>>           Query q1 = null;
>>>>           try {
>>>>               q1 = parser.parse("MAIN");
>>>>           } catch (ParseException e) {
>>>>
>>>>               e.printStackTrace();
>>>>           }
>>>>           booleanQuery.add(q1, BooleanClause.Occur.SHOULD);
>>>>
>>>> testQuerySearch2 Time to compute: 0 seconds
>>>> Number of results: 1775
>>>> Name: Main St
>>>> Score: 37.20959
>>>> ID: 12681979
>>>> Country Code: US
>>>> Coordinates: 42.76416, -71.46681
>>>> Search Key: street="MAIN" city="NASHUA" municipality="HILLSBOROUGH"
>>>> region="NEW HAMPSHIRE" country="UNITED STATES"
>>>>
>>>> Name: Main St
>>>> Score: 37.20959
>>>> ID: 12681977
>>>> Country Code: US
>>>> Coordinates: 42.747, -71.45957
>>>> Search Key: street="MAIN" city="NASHUA" municipality="HILLSBOROUGH"
>>>> region="NEW HAMPSHIRE" country="UNITED STATES"
>>>>
>>>> Name: Main St
>>>> Score: 37.20959
>>>> ID: 12681978
>>>> Country Code: US
>>>> Coordinates: 42.73492, -71.44951
>>>> Search Key: street="MAIN" city="NASHUA" municipality="HILLSBOROUGH"
>>>> region="NEW HAMPSHIRE" country="UNITED STATES"
>>>>
>>>> When I use q1 = parser.parse("street=\"MAIN\""); I get the same results,
>>>> which is good.
>>>>
>>>> But when I switch to MAINS~, the fuzzy query does not work.
>>>>
>>>>
>>>> One note about using only q1 in the BooleanQuery:
>>>> it tries to match MAIN against street, city, region and country, which
>>>> are all in a single TextField.
>>>> But I don't want this. That is why I need street="..." etc. when
>>>> searching.
>>>>
>>>> Best regards
>>>>
>>>>
>>>>
>>>> On 6/10/19 11:31 AM, Tomoko Uchida wrote:
>>>>> Hi,
>>>>>
>>>>> just for the basic verification, can you find the document without
>>>>> fuzzy query? I mean, does this query work for you?
>>>>>
>>>>> Query query = parser.parse("MAIN");
>>>>>
>>>>> Tomoko
>>>>>
>>>>> Tue, Jun 11, 2019, 0:22 <baris.kazar@oracle.com>:
>>>>>> Why does the second set not work at all?
>>>>>>
>>>>>> It is indexed as a TextField like street="..." city="..." etc.
>>>>>>
>>>>>> Best regards
>>>>>>
>>>>>>
>>>>>>
>>>>>> On 6/10/19 11:23 AM, baris.kazar@oracle.com wrote:
>>>>>>> I don't know how to use FuzzyQuery with QueryParser, but you are
>>>>>>> probably suggesting
>>>>>>>
>>>>>>> QueryParser parser = new QueryParser(field, analyzer) ;
>>>>>>> Query query = parser.parse("MAINS~2");
>>>>>>>
>>>>>>> booleanQuery.add(query, BooleanClause.Occur.SHOULD);
>>>>>>>
>>>>>>> am i right?
>>>>>>> Best regards
>>>>>>>
>>>>>>>
>>>>>>> On 6/10/19 10:47 AM, Atri Sharma wrote:
>>>>>>>> I would suggest using a QueryParser for your fuzzy query before
>>>>>>>> adding it to the Boolean query. This should weed out any case
>>>>>>>> issues.
>>>>>>>>
>>>>>>>> On Mon, 10 Jun 2019 at 8:06 PM, <baris.kazar@oracle.com> wrote:
>>>>>>>>
>>>>>>>>       BooleanQuery.Builder booleanQuery = new BooleanQuery.Builder();
>>>>>>>>
>>>>>>>>       //First set
>>>>>>>>
>>>>>>>>       booleanQuery.add(new FuzzyQuery(new
>>>>>>>>       org.apache.lucene.index.Term(field, "MAINS")),
>>>>>>>>       BooleanClause.Occur.SHOULD);
>>>>>>>>       booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field,
>>>>>>>>       "NASHUA"), BooleanClause.Occur.MUST);
>>>>>>>>       booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field,
>>>>>>>>       "NEW HAMPSHIRE"), BooleanClause.Occur.MUST);
>>>>>>>>       booleanQuery.add(Utils.createPhraseQuery(phraseAnalyzer, field,
>>>>>>>>       "UNITED STATES"), BooleanClause.Occur.MUST);
>>>>>>>>
>>>>>>>>       // Second set
>>>>>>>>       //booleanQuery.add(new FuzzyQuery(new
>>>>>>>>       //org.apache.lucene.index.Term(field, "street=\"MAINS\"")),
>>>>>>>>       //BooleanClause.Occur.SHOULD);
>>>>>>>>       //booleanQuery.add(Utils.createPhraseQueryFullText(phraseAnalyzer,
>>>>>>>>       //field, "city=\"NASHUA\""), BooleanClause.Occur.MUST);
>>>>>>>>       //booleanQuery.add(Utils.createPhraseQueryFullText(phraseAnalyzer,
>>>>>>>>       //field, "region=\"NEW HAMPSHIRE\""), BooleanClause.Occur.MUST);
>>>>>>>>       //booleanQuery.add(Utils.createPhraseQueryFullText(phraseAnalyzer,
>>>>>>>>       //field, "country=\"UNITED STATES\""), BooleanClause.Occur.MUST);
>>>>>>>>
>>>>>>>>       The first set also returns streets named Nashua (NASHUA).
>>>>>>>>
>>>>>>>>       So, to prevent that, and since I also indexed with street="..."
>>>>>>>>       city="...", I wrote the second set, but it does not return
>>>>>>>>       anything.
>>>>>>>>
>>>>>>>>       createPhraseQuery builds a PhraseQuery with one term equal to
>>>>>>>>       the string in the call.
>>>>>>>>
>>>>>>>>       Best regards
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>       On 6/10/19 10:47 AM, baris.kazar@oracle.com wrote:
>>>>>>>>       > How do I check how it is indexed? Lowercase or uppercase?
>>>>>>>>       >
>>>>>>>>       > The only way right now is by testing.
>>>>>>>>       >
>>>>>>>>       > I am using StandardAnalyzer.
>>>>>>>>       >
>>>>>>>>       > Best regards
>>>>>>>>       >
>>>>>>>>       >
>>>>>>>>       > On 6/9/19 11:57 AM, Atri Sharma wrote:
>>>>>>>>       >> On Sun, Jun 9, 2019 at 8:53 PM Tomoko Uchida
>>>>>>>>       >> <tomoko.uchida.1111@gmail.com> wrote:
>>>>>>>>       >>> Hi,
>>>>>>>>       >>>
>>>>>>>>       >>> What analyzer do you use for the text field? Is the term "Main"
>>>>>>>>       >>> correctly indexed?
>>>>>>>>       >> Agreed. Also, it would be good if you could post your actual
>>>>>>>>       >> code.
>>>>>>>>       >>
>>>>>>>>       >> What analyzer are you using? If you are using StandardAnalyzer,
>>>>>>>>       >> then all of your terms while indexing will be lowercased, AFAIK,
>>>>>>>>       >> but your query will not be analyzed until you run a QueryParser
>>>>>>>>       >> on it.
>>>>>>>>       >>
>>>>>>>>       >>
>>>>>>>>       >> Atri
>>>>>>>>       >>
>>>>>>>>       >
>>>>>>>>       >
>>>>>>>>       >
>>>>>>>>       > ---------------------------------------------------------------------
>>>>>>>>       > To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>>>>>>>>       > For additional commands, e-mail: java-user-help@lucene.apache.org
>>>>>>>>       >
>>>>>>>>
>>>>>>
>>>>>
>>
>


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org

