lucene-java-user mailing list archives

From Evert Wagenaar <evert.wagen...@gmail.com>
Subject Re: How to get the terms matching a WildCardQuery in Lucene 6.2?
Date Tue, 25 Oct 2016 17:56:42 GMT
Again, this concerns the code I am trying to use to extract the matching term
for the query "aard????". The query matches one term in my 350,000-word list,
which I indexed using the *StandardAnalyzer*.

As already mentioned, this matches "aardvark".

What can I do to make this work?


Thanks,

Evert Wagenaar

http://www.evertwagenaar.tk

On Tue, Oct 25, 2016 at 1:58 AM, Evert Wagenaar <evert.wagenaar@gmail.com>
wrote:

> Thanks Allison. I will try it.
>
>
> On Monday, 24 October 2016, Allison, Timothy B. <tallison@mitre.org>
> wrote:
>
>> Make sure to call setRewriteMethod on the MultiTermQuery with
>>  MultiTermQuery.SCORING_BOOLEAN_REWRITE or
>>  MultiTermQuery.CONSTANT_SCORE_BOOLEAN_REWRITE.
>>
>> Then something like this should work:
>>
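>>         // Expand the wildcard into one clause per matching term.
>>         q = q.rewrite(reader);
>>
>>         // Collect the terms referenced by the rewritten query.
>>         Set<Term> terms = new HashSet<>();
>>         Weight weight = q.createWeight(searcher, false);
>>         weight.extractTerms(terms);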
>>
>>
>>
>> -----Original Message-----
>> From: Evert Wagenaar [mailto:evert.wagenaar@gmail.com]
>> Sent: Monday, October 24, 2016 12:41 PM
>> To: java-user@lucene.apache.org
>> Subject: How to get the terms matching a WildCardQuery in Lucene 6.2?
>>
>> I already asked this on StackOverflow. Unfortunately, there has been no
>> answer for over a week now.
>>
>> So I am asking the real experts:
>>
>>
>> I downloaded a list of 350,000 English words in a .txt file and indexed
>> it using the latest Lucene (6.2). I want to apply wildcard queries like
>> aard???? and then retrieve a list of matches.
>>
>> I've done this before in an older version of Lucene, where it was pretty
>> simple. I just had to do a Query.rewrite() and this returned what I needed.
>> Unfortunately, in 6.2 this doesn't work anymore. There is a
>> Query.rewrite(IndexReader reader) which should return a HashMap of Terms.
>> In my case there's only one matching Term (aardvark). The Searcher
>> returns one hit, containing the Document path to the wordlist. The HashMap
>> is, however, empty.
>>
>> When I change the Query to find more than one match (like aa*), the
>> HashMap remains empty.
>>
>> I tried the MatchExtractor too, unfortunately without result.
>>
>> The objective of this is to demonstrate the power of Lucene to easily
>> find words of a particular length, given one or more characters. I'm pretty
>> sure I can do this using regular expressions in Java, but that falls
>> outside my objective.
>>
>> Can anyone tell me why this isn't working? I use the StandardAnalyzer.
>> Should I use a different Application?
>>
>> Any help is greatly appreciated.
>>
>> Thanks.
>>
>>
>>
>> --
>> Sent from Gmail IPad
>>
>
>
> --
> Sent from Gmail IPad
>
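
For readers who want to see what the rewrite step in the quoted advice is conceptually expected to produce, here is a minimal plain-Java sketch. It is not Lucene code and not Lucene's implementation: the class name WildcardExpansion, the method names, and the five sample words are all hypothetical. It assumes the term dictionary is a sorted set of lowercased words (StandardAnalyzer lowercases tokens at index time) and treats '?' as exactly one character and '*' as any run of characters. It walks only the terms that share the pattern's literal prefix and keeps those that match, which is the list of terms the rewritten query should end up referencing.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.SortedSet;
import java.util.TreeSet;
import java.util.regex.Pattern;

// Hypothetical illustration only; none of these names exist in Lucene.
final class WildcardExpansion {

    /** Translates a wildcard pattern ('?' = one char, '*' = any run) to a regex. */
    static Pattern toRegex(String wildcard) {
        StringBuilder regex = new StringBuilder();
        for (char c : wildcard.toCharArray()) {
            if (c == '?') {
                regex.append('.');
            } else if (c == '*') {
                regex.append(".*");
            } else {
                // Quote literals so characters like '.' are not treated as regex syntax.
                regex.append(Pattern.quote(String.valueOf(c)));
            }
        }
        return Pattern.compile(regex.toString());
    }

    /** Returns the dictionary terms matching the wildcard, in sorted order. */
    static List<String> expand(SortedSet<String> dictionary, String wildcard) {
        // The literal prefix before the first wildcard limits the scan range.
        int i = 0;
        while (i < wildcard.length()
                && wildcard.charAt(i) != '?' && wildcard.charAt(i) != '*') {
            i++;
        }
        String prefix = wildcard.substring(0, i);
        Pattern pattern = toRegex(wildcard);

        List<String> matches = new ArrayList<>();
        // tailSet(prefix) starts at the first term >= prefix; terms sharing
        // the prefix are contiguous, so stop at the first one that does not.
        for (String term : dictionary.tailSet(prefix)) {
            if (!term.startsWith(prefix)) {
                break;
            }
            if (pattern.matcher(term).matches()) {
                matches.add(term);
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        // Made-up sample dictionary standing in for the indexed word list.
        SortedSet<String> dictionary = new TreeSet<>(
                Arrays.asList("aard", "aardvark", "aardwolf", "aaron", "abacus"));
        System.out.println(expand(dictionary, "aard????")); // [aardvark, aardwolf]
    }
}
```

If the set extracted from the rewritten Lucene query is empty while a scan like this over the same words finds matches, the rewrite method on the query is the first thing to check, as suggested in the quoted reply.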
