lucene-java-user mailing list archives

From Christian Reuschling <reuschl...@dfki.uni-kl.de>
Subject Re: query.extractTerms(..) on rewritten queries
Date Tue, 07 Oct 2014 10:18:31 GMT

It works now. I simply applied MultiTermQuery.CONSTANT_SCORE_BOOLEAN_QUERY_REWRITE to all
(sub-)MultiTermQueries in the query returned by QueryParser. After that, Query.extractTerms
works again.
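
Applying a rewrite setting to all nested multi-term queries amounts to a recursive walk over the parsed query tree. The following is only a plain-Java sketch of that walk: QueryNode, TermNode, MultiTermNode, BooleanNode and enableBooleanRewrite are hypothetical stand-ins invented for illustration, not Lucene classes. In Lucene 4 itself, the per-query switch is MultiTermQuery.setRewriteMethod(...), and the nested clauses would be taken from the BooleanQuery.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Plain-Java model of a parsed query tree. All types here are hypothetical
// stand-ins for illustration; they are not Lucene classes.
abstract class QueryNode {}

// Stand-in for a leaf that matches exactly one term.
final class TermNode extends QueryNode {
    final String term;
    TermNode(String term) { this.term = term; }
}

// Stand-in for a wildcard/prefix leaf with a switchable rewrite mode.
final class MultiTermNode extends QueryNode {
    final String pattern;
    boolean booleanRewrite = false; // models "not rewritten to boolean by default"
    MultiTermNode(String pattern) { this.pattern = pattern; }
}

// Stand-in for a boolean query that holds nested clauses.
final class BooleanNode extends QueryNode {
    final List<QueryNode> clauses = new ArrayList<>();
    BooleanNode(QueryNode... children) { clauses.addAll(Arrays.asList(children)); }
}

final class RewriteModeWalker {
    // Recursively switches every multi-term leaf to boolean rewrite mode and
    // returns the number of multi-term leaves visited.
    static int enableBooleanRewrite(QueryNode node) {
        if (node instanceof MultiTermNode) {
            ((MultiTermNode) node).booleanRewrite = true;
            return 1;
        }
        if (node instanceof BooleanNode) {
            int visited = 0;
            for (QueryNode child : ((BooleanNode) node).clauses) {
                visited += enableBooleanRewrite(child);
            }
            return visited;
        }
        return 0; // plain term leaves need no change
    }

    public static void main(String[] args) {
        QueryNode tree = new BooleanNode(
                new TermNode("lucene"),
                new BooleanNode(new MultiTermNode("n*"), new MultiTermNode("foo?")));
        System.out.println(enableBooleanRewrite(tree)); // prints 2
    }
}
```

The recursion matters because a parsed query string can nest wildcard clauses at any depth, so changing only the top-level query would miss them.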

One only has to keep in mind that this yields at most BooleanQuery.getMaxClauseCount() terms.

Performance could be improved by subclassing a custom RewriteMethod.
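
To illustrate the difference between the two approaches, here is a minimal plain-Java sketch that models the index's term dictionary as a sorted set. It does not use the Lucene API: PrefixExpansionSketch, expandWithLimit, collectAll and TooManyTermsException are hypothetical names. The first method models a boolean-style expansion bounded by a clause limit; as far as I recall, Lucene's boolean rewrite throws BooleanQuery.TooManyClauses when the limit is exceeded, and the sketch assumes that behaviour. The second models a term-collecting rewrite that gathers the matching terms directly, without building clauses.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.TreeSet;

final class PrefixExpansionSketch {
    // Hypothetical exception, standing in for a "too many clauses" failure.
    static final class TooManyTermsException extends RuntimeException {
        TooManyTermsException(int limit) {
            super("more than " + limit + " matching terms");
        }
    }

    // Models boolean-style expansion: one clause per matching term,
    // failing as soon as the clause limit would be exceeded.
    static List<String> expandWithLimit(TreeSet<String> dictionary, String prefix, int maxClauses) {
        List<String> clauses = new ArrayList<>();
        for (String term : dictionary.tailSet(prefix, true)) {
            if (!term.startsWith(prefix)) {
                break; // sorted order: we have left the prefix range
            }
            if (clauses.size() == maxClauses) {
                throw new TooManyTermsException(maxClauses);
            }
            clauses.add(term);
        }
        return clauses;
    }

    // Models a term-collecting rewrite: gathers every matching term
    // directly, with no clause objects and no limit.
    static List<String> collectAll(TreeSet<String> dictionary, String prefix) {
        List<String> terms = new ArrayList<>();
        for (String term : dictionary.tailSet(prefix, true)) {
            if (!term.startsWith(prefix)) {
                break;
            }
            terms.add(term);
        }
        return terms;
    }

    public static void main(String[] args) {
        TreeSet<String> dictionary = new TreeSet<>(
                Arrays.asList("map", "n", "name", "net", "node", "oak"));
        System.out.println(collectAll(dictionary, "n"));         // prints [n, name, net, node]
        System.out.println(expandWithLimit(dictionary, "n", 4)); // prints [n, name, net, node]
    }
}
```

With a limit of 3 instead of 4, expandWithLimit throws in this example, whereas collectAll still returns all four terms.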

Thanks a lot

Chris


On 06.10.2014 19:35, Uwe Schindler wrote:
> Hi,
> 
> Lucene no longer rewrites Wildcards and other MultiTermQueries to Boolean Queries by default,
> because execution of such queries is too slow. If you just want to get the filtered index
> terms, it is easiest to write your own MultiTermQuery.RewriteMethod that collects all terms
> while rewriting and set it on the MultiTermQuery variant and rewrite it to collect the terms.
> 
> Uwe
> 
> -----
> Uwe Schindler
> H.-H.-Meier-Allee 63, D-28213 Bremen
> http://www.thetaphi.de
> eMail: uwe@thetaphi.de
> 
>> -----Original Message-----
>> From: Christian Reuschling [mailto:reuschling@dfki.uni-kl.de]
>> Sent: Monday, October 06, 2014 6:06 PM
>> To: java-user@lucene.apache.org
>> Subject: query.extractTerms(..) on rewritten queries
>>
>> Hi,
>>
>> I am currently migrating to Lucene 4. In the past, I used a trick to get the index-specific
>> terms for a given (wildcard) query (see below), but it no longer works:
>>
>> String queryString = "n*"; // gives no result
>> // String queryString = "nöä"; // gives results, but only the given string, not the real index terms
>>
>> Query query = new QueryParser(Version.LUCENE_CURRENT, "body",
>>         new StandardAnalyzer(Version.LUCENE_CURRENT)).parse(queryString);
>>
>> Query rewritten = searcher.rewrite(query);
>>
>> HashSet<Term> hsTerms = new HashSet<>();
>>
>> rewritten.extractTerms(hsTerms);
>>
>> In the past, searcher.rewrite on a wildcard query resulted in a new BooleanQuery containing
>> simple TermQueries that represent the matching primitive terms from the index.
>>
>> To me, the Javadoc of the current release sounds as if this should still work:
>>
>> Searcher.rewrite: /**Expert: called to re-write queries into primitive queries.**/
>>
>> Query.extractTerms: /**Expert: adds all terms occurring in this query to the terms set.
>> Only works if this query is in its {@link #rewrite rewritten} form.**/
>>
>> Thanks in advance!
>>
>> Christian
>>
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>> For additional commands, e-mail: java-user-help@lucene.apache.org

-- 
______________________________________________________________________________
Christian Reuschling, Dipl.-Ing.(BA)
Software Engineer

Knowledge Management Department
German Research Center for Artificial Intelligence DFKI GmbH
Trippstadter Straße 122, D-67663 Kaiserslautern, Germany

Phone: +49.631.20575-1250
mailto:reuschling@dfki.de  http://www.dfki.uni-kl.de/~reuschling/

------------Legal Company Information Required by German Law------------------
Geschäftsführung: Prof. Dr. Dr. h.c. mult. Wolfgang Wahlster (Vorsitzender)
                  Dr. Walter Olthoff
Vorsitzender des Aufsichtsrats: Prof. Dr. h.c. Hans A. Aukes
Amtsgericht Kaiserslautern, HRB 2313
______________________________________________________________________________


