lucene-java-user mailing list archives

From Michael McCandless <luc...@mikemccandless.com>
Subject Re: Lucene TermsFilter lookup slow
Date Sat, 08 Aug 2015 15:13:14 GMT
Which version of Lucene are you using?  Newer versions have optimized
the "primary key" use case somewhat...

Mike McCandless

http://blog.mikemccandless.com


On Sat, Aug 8, 2015 at 8:32 AM, jamie <jamie@stimulussoft.com> wrote:
> Greetings
>
> Our app primarily uses Lucene for its intended purpose, i.e. to search across
> large amounts of unstructured text. However, our requirements recently
> expanded to include look-ups of specific documents in the index based on
> associated custom-defined unique keys. For our purposes, a unique key is the
> string representation of a 128-bit murmur hash, stored in a Lucene field
> named uid.  We are currently using the TermsFilter to look up documents in
> the Lucene index as follows:
>
> List<Term> terms = new LinkedList<>();
> for (String id : ids) {
>     terms.add(new Term("uid", id));
> }
> TermsFilter idFilter = new TermsFilter(terms);
> ... search logic...
>
> At any time we may need to look up, say, a couple of thousand documents. Our
> problem is one of performance. On very large indexes with 30 million records
> or more, the lookup can be excruciatingly slow. At this stage, it's not
> practical for us to move the data over to a fit-for-purpose database, or to
> change the uid field to a numeric type. I fully appreciate the fact that
> Lucene is not designed to be a database; however, is there anything we can
> do to improve the performance of these look-ups?
>
> Much appreciated
>
> Jamie
>
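
One cheap thing to rule out before tuning Lucene itself is the input to the loop in the quoted code. The sketch below, plain Java with no Lucene dependency, drops null and duplicate ids and returns them in ascending String order, so each distinct uid becomes exactly one Term. The class UidBatch and the method normalize are hypothetical names, not part of Lucene or of the original post, and TermsFilter may already sort and de-duplicate its terms internally, in which case this only makes the input explicit.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.List;
import java.util.TreeSet;

// Hypothetical helper for illustration; not part of Lucene or the original post.
final class UidBatch {

    // Returns the distinct, non-null ids in ascending String order.
    static List<String> normalize(Collection<String> ids) {
        TreeSet<String> sorted = new TreeSet<>();
        for (String id : ids) {
            if (id != null) {
                sorted.add(id); // a TreeSet silently ignores duplicates
            }
        }
        return new ArrayList<>(sorted);
    }

    public static void main(String[] args) {
        // Made-up sample ids; real uids would be the murmur hash strings.
        List<String> ids = Arrays.asList("c3", "a1", "c3", null, "b2");
        System.out.println(normalize(ids));
    }
}
```

The result of normalize(ids) could then be iterated in place of ids when building the list of Term objects. Whether that makes a measurable difference depends on how many duplicates the input carries and on the Lucene version in use.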


