lucene-java-user mailing list archives

From Matthias Müller <matthias_muel...@tu-dresden.de>
Subject Re: RamDirectory vs MemoryIndex vs MMapDirectory for In-Memory-Index
Date Tue, 25 Sep 2018 12:32:01 GMT
Thanks Dawid, glad I asked!

On Tuesday, 25.09.2018, at 10:46 +0200, Dawid Weiss wrote:
> Use MMapDirectory on a temporary location, Matthias. If you really
> need in-memory indexes, a new Directory implementation is coming
> (RAMDirectory will be deprecated, then removed), but the difference
> compared to MMapDirectory is typically not worth the hassle. See this
> issue for more discussion.
> 
> https://issues.apache.org/jira/browse/LUCENE-8438
> 
> Dawid
> On Tue, Sep 25, 2018 at 10:44 AM Matthias Müller
> <matthias_mueller@tu-dresden.de> wrote:
> > 
> > Hi,
> > 
> > Lucene provides different storage options for in-memory indexes. I
> > found three structures that would qualify for the task:
> > 
> > * RAMDirectory (which I currently use for prototyping, but I wonder
> >   whether it is the ideal choice for my task)
> > * MemoryIndex, which claims to have better performance and resource
> >   use for small documents
> > * MMapDirectory, which should outperform RAMDirectory for huge
> >   indices (what is "huge"?)
> > 
> > 
> > My plan is to periodically index some properties (string codes,
> > longs, lat/lng points) of the contents of a larger database with
> > Lucene for quicker lookups (compared to slow SQL queries).
> > 
> > What would be the most efficient (or intended) storage option for
> > such an index in terms of lookup speed and CPU/memory use? Below [1]
> > is a brief summary of the index contents, and I hope these figures
> > are sufficient to get a recommendation. But I am also happy to study
> > more detailed documentation on the matter.
> > 
> > - Matthias
> > 
> > [1]: Summary of index contents and intended use
> > * Total documents: 500,000 - 1,000,000; may grow to 10,000,000
> >   records in the medium term.
> > * Document fields (all of them single-value fields):
> >     * String (9x), usually 1-10 characters long, mostly recurring
> >       values (5% distinct)
> >     * LongPoint (4x), two fields contain mostly distinct values,
> >       one mostly recurring values (5-10% distinct), one field acts
> >       as a primary key
> >     * LatLonPoint (1x), 30% distinct
> > * Refresh interval: 1..5 minutes (I currently create a fresh index
> >   instance on each update and discard the old one)
> > * Most queries are range queries and exact matches on several
> >   properties; sometimes I need to retrieve the property fields of a
> >   single document based on a primary key value.
> > 
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
> 
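For readers who want to try the suggestion in this thread, here is a minimal sketch of "MMapDirectory on a temporary location" using the field types from the summary in [1]. It is an illustration only, not code from either poster. The class name, the field names ("id", "code", "location"), the temp-directory prefix and the sample values are all made up, and it assumes a Lucene 7.x/8.x classpath (in newer releases `IndexSearcher.doc(int)` has been replaced by a stored-fields accessor).

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.LatLonPoint;
import org.apache.lucene.document.LongPoint;
import org.apache.lucene.document.StoredField;
import org.apache.lucene.document.StringField;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.index.Term;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.TermQuery;
import org.apache.lucene.search.TopDocs;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.MMapDirectory;

// Hypothetical example class; all field names are assumptions.
public class TempMMapIndexSketch {

    /** Builds a fresh index in a new temporary directory and opens a reader on it. */
    static DirectoryReader buildIndex(long[] ids, String[] codes, double[][] latLon)
            throws IOException {
        Path tempDir = Files.createTempDirectory("lookup-index-");
        Directory directory = new MMapDirectory(tempDir);
        IndexWriterConfig config = new IndexWriterConfig(new StandardAnalyzer());
        try (IndexWriter writer = new IndexWriter(directory, config)) {
            for (int i = 0; i < ids.length; i++) {
                Document doc = new Document();
                // LongPoint is index-only; add a StoredField to read the value back.
                doc.add(new LongPoint("id", ids[i]));
                doc.add(new StoredField("id", ids[i]));
                // StringField indexes the whole value as a single, unanalyzed term.
                doc.add(new StringField("code", codes[i], Field.Store.YES));
                doc.add(new LatLonPoint("location", latLon[i][0], latLon[i][1]));
                writer.addDocument(doc);
            }
        } // closing the writer commits the index
        return DirectoryReader.open(directory);
    }

    public static void main(String[] args) throws IOException {
        DirectoryReader reader = buildIndex(
                new long[] {1L, 2L, 3L},
                new String[] {"A1", "B2", "A1"},
                new double[][] {{51.05, 13.74}, {52.52, 13.40}, {48.14, 11.58}});
        IndexSearcher searcher = new IndexSearcher(reader);

        // Exact match on a string code.
        TopDocs byCode = searcher.search(new TermQuery(new Term("code", "A1")), 10);
        // Inclusive range query on a long field.
        TopDocs byRange = searcher.search(LongPoint.newRangeQuery("id", 2L, 3L), 10);
        // Bounding-box query: minLat, maxLat, minLon, maxLon.
        TopDocs byBox = searcher.search(
                LatLonPoint.newBoxQuery("location", 50.0, 52.0, 13.0, 14.0), 10);
        // Primary-key lookup, then load the stored fields of the hit.
        TopDocs byKey = searcher.search(LongPoint.newExactQuery("id", 2L), 1);
        Document hit = searcher.doc(byKey.scoreDocs[0].doc);

        System.out.println("code=A1 hits: " + byCode.scoreDocs.length);
        System.out.println("id in [2,3] hits: " + byRange.scoreDocs.length);
        System.out.println("box hits: " + byBox.scoreDocs.length);
        System.out.println("code for id=2: " + hit.get("code"));

        // Release the reader first, then the directory it was opened on.
        Directory directory = reader.directory();
        reader.close();
        directory.close();
    }
}
```

With the refresh scheme described in [1] (build a fresh index every few minutes, discard the old one), each rebuild would get its own temporary directory. The old reader and directory should be closed and the old files deleted once no search is still using them; that cleanup is left out of the sketch.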
