lucene-java-user mailing list archives

From Gimantha Bandara <giman...@wso2.com>
Subject Re: How to merge several Taxonomy indexes
Date Thu, 02 Apr 2015 15:27:17 GMT
Hi Shai,

Currently I am using a DB, but the platform we are developing needs to
support RDBMS, HBase and other datasource types for storing indices, so
the user should be able to use whatever underlying filesystem he wants.
I am not sure whether Solr can support multiple datasource types. I would
like to continue with Lucene and MMapDirectory. I will update if I have a
question. Thanks a lot!

On Thu, Apr 2, 2015 at 5:39 PM, Shai Erera <serera@gmail.com> wrote:

> MMapDirectory uses memory-mapped files. This is an operating system level
> feature, where even though the file resides on disk, the OS can memory-map
> it and access it more efficiently. It is loaded into memory outside the JVM
> heap, and usually on a properly configured server you should not worry
> about running out of memory, since if the file cannot be brought into
> memory, it's accessed from disk.
>
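
The memory-mapping behaviour described above can be sketched with plain java.nio, without Lucene. This is only a minimal illustration and not how MMapDirectory is implemented internally; the class name MmapSketch and the helper mapReadOnly are hypothetical names introduced for this example. It shows that a mapped file is exposed as a direct buffer, i.e. one whose contents live outside the JVM heap and are paged in by the operating system on demand.

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class MmapSketch {

    /**
     * Maps the whole file read-only (hypothetical helper for illustration).
     * The returned buffer is backed by the OS page cache, not the Java heap.
     */
    public static MappedByteBuffer mapReadOnly(Path file) throws IOException {
        try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
            // The mapping remains valid after the channel is closed.
            return channel.map(FileChannel.MapMode.READ_ONLY, 0, channel.size());
        }
    }

    public static void main(String[] args) throws IOException {
        // Create a small temporary file to stand in for an index file.
        Path file = Files.createTempFile("mmap-sketch", ".bin");
        file.toFile().deleteOnExit();
        Files.write(file, "hello".getBytes(StandardCharsets.UTF_8));

        MappedByteBuffer buffer = mapReadOnly(file);
        System.out.println("direct (off-heap): " + buffer.isDirect());
        System.out.println("mapped bytes: " + buffer.capacity());
        System.out.println("first byte: " + (char) buffer.get(0));
    }
}
```

Because the mapped pages are managed by the OS rather than the garbage collector, a file larger than the available RAM can still be mapped on a 64-bit JVM; pages that are not resident are simply read from disk when touched.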
> You mentioned that you store the index in a DB, which is distributed. Have
> you considered using Solr for managing your distributed index? It might be
> better than storing it in a DB, merging taxonomies for search etc. and Solr
> has quite rich faceted search capabilities.
>
> On Thu, Apr 2, 2015 at 1:51 PM, Gimantha Bandara <gimantha@wso2.com>
> wrote:
>
> > Btw, I was using a RAMDirectory just for testing purposes.
> >
> > On Thu, Apr 2, 2015 at 5:16 PM, Gimantha Bandara <gimantha@wso2.com>
> > wrote:
> >
> > > Hi Christoph and Shai,
> > >
> > > > Thanks for the quick response!
> > > > The indices are stored in a relational database (using a custom Directory
> > > > implementation). The problem is that the indices are sharded (both the
> > > > taxonomy indices and the normal doc indices), so when a user wants to
> > > > drill down, I have to merge all the indices. For that I used
> > > > TaxonomyMergeUtils (which works perfectly). For now I am using a
> > > > RAMDirectory for the merged indices. However, the indices can grow
> > > > larger over time. MMapDirectory also uses memory, right? Can it deal
> > > > with a possible out-of-memory issue?
> > > >
> > > > I am thinking of using the same database to store the merged indices. But
> > > > the problem is that the original sharded indices can be updated when new
> > > > entries come in, so the final merged indices also need to be updated
> > > > accordingly.
> > >
> > > On Thu, Apr 2, 2015 at 4:55 PM, Shai Erera <serera@gmail.com> wrote:
> > >
> > >> In some cases, MMapDirectory offers even better performance, since the
> > >> JVM doesn't need to manage that RAM when it's doing GC.
> > >>
> > >> Also, using only RAMDirectory is not safe in that if the JVM crashes,
> > >> your index is lost.
> > >>
> > >> On Thu, Apr 2, 2015 at 12:54 PM, Christoph Kaser
> > >> <lucene_list@iconparc.de> wrote:
> > >>
> > >> > Hi Gimantha,
> > >> >
> > >> > why do you use a RAMDirectory? If your merged index fits into RAM
> > >> > completely, a MMapDirectory should offer almost the same performance.
> > >> > And if not, it is definitely the better choice.
> > >> >
> > >> > Regards
> > >> > Christoph
> > >> >
> > >> >
> > >> > On 02.04.2015 at 12:38, Gimantha Bandara wrote:
> > >> >
> > >> >> Hi All,
> > >> >>
> > >> >> I have successfully set up the merged indices, and drill-down and the
> > >> >> usual search operations work perfectly.
> > >> >> But I have a side question. If I select RAMDirectory as the destination
> > >> >> for the merged indices, the JVM can probably run out of memory if the
> > >> >> merged indices are too big. Is there a way I can handle this issue?
> > >> >>
> > >> >> On Tue, Mar 24, 2015 at 12:18 PM, Gimantha Bandara
> > >> >> <gimantha@wso2.com> wrote:
> > >> >>
> > >> >>> Hi Christoph,
> > >> >>>
> > >> >>> My mistake. :) It does exactly what I need. I figured it out later.
> > >> >>> Thanks a lot!
> > >> >>>
> > >> >>> On Tue, Mar 24, 2015 at 3:14 AM, Gimantha Bandara
> > >> >>> <gimantha@wso2.com> wrote:
> > >> >>>
> > >> >>>> Hi Christoph,
> > >> >>>>
> > >> >>>> I think TaxonomyMergeUtils is for merging a taxonomy directory and an
> > >> >>>> index together (correct me if I am wrong). Can it be used to merge
> > >> >>>> several taxonomy directories together and create one taxonomy index?
> > >> >>>>
> > >> >>>> On Mon, Mar 23, 2015 at 9:19 PM, Christoph Kaser
> > >> >>>> <lucene_list@iconparc.de> wrote:
> > >> >>>>
> > >> >>>>> Hi Gimantha,
> > >> >>>>>
> > >> >>>>> have a look at the class
> > >> >>>>> org.apache.lucene.facet.taxonomy.TaxonomyMergeUtils,
> > >> >>>>> which does exactly what you need.
> > >> >>>>>
> > >> >>>>> Best regards,
> > >> >>>>> Christoph
> > >> >>>>>
> > >> >>>>> On 23.03.2015 at 15:44, Gimantha Bandara wrote:
> > >> >>>>>
> > >> >>>>>> Hi all,
> > >> >>>>>>
> > >> >>>>>> Can anyone point me to how to merge several taxonomy indexes? My
> > >> >>>>>> requirement is as follows. I have several taxonomy indexes and normal
> > >> >>>>>> document indexes. I want to merge the taxonomy indexes together and
> > >> >>>>>> the document indexes together, and perform searches on them. One part
> > >> >>>>>> I have figured out, and it is easy: to merge document indexes, all I
> > >> >>>>>> have to do is create a MultiReader and pass it to IndexSearcher. But
> > >> >>>>>> I am stuck at merging the taxonomy indexes. Is there a way to merge
> > >> >>>>>> taxonomy indexes?
> > >> >>>>>>
> > >> >>>>> --
> > >> >>>>> Dipl.-Inf. Christoph Kaser
> > >> >>>>>
> > >> >>>>> IconParc GmbH
> > >> >>>>> Sophienstrasse 1
> > >> >>>>> 80333 München
> > >> >>>>>
> > >> >>>>> www.iconparc.de
> > >> >>>>>
> > >> >>>>> Tel +49 -89- 15 90 06 - 21
> > >> >>>>> Fax +49 -89- 15 90 06 - 49
> > >> >>>>>
> > >> >>>>> Management: Dipl.-Ing. Roland Brückner, Dipl.-Inf. Sven Angerer.
> > >> >>>>> HRB 121830, Amtsgericht München
> > >> >>>>>
> > >> >>>>>
> > >> >>>>>
> > >> >>>>>
> > >> >>>>> ---------------------------------------------------------------------
> > >> >>>>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> > >> >>>>> For additional commands, e-mail: java-user-help@lucene.apache.org
> > >> >>>>>
> > >> >>>>>
> > >> >>>> --
> > >> >>>> Gimantha Bandara
> > >> >>>> Software Engineer
> > >> >>>> WSO2. Inc : http://wso2.com
> > >> >>>> Mobile : +94714961919
> > >> >>>>
> > >> >>>
> > >> >>>
> > >> >>
> > >> >
> > >>
> > >
> > >
> > >
> > > --
> > > Gimantha Bandara
> > > Software Engineer
> > > WSO2. Inc : http://wso2.com
> > > Mobile : +94714961919
> > >
> >
> >
> >
> > --
> > Gimantha Bandara
> > Software Engineer
> > WSO2. Inc : http://wso2.com
> > Mobile : +94714961919
> >
>



-- 
Gimantha Bandara
Software Engineer
WSO2. Inc : http://wso2.com
Mobile : +94714961919
