lucene-solr-user mailing list archives

From Aditya <findbestopensou...@gmail.com>
Subject Re: Advise on an architecture with lot of cores
Date Thu, 09 Oct 2014 07:44:41 GMT
Hi Manoj

There are advantages to both approaches. I recently read an article,
http://lucidworks.com/blog/podcast-solr-at-scale-at-aol/ . AOL uses Solr
with one core per user.

Having one core per customer helps you:
1. Easily migrate / back up the index
2. Load the core only when required. When a user signs in, load their
index; otherwise you don't need to keep their data in memory.
3. Rebuilding the index for a particular user is easier
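
Point 2 maps to Solr's transient core support (core discovery, Solr 4.4+).
A minimal sketch of a per-customer core.properties, assuming a hypothetical
core named "A1":

```
# core.properties for one customer's core
name=A1
loadOnStartup=false   # don't load this core at startup
transient=true        # eligible for unloading when the transient cache is full
```

The number of simultaneously loaded transient cores is capped by
transientCacheSize in solr.xml; least-recently-used cores are unloaded
when the cap is reached.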

Cons:
1. If most users are actively signing in and you need to keep most of the
cores loaded all the time, search will get slower.
2. Each core keeps a set of files open, so there can be situations where
you end up with a "too many open files" exception. (We faced this
scenario.)
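
On Linux the usual fix is to raise the per-process open-file limit for the
Solr user (e.g. in /etc/security/limits.conf or the service unit). As an
illustrative sketch, not part of Solr, the same limit can be inspected and
raised programmatically:

```python
import resource

# Each loaded core keeps many index files open; with hundreds of cores
# the default soft limit (often 1024) is easy to exhaust.
soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
print("open files: soft=%d, hard=%d" % (soft, hard))

# Raise the soft limit up to the hard limit for the current process
# (raising the hard limit itself requires privileges).
resource.setrlimit(resource.RLIMIT_NOFILE, (hard, hard))
```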


Having a single core for all customers:
1. This removes the headache of user-specific handling; you treat the DB /
index as a black box that you can query across all customers
2. When the load grows, shard it

Cons:
1. Rebuilding the index will take more time
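
With a shared core, per-customer isolation typically becomes a filter
query on a customer field. A sketch that just builds such a request URL
(the field name customer_id and core name typeA are hypothetical):

```python
from urllib.parse import urlencode

def customer_query_url(core, q, customer_id):
    """Build a /select URL that restricts a shared core to one customer."""
    params = urlencode({
        "q": q,
        # fq keeps customers separated without affecting scoring
        "fq": "customer_id:%s" % customer_id,
        "wt": "json",
    })
    return "http://localhost:8983/solr/%s/select?%s" % (core, params)

print(customer_query_url("typeA", "widget", 42))
# -> http://localhost:8983/solr/typeA/select?q=widget&fq=customer_id%3A42&wt=json
```

Since fq results are cached in Solr's filterCache, repeated queries for
the same customer reuse the cached filter cheaply.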

Regards
Aditya
www.findbestopensource.com

On Tue, Oct 7, 2014 at 8:01 PM, Manoj Bharadwaj <mbharadwaj@gmail.com>
wrote:

> Hi Toke,
>
> I don't think I answered your question properly.
>
> With the current 1 core/customer setup many cores are idle. The redesign we
> are working on will move most of our searches to being driven by SOLR vs
> database (current split is 90% database, 10% solr). With that change, all
> cores will see traffic.
>
> We have 25G data in the index (across all cores) and they are currently in
> a 2 core VM with 32G memory. We are making some changes to the schema and
> the analyzers and we see the index size growing by 25% or so due to this.
> And to support this we will be moving to a VM with 4 cores and 64G memory.
> Hardware as such isn't a constraint.
>
> Regards
> Manoj
>
> On Tue, Oct 7, 2014 at 8:47 AM, Toke Eskildsen <te@statsbiblioteket.dk>
> wrote:
>
> > On Tue, 2014-10-07 at 14:27 +0200, Manoj Bharadwaj wrote:
> > > My team inherited a SOLR setup with an architecture that has a core
> > > for every customer. We have a few different types of cores, say "A",
> > > "B", "C", and for each of these there is a core per customer - namely
> > > "A1", "A2"..., "B1", "B2"... Overall we have over 600 cores. We don't
> > > know the history behind the current design - the exact reasons why it
> > > was done the way it was done - one probable consideration was to keep
> > > each customer's data separate from the others'.
> >
> > It is not a bad reason. It ensures that ranked search is optimized
> > towards each customer's data and makes it easy to manage adding and
> > removing customers.
> >
> > > We want to go to a single core per type architecture, and move on to
> > SOLR
> > > cloud as well in near future to achieve sharding via the features cloud
> > > provides.
> >
> > If the setup is heavily queried on most of the cores or there are
> > core-spanning searches, collapsing the user-specific cores into fewer
> > super-cores might lower hardware requirements a bit. On the other hand,
> > if most of the cores are idle most of the time, the 1 core/customer
> > setup would give better utilization of the hardware.
> >
> > Why do you want to collapse the cores?
> >
> > - Toke Eskildsen, State and University Library, Denmark
> >
> >
> >
>
