lucene-solr-user mailing list archives

From Thorsten Scherler <thorsten.scherler....@juntadeandalucia.es>
Subject big index vs. lots of small ones
Date Wed, 20 Jan 2010 12:55:10 GMT
Hi all,

I have to do an analysis of the following use case.

I am working as a consultant in a public company. We are discussing
offering each public institution its own search server in the future,
(probably) based on Apache Solr. However, the users of our portal should
be able to search all indexes.

The problematic part for our customer is that a meta search over several
indexes, which merges the responses afterwards, will change the scoring.

Imagine you have the two indexes
- public health department (A)
- press relations department (B)

Now you have 300 documents in A and only one in B about "influenza A".
The B server will return the only document in its index with a very high
score, since being the only one it gets a very high "base" score,
correct?

On the other hand A may have much more important documents but they will
not get the same "base" score.

This means that after a merge the document from server B will most
likely be at the top of the list.
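
To make the effect concrete, here is a minimal sketch. It assumes the
"base" score corresponds to the inverse document frequency (IDF) of
Lucene's classic TF-IDF similarity, 1 + ln(numDocs / (docFreq + 1)), and
it treats "influenza A" as a single term. The index sizes of 10,000
documents each are hypothetical; only the match counts (300 and 1) come
from the example above.

```python
import math


def classic_idf(num_docs: int, doc_freq: int) -> float:
    """IDF as in Lucene's classic TF-IDF similarity."""
    return 1.0 + math.log(num_docs / (doc_freq + 1))


# Hypothetical index sizes; doc_freq is the number of matching documents.
INDEX_A = {"num_docs": 10_000, "doc_freq": 300}  # public health department
INDEX_B = {"num_docs": 10_000, "doc_freq": 1}    # press relations department

# Each standalone server computes IDF from its own statistics only.
idf_a = classic_idf(**INDEX_A)
idf_b = classic_idf(**INDEX_B)

# In one merged index the document counts and match counts add up,
# so every document is scored with the same IDF.
idf_merged = classic_idf(
    INDEX_A["num_docs"] + INDEX_B["num_docs"],
    INDEX_A["doc_freq"] + INDEX_B["doc_freq"],
)

print(f"IDF on server A:     {idf_a:.2f}")
print(f"IDF on server B:     {idf_b:.2f}")
print(f"IDF in merged index: {idf_merged:.2f}")
```

Under these assumptions the term looks much rarer on server B than on
server A, so B's single document starts with a higher IDF than any
document from A, while in a merged index both would share one value.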

To prevent this phenomenon we are looking into merging all the
standalone indexes into one big index, but that will lead us to other
problems because it will become pretty big pretty fast.

So here are my questions:

- What are other people doing to solve this problem?
- What is the best way with Solr to solve the problem of the "base"
scoring?
- What is the best way to have multiple indexes in Solr?
- Is it possible to get rid of the "base" scoring in Solr?

TIA for any information.

salu2
-- 
Thorsten Scherler <thorsten.at.apache.org>
Open Source Java <consulting, training and solutions>

Sociedad Andaluza para el Desarrollo de la Sociedad 
de la Información, S.A.U. (SADESI)




