lucene-solr-user mailing list archives

From Paul Rosen <p...@performantsoftware.com>
Subject relevancy and merging
Date Mon, 26 Oct 2009 21:40:54 GMT
Is there any difference in the relevancy score between a document that 
was added directly to an index and the same document that got into the 
index through a merge?

In other words, I'd like to build my index in pieces (since people in 
different cities will be working on parts of it), but I want the search 
results to be as if it were one index.

My first thought was to keep the indexes separate and use multicore 
shards to search both indexes. I decided against that for two reasons:

1) It is slower.
2) The relevancy scores are wrong, since word frequencies are very 
different in the two indexes.
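
To illustrate the second point, here is a minimal sketch of how the inverse document frequency (IDF) part of a score can differ between two separate indexes and one combined index. It assumes the classic Lucene formula idf = 1 + ln(numDocs / (docFreq + 1)); the document counts below are invented for illustration and do not come from any real index.

```python
import math


def idf(doc_freq, num_docs):
    """Classic Lucene-style IDF: 1 + ln(numDocs / (docFreq + 1))."""
    return 1.0 + math.log(num_docs / (doc_freq + 1))


# Hypothetical statistics for one term in two separately built indexes.
index_a = {"num_docs": 1000, "doc_freq": 9}    # term is rare here
index_b = {"num_docs": 1000, "doc_freq": 499}  # term is common here

idf_a = idf(index_a["doc_freq"], index_a["num_docs"])
idf_b = idf(index_b["doc_freq"], index_b["num_docs"])

# A single combined index sees the summed statistics.
idf_combined = idf(
    index_a["doc_freq"] + index_b["doc_freq"],
    index_a["num_docs"] + index_b["num_docs"],
)

print(f"index A:  {idf_a:.3f}")
print(f"index B:  {idf_b:.3f}")
print(f"combined: {idf_combined:.3f}")
```

With these made-up numbers, the same term gets a much higher weight in index A than in index B, and the combined index falls in between, so scores computed per index are not directly comparable.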

My second thought is to have the people work on separate indexes, and 
merge them together just before going to production. That would 
definitely solve the first problem, but I don't know if it solves the 
second.

I also don't know how to test that myself. I could build my index both 
ways, run a search, and compare the results, but how conclusive that is 
depends on the particular words I use in the search. Is there a way to 
dump everything about a particular document, so I could compare the two 
indexes? Are there other tools available that would help?
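
The build-both-ways comparison could be sketched roughly as follows. This assumes the top results of the same query against each index have already been collected as lists of (document id, score) pairs; the function name `compare_rankings` and the sample data are hypothetical, not part of any Solr or Lucene API.

```python
def compare_rankings(results_a, results_b, tolerance=1e-6):
    """Compare two ranked lists of (doc_id, score) pairs.

    Reports whether the order matches, which documents appear in only
    one list, and which shared documents have different scores.
    """
    scores_a = dict(results_a)
    scores_b = dict(results_b)
    order_a = [doc_id for doc_id, _ in results_a]
    order_b = [doc_id for doc_id, _ in results_b]

    score_diffs = {
        doc_id: (scores_a[doc_id], scores_b[doc_id])
        for doc_id in order_a
        if doc_id in scores_b
        and abs(scores_a[doc_id] - scores_b[doc_id]) > tolerance
    }
    return {
        "same_order": order_a == order_b,
        "only_in_a": sorted(set(order_a) - set(order_b)),
        "only_in_b": sorted(set(order_b) - set(order_a)),
        "score_diffs": score_diffs,
    }


# Invented example: results from the directly built index vs. the merged one.
direct = [("doc1", 2.5), ("doc2", 1.75), ("doc3", 0.5)]
merged = [("doc1", 2.5), ("doc3", 0.75), ("doc2", 0.5)]

report = compare_rankings(direct, merged)
print(report)
```

Running this over a spread of queries, including both rare and common words, would reduce the dependence on any one choice of search terms.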

Thanks for any insight.
