lucene-solr-user mailing list archives

From Lance Norskog <goks...@gmail.com>
Subject Re: Advice on updating solr indexes
Date Sun, 16 Aug 2009 00:00:23 GMT
There is a special-purpose feature that solves exactly this problem: it
assigns the score for a particular field from a file which contains every
known value of the field and a matching float.

From a quick scan of the code, this seems to be how it works: the
declaration of the field in schema.xml states that its score is derived
from an external file. It can only be done on fields defined as 'float'
(not 'sfloat'). Each query may give a file name and the field name. The
file should be sorted by the values of the field. It is loaded and cached,
so a later query may give only the field name.

Again, this description may not be completely right. (The float/sfloat
detail might be wrong, for example.) The parameters for this feature are
buried in the Solr source, and the wiki does not mention the feature.
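
To make the file format concrete, here is a minimal Python sketch of how a
back-end job might write such a file. It has not been run against Solr and
rests on assumptions that should be checked against FileFloatSource.java:
that each line has the form key=float, that the file is sorted by key, and
that it is named external_<fieldname>. The function name
write_external_scores and the sample document keys are hypothetical.

```python
import os


def write_external_scores(scores, path):
    """Write one 'key=float' line per document, sorted by key.

    Assumption: Solr expects 'key=value' lines sorted by key; verify
    this against FileFloatSource.java before relying on it.
    """
    tmp_path = path + ".tmp"
    with open(tmp_path, "w", encoding="utf-8") as out:
        for key in sorted(scores):
            out.write(f"{key}={float(scores[key])}\n")
    # Replace the old file in one step so a reader never sees a
    # half-written file.
    os.replace(tmp_path, path)
    return path


if __name__ == "__main__":
    # Hypothetical document keys and scores; the file name follows the
    # assumed external_<fieldname> convention.
    write_external_scores({"doc2": 0.5, "doc1": 3}, "external_CompScore")
```

With a file like this, the weekly score computation would only need to
regenerate the file rather than delete and re-insert every document.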

The relevant source files are:

http://svn.apache.org/repos/asf/lucene/solr/trunk/src/java/org/apache/solr/search/function/FileFloatSource.java
and

http://svn.apache.org/repos/asf/lucene/solr/trunk/src/java/org/apache/solr/schema/ExternalFileField.java

I have not tried any of this. If you try this feature and get it
working, please document it on the wiki :) Also, if there are any bugs or
gotchas, please file a Jira issue.
 --
Lance Norskog
goksron@gmail.com

On Sat, Aug 15, 2009 at 7:38 AM, William Pierce <evalsinca@hotmail.com> wrote:

> Folks:
>
> In our app we index approx 50 M documents every so often.  One of the
> fields in each document is called "CompScore" which is a score that our
> back-end computes for each document.  The computation of this score is
> heavy-weight and is done only approximately once every few days.  When
> documents are retrieved during a search we return results sorted by the Solr
> score first and then the CompScore.
>
> The issue we have is this:  Every week or so when the back-end routines run
> to compute "CompScore" we need to delete and insert these 50 M documents
> into the index.  This happens even though a majority of the documents have
> not changed.
>
> I think there is no way in Solr to simply update a field in the index.
>
> If others have encountered a similar issue,  I'd be interested in hearing
> about their solutions!
>
> Best,
>
> - Bill
>
