lucene-solr-user mailing list archives

From Jonathan Rochkind <rochk...@jhu.edu>
Subject Re: Process entire result set
Date Thu, 05 Aug 2010 23:59:43 GMT
Eloi Rocha wrote:
> Hi everybody,
>
> I would like to know whether it makes sense to use Solr in the following
> scenario:
>   - searches return a large amount of data (like 1000, 10000, 100000 records)
>   - each record contains four or five fields (strings and integers)
>   - every request will be for the entire result set (I can paginate the
> results). It would be much better to get all results at once [...]
>   

That depends on what kind of searching you're doing. Does it need an
indexer like Solr? Then Solr is a good tool for the job. If it doesn't,
and you can do what you want just as easily in an RDBMS or a non-SQL
store like MongoDB, then I wouldn't use Solr.

Assuming you really do need Solr, I think this should work, but I would
not keep the actual stored fields in Solr. I'd keep those fields in an
external store (key-value store, RDBMS, whatever). Index only what you
need to search on in Solr, run your search, and get IDs back. Ask for
the entire result set back, why not. If you give Solr enough RAM and
set your cache settings appropriately (really big document and related
caches), then I _think_ it should perform okay. One way to find out.

What you'd get back is just IDs; you'd then look up each ID in your
external store to get the actual fields you want to operate on. This _may_
not be necessary; maybe you could do it with Solr stored fields. But
making Solr do only exactly what you really need from it (an index) will
maximize its ability to do what you need in available RAM.
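
A rough sketch of that two-step lookup, in Python, in case it helps. It is
only an illustration under assumptions that are not from your setup: the
Solr base URL, a unique key field called `id`, the JSON response writer
(`wt=json`), and a hypothetical SQLite table `records(id, name, qty)`
standing in for whatever external store you pick.

```python
import json
import sqlite3
from urllib.parse import urlencode
from urllib.request import urlopen


def build_select_url(base_url, query, rows):
    # fl=id asks Solr to return only the ID field for each hit;
    # rows is set high enough to cover the whole result set.
    params = {"q": query, "fl": "id", "rows": rows, "wt": "json"}
    return base_url + "/select?" + urlencode(params)


def ids_from_response(body):
    # Solr's JSON response keeps the hits under response.docs.
    docs = json.loads(body)["response"]["docs"]
    return [doc["id"] for doc in docs]


def search_ids(base_url, query, rows):
    # base_url is an assumption, e.g. "http://localhost:8983/solr".
    with urlopen(build_select_url(base_url, query, rows)) as resp:
        return ids_from_response(resp.read().decode("utf-8"))


def fetch_records(conn, ids, chunk_size=500):
    # Look the IDs up in the external store (hypothetical table
    # "records"), in chunks so no single IN (...) list gets too long.
    found = {}
    for start in range(0, len(ids), chunk_size):
        chunk = ids[start:start + chunk_size]
        placeholders = ",".join("?" * len(chunk))
        sql = f"SELECT id, name, qty FROM records WHERE id IN ({placeholders})"
        for row in conn.execute(sql, chunk):
            found[row[0]] = row
    # Return rows in the order Solr ranked them, skipping missing IDs.
    return [found[i] for i in ids if i in found]
```

Usage would be something like
`fetch_records(conn, search_ids(base_url, "some query", 100000))`. The
chunking is there because many stores limit how many parameters one
statement can take.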

If you don't need Solr/Lucene indexing/faceting behavior, and you can do
just fine with an RDBMS or non-SQL store, use that.

Jonathan
