lucene-solr-user mailing list archives

From atawfik <>
Subject Large scale Update of solr indexed documents
Date Wed, 17 Dec 2014 08:54:20 GMT
Hi all,

I have a scenario where I need to generate summaries of indexed documents.
I initially thought I should do that in Nutch, because I am using Nutch
to push documents to Solr. However, I will need some statistics about terms
and documents, which means I would have to duplicate the analysis in Nutch.
So Nutch is not the right place to handle that.

I ended up with two potential solutions. The first is to use Solr itself,
but I am not sure how to do that.

The second solution is to read directly from the Lucene index, access
whatever statistics I need, and then generate the summary.
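
To make the idea concrete, here is a minimal sketch of one possible way to turn term and document statistics into a summary: score each sentence by the average TF-IDF weight of its terms and keep the best ones. It is only an illustration, not a settled approach. The names `summarize`, `doc_freq`, and `num_docs` are hypothetical; `doc_freq` stands in for per-term document frequencies read from the index, and `tokenize` is a crude stand-in for the real analysis chain.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase and split on non-alphanumerics (a stand-in for the real analyzer)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def summarize(text, doc_freq, num_docs, max_sentences=2):
    """Return the highest-scoring sentences of `text`, in their original order.

    doc_freq: hypothetical mapping of term -> number of indexed documents containing it
    num_docs: hypothetical total number of documents in the index
    """
    # Naive sentence splitting on terminal punctuation followed by whitespace.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    term_freq = Counter(tokenize(text))

    def idf(term):
        # Smoothed inverse document frequency; unseen terms count as df = 0.
        return math.log((num_docs + 1) / (doc_freq.get(term, 0) + 1)) + 1.0

    def score(sentence):
        terms = tokenize(sentence)
        if not terms:
            return 0.0
        # Average rather than sum, so long sentences are not favored automatically.
        return sum(term_freq[t] * idf(t) for t in terms) / len(terms)

    ranked = sorted(range(len(sentences)), key=lambda i: score(sentences[i]), reverse=True)
    keep = sorted(ranked[:max_sentences])
    return [sentences[i] for i in keep]
```

The point of the sketch is that the scoring only needs two index-wide numbers per term (document frequency and total document count) plus the document's own text, so whichever of the two solutions is chosen mainly has to supply those.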

The other challenge is that Solr has around 5 million documents, so the
solution needs to be scalable as well.
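
For the scale concern, one option worth considering is walking the whole index in fixed-size pages with Solr's cursor-based paging, which avoids the cost of deep `start` offsets. The following is a rough sketch, assuming Solr 4.7 or later (where `cursorMark` is available) and a unique key field named `id`; the function names and the `fetch` parameter are hypothetical and exist only so the loop can be tried without a live server.

```python
import json
import urllib.parse
import urllib.request


def http_fetch(url):
    """Fetch a URL and decode the JSON response body."""
    with urllib.request.urlopen(url) as resp:
        return json.load(resp)


def iter_documents(select_url, unique_key="id", rows=1000, fetch=http_fetch):
    """Yield every document matching *:* by following Solr's cursorMark."""
    cursor = "*"  # "*" asks Solr to start a new cursor
    while True:
        params = urllib.parse.urlencode({
            "q": "*:*",
            # Cursor paging requires a sort that includes the unique key field.
            "sort": unique_key + " asc",
            "rows": rows,
            "cursorMark": cursor,
            "wt": "json",
        })
        data = fetch(select_url + "?" + params)
        for doc in data["response"]["docs"]:
            yield doc
        next_cursor = data["nextCursorMark"]
        # Solr returns the same cursor again once the result set is exhausted.
        if next_cursor == cursor:
            break
        cursor = next_cursor
```

Each yielded document could then be summarized and written back in batches, so memory use stays bounded by `rows` rather than by the 5 million total.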

Any ideas or thoughts are very much welcome.

