lucene-solr-user mailing list archives

From Upayavira ...@odoko.co.uk>
Subject Re: Avoid re indexing
Date Sat, 01 Aug 2015 22:43:38 GMT


On Sat, Aug 1, 2015, at 11:29 PM, naga sharathrayapati wrote:
> I am using SolrJ to index documents.
> 
> I agree with you regarding the index update, but I should not see any
> deleted documents, as it is a fresh index. Can we actually identify
> which documents those are?

If you post doc 1234, then post doc 1234 a second time, you will see a
deletion in your index, because Solr replaces a document by marking the
old copy as deleted and adding the new one. If you don't want deletions
to show in your index, be sure NEVER to update a document; only add new
ones with absolutely distinct document IDs.
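
Since Solr won't tell you which docs were overwritten, one option is to
check on the client side before you send anything. Here is a minimal
sketch in plain Java (no SolrJ calls) that reports which IDs occur more
than once in a batch. The class and method names are made up for
illustration, and it assumes you can get at the list of IDs you are about
to index:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper, not part of Solr or SolrJ.
class DuplicateIdCheck {

    /** Returns each ID that occurs more than once, in order of first repeat. */
    public static List<String> findRepeatedIds(List<String> ids) {
        Set<String> seen = new HashSet<>();
        Set<String> repeated = new LinkedHashSet<>();
        for (String id : ids) {
            // Set.add returns false when the ID was already present.
            if (!seen.add(id)) {
                repeated.add(id);
            }
        }
        return new ArrayList<>(repeated);
    }

    public static void main(String[] args) {
        List<String> batch = Arrays.asList("1234", "5678", "1234", "9012");
        System.out.println(findRepeatedIds(batch)); // prints [1234]
    }
}
```

Any ID this reports would show up as a deletion once the second copy is
indexed.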

You cannot see (via Solr) which docs are deleted. You could, I suppose,
introspect the Lucene index, but that would most definitely be an expert
task.

> If there is no option for adding shards to an existing collection, I do
> not like the idea of reindexing all the data (hours of work). We started
> with a good number of shards, but the data has grown rapidly over the
> past few days. Do you think it is worth logging a ticket?

You can split an existing shard with the SPLITSHARD action, so no
reindexing is needed. See the Collections API:

https://cwiki.apache.org/confluence/display/solr/Collections+API#CollectionsAPI-api3
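
For illustration, a split is a single HTTP request to the Collections
API. The sketch below only builds the request URL in plain Java; the base
URL, collection name and shard name are placeholders I have assumed, so
substitute your own. The class name is made up too:

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical helper; only assembles the SPLITSHARD request URL.
class SplitShardUrl {

    public static String build(String solrBaseUrl, String collection, String shard) {
        try {
            return solrBaseUrl + "/admin/collections?action=SPLITSHARD"
                    + "&collection=" + URLEncoder.encode(collection, "UTF-8")
                    + "&shard=" + URLEncoder.encode(shard, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is always available on a conforming JVM.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Assumed values: a local node, a collection "mycollection", shard "shard1".
        System.out.println(build("http://localhost:8983/solr", "mycollection", "shard1"));
    }
}
```

You can issue the resulting URL from a browser or curl. The collection
must be running in SolrCloud mode, and it is worth trying the split on a
test collection first.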

What would you want to log a ticket for? I'm not sure that there's
anything that would require that.

Upayavira
