lucene-solr-user mailing list archives

From Upayavira ...@odoko.co.uk>
Subject Re: Avoid re indexing
Date Sat, 01 Aug 2015 22:04:04 GMT


On Sat, Aug 1, 2015, at 10:30 PM, naga sharathrayapati wrote:
> I hit an exception on one of the documents after indexing 6 million
> of 10 million documents. Is there any way I can avoid reindexing the
> 6 million documents?

How are you indexing your documents? Are you using the DIH
(DataImportHandler)? Personally, I'd recommend you write your own app to
push your content to Solr. Then you can handle exceptions precisely and
get the behaviour you expect.
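
As a rough sketch of what such an app could look like (not a definitive
implementation): send documents in batches, and when a batch fails,
resend its documents one at a time so that only the bad ones are
recorded and the rest carry on. The update URL, the collection name
"mycollection", the batch size and the function names are all
assumptions for illustration.

```python
import json
import urllib.request


def post_to_solr(docs, update_url="http://localhost:8983/solr/mycollection/update"):
    """POST a list of documents as JSON to Solr's update handler.

    The URL and collection name are placeholders; adjust for your setup.
    urlopen raises an exception on HTTP 4xx/5xx responses.
    """
    req = urllib.request.Request(
        update_url,
        data=json.dumps(docs).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        resp.read()


def index_in_batches(docs, send=post_to_solr, batch_size=1000):
    """Send docs in batches; on a batch failure, retry its docs one by one.

    Returns a list of (doc, error message) pairs for documents that
    could not be indexed, so the run never stops on a single bad document.
    """
    failed = []
    batch = []

    def flush():
        if not batch:
            return
        try:
            send(list(batch))
        except Exception:
            # Something in this batch was rejected: isolate the culprit(s).
            for doc in batch:
                try:
                    send([doc])
                except Exception as exc:
                    failed.append((doc, str(exc)))
        batch.clear()

    for doc in docs:
        batch.append(doc)
        if len(batch) >= batch_size:
            flush()
    flush()
    return failed
```

Resending a document that was already accepted earlier in a failed
batch is harmless as long as it carries the same unique ID, since it
simply overwrites itself. You would still need to issue a commit at the
end (or rely on autoCommit), which this sketch leaves out.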

> I also see that a few documents are deleted (based on the count)
> while indexing. Is there a way to identify which documents those are?

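If you see deleted documents but are not actually deleting any, this
will be because you have indexed documents with an ID that already
exists in the index. An update is actually a delete followed by an
insert, so each overwritten document counts as deleted until its
segment is merged.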
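
One way to find out which documents those are is to look for repeated
IDs in the source data before it reaches Solr. A minimal sketch,
assuming each document is a dict and the unique key field is called
"id" (both assumptions; the function name is made up for illustration):

```python
from collections import Counter


def duplicate_ids(docs, key="id"):
    """Return {id: count} for every ID that occurs more than once in docs.

    Each extra occurrence overwrites the previous document in Solr,
    which is what shows up as a deleted document in the index counts.
    """
    counts = Counter(doc[key] for doc in docs)
    return {doc_id: n for doc_id, n in counts.items() if n > 1}
```

For 10 million documents you would feed this from an iterator over
your source rather than a list held in memory; the Counter only keeps
the IDs, not the documents.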

> Can I add a shard to a collection without reindexing?

You cannot just add a new shard to an existing collection (at least, not
one that is using the compositeId router, which is the default). If a
shard is too large, you will need to split an existing shard, which you
can do with the SPLITSHARD action of the Collections API.
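
For illustration, a split request is just an HTTP call to the
Collections API. The sketch below builds and sends one; the host, the
collection name and the shard name are placeholders you would replace
with your own, and the helper names are made up.

```python
import urllib.request
from urllib.parse import urlencode


def split_shard_url(collection, shard, base_url="http://localhost:8983/solr"):
    """Build a Collections API SPLITSHARD URL (base_url is an assumption)."""
    params = urlencode({
        "action": "SPLITSHARD",
        "collection": collection,
        "shard": shard,
        "wt": "json",
    })
    return f"{base_url}/admin/collections?{params}"


def split_shard(collection, shard, base_url="http://localhost:8983/solr"):
    """Ask Solr to split the given shard and return the raw response body."""
    with urllib.request.urlopen(split_shard_url(collection, shard, base_url)) as resp:
        return resp.read().decode("utf-8")
```

Calling split_shard("mycollection", "shard1") would ask Solr to split
shard1 into two sub-shards. Splitting a large shard can take a while,
so try it on a test collection first.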

It is much better, though, to start with the right number of shards if
at all possible.

Upayavira
