lucene-solr-user mailing list archives

From Alexander Aristov <>
Subject solr keep old docs
Date Mon, 26 Dec 2011 20:26:24 GMT
Hi people,

I urgently need your help!

I have Solr 3.3 configured and running. I do incremental indexing 4 times a
day using bulk updates. Some documents are identical to some extent and I
wish to skip them rather than index them.
The problem is that I could not find a way to tell Solr to ignore new
duplicate docs and keep the old indexed docs. I don't care that a doc is new.
Solr should just determine by ID that such a document is already in the
index, and that's it.

I use SolrJ for indexing. I have tried setting overwrite=false and the dedupe
approach, but neither helped. Either the newer doc overwrites the old one or
I get a duplicate.

I think it's a very simple and basic feature and it must exist. What did I
do wrong, or what did I miss?

I tried Google but couldn't find a solution there, although many people
have encountered this problem.

I am starting to think that I must query the index to check whether a doc to
be added is already in the index and, if so, leave it out of the array, but I
have so many docs that I am afraid it's not a good solution.
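
A minimal sketch of the check-before-add idea described above, written so
that it costs one lookup per batch of IDs rather than one per document. It
is plain Java with no Solr calls: the class name NewDocFilter, the method
filterNew, the batchSize parameter, and the lookupExisting hook are all
hypothetical names introduced for illustration. The hook stands in for
whatever query returns the subset of a batch of IDs that is already indexed;
how that query is issued against Solr is left as an assumption.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Function;

// Hypothetical helper: decides which incoming document IDs are new.
final class NewDocFilter {

    private NewDocFilter() {
    }

    /**
     * Returns the candidate IDs that are not in the index yet, in their
     * original order.
     *
     * lookupExisting is an assumed hook: given a batch of IDs, it returns
     * those that are already indexed (for example via one query per batch).
     */
    static List<String> filterNew(List<String> candidateIds,
                                  int batchSize,
                                  Function<List<String>, Collection<String>> lookupExisting) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<String> fresh = new ArrayList<>();
        // Tracks IDs already accepted, so duplicates inside the incoming
        // data are dropped as well (the first occurrence wins).
        Set<String> accepted = new HashSet<>();
        for (int start = 0; start < candidateIds.size(); start += batchSize) {
            int end = Math.min(start + batchSize, candidateIds.size());
            List<String> batch = candidateIds.subList(start, end);
            // One lookup for the whole batch instead of one per document.
            Set<String> existing = new HashSet<>(lookupExisting.apply(batch));
            for (String id : batch) {
                if (!existing.contains(id) && accepted.add(id)) {
                    fresh.add(id);
                }
            }
        }
        return fresh;
    }
}
```

Only the documents whose IDs come back from filterNew would then be put into
the array that is sent to Solr. The number of lookups grows with the number
of batches, not the number of docs, which may make the approach workable for
large updates. It is not atomic, though: a doc indexed by another process
between the lookup and the add would still be overwritten or duplicated.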

Best Regards
Alexander Aristov
