lucene-java-user mailing list archives

From Carsten Schnober
Subject Update a bunch of documents
Date Thu, 11 Apr 2013 15:46:08 GMT
I have the following scenario: I have a very large index (I am currently
testing with around 200,000 documents, but it should scale to many
millions) and I want to run a search on one particular field. Based on
the results of that search, I would like to modify a different field in
all the matching documents.
The only approach I have come up with so far is to collect the IDs of
the matching documents in a Collector, iterate over them, load the
Document objects with IndexReader.document(docid), and modify them one
by one. Finally, I would delete all the documents matching the initial
query with IndexWriter.deleteDocuments(Query query) and write the edited
ones back with IndexWriter.addDocuments(Iterable<? extends
Iterable<? extends IndexableField>> docs).
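
To make the steps above concrete, here is a minimal sketch that models
them with plain Java collections instead of Lucene. This is only an
illustration under stated assumptions: the class BulkUpdateSketch, its
methods, and the field names "lang" and "status" are all hypothetical,
the list stands in for the index, and list positions stand in for doc
ids. In the real workflow, collectMatches would correspond to the
Collector, the copy step to IndexReader.document(docid), and the
remove/addAll steps to IndexWriter.deleteDocuments and
IndexWriter.addDocuments.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Lucene-free model of the workflow: documents are plain field maps and the
// "index" is a list. All names here are hypothetical, for illustration only.
final class BulkUpdateSketch {

    // Step 1: collect the positions ("doc ids") of documents whose
    // searchField equals the given term (a stand-in for running a query).
    static List<Integer> collectMatches(List<Map<String, String>> index,
                                        String searchField, String term) {
        List<Integer> ids = new ArrayList<>();
        for (int i = 0; i < index.size(); i++) {
            if (term.equals(index.get(i).get(searchField))) {
                ids.add(i);
            }
        }
        return ids;
    }

    // Steps 2-4: load each match, edit one field on a copy, delete the
    // originals, then append the edited copies. Returns the number updated.
    static int updateMatching(List<Map<String, String>> index,
                              String searchField, String term,
                              String targetField, String newValue) {
        List<Integer> ids = collectMatches(index, searchField, term);
        List<Map<String, String>> edited = new ArrayList<>();
        for (int id : ids) {
            Map<String, String> copy = new HashMap<>(index.get(id));
            copy.put(targetField, newValue);
            edited.add(copy);
        }
        // Delete from the highest position down so that earlier removals
        // do not shift the positions still waiting to be removed.
        for (int i = ids.size() - 1; i >= 0; i--) {
            index.remove(ids.get(i).intValue());
        }
        index.addAll(edited);
        return edited.size();
    }

    // Small helper that builds a two-field document.
    static Map<String, String> doc(String lang, String status) {
        Map<String, String> d = new HashMap<>();
        d.put("lang", lang);
        d.put("status", status);
        return d;
    }

    public static void main(String[] args) {
        List<Map<String, String>> index = new ArrayList<>();
        index.add(doc("de", "raw"));
        index.add(doc("en", "raw"));
        index.add(doc("de", "raw"));
        int updated = updateMatching(index, "lang", "de", "status", "checked");
        System.out.println("updated " + updated + " of " + index.size());
    }
}
```

Even in this toy model, every matching document is copied in full
although only one field changes, which is exactly the cost I am worried
about.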

However, the iteration seems to be very time-consuming, since it can
affect a large portion of the indexed documents, and I wonder whether
there is a smarter way to perform the document manipulation. The change
is limited to a single field (and not the one on which the query is
typically performed!). Shouldn't that help?


Institut für Deutsche Sprache |
Projekt KorAP                 |
Tel. +49-(0)621-43740789      |
Korpusanalyseplattform der nächsten Generation
Next Generation Corpus Analysis Platform
