lucene-dev mailing list archives

From Dmitry Serebrennikov <>
Subject Re: VOTE: Possible features for next release
Date Thu, 23 May 2002 23:46:19 GMT
> 2. I see a lot of "problems" when searching and updating on the same index.
> Maybe it is just me, but what I discovered is:
> a) It is not possible to "update" a document; it is only possible to delete
> and re-add, which means: open a reader, do a delete, close the reader, open
> a writer, add the document, optimize, close the writer.
> So is it possible to move the "delete" method from the IndexReader to the
> IndexWriter, or is that impossible for technical reasons? That way we would
> open just the writer to update, delete, and add documents. This is useful
> when the index needs to be updated often.
> b) There is no way to update just one field in a document; you need to
> update the entire document. A field update would be good, though maybe
> this is hard to do.
Both (a) and maybe (b) are also on my wish/todo list. The approach I 
was thinking of taking was to make these "transactional", so that both 
changes are made through an IndexWriter and take effect when the writer 
closes. The old document in an older segment gets "shadowed" by the new 
one until optimization. Once the index is optimized, the old document is 
gone for good. I think this will solve the issue of updates being 
cumbersome and will also make a lot of the concurrency headaches go away.
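
To make the shadowing idea concrete, here is a minimal toy sketch in Python. It is not the Lucene API: `ToyIndex`, `ToyWriter`, and the use of a string key to identify a document are all assumptions made for illustration. Changes are buffered in the writer and become visible only on close, a newer segment shadows an older one, and optimize drops the shadowed copies.

```python
class ToyIndex:
    """Toy model of an index as a list of immutable segments (hypothetical)."""

    def __init__(self):
        # Each segment maps a document key to its content; oldest first.
        # A value of None marks a delete recorded in that segment.
        self.segments = []

    def search(self, key):
        # Walk from newest to oldest so a newer entry shadows older ones.
        for segment in reversed(self.segments):
            if key in segment:
                return segment[key]
        return None

    def optimize(self):
        # Merge all segments; later segments overwrite earlier ones,
        # and delete markers are dropped for good.
        merged = {}
        for segment in self.segments:
            merged.update(segment)
        self.segments = [{k: v for k, v in merged.items() if v is not None}]


class ToyWriter:
    """Buffers adds, updates, and deletes until close() (hypothetical)."""

    def __init__(self, index):
        self.index = index
        self.pending = {}

    def add(self, key, doc):
        self.pending[key] = doc

    def update(self, key, doc):
        # An update is just a new version that shadows the old one.
        self.pending[key] = doc

    def delete(self, key):
        self.pending[key] = None

    def close(self):
        # All buffered changes take effect together, as one new segment.
        if self.pending:
            self.index.segments.append(self.pending)
        self.pending = {}
```

Until `close()` is called, searches still see the old document; afterwards the new segment shadows it, and `optimize()` removes the old copy physically.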

Another feature on my list is reduction in the number of open files. 
This is especially a problem in cases where many indexes are in use at 
the same time. The approach I'm thinking of here is to merge all files 
of a given segment into a single file after that segment is closed. 
Since the segment is never written to after it's first created (besides 
the deleted file?), there should be no problem with fragmentation and 
growth. This can be done as part of the segment merge, so that the 
output of a merge is this new single-file segment rather than the 
existing multi-file one. For extra credit, we can add support for having 
1 to n file handles allocated to this one-file segment, in case the OS 
can better optimize non-contiguous reads when multiple file handles are 
used. Does anyone know?
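
A rough sketch of the single-file idea, again in Python and not tied to any actual Lucene file format: the byte layout (an entry count, then a table of name, offset, and length, then the raw data) and the function names `pack_segment`, `read_table`, and `read_file` are assumptions made up for this example. Because a segment is not rewritten after it is closed, the offsets can be computed once when the file is packed.

```python
import struct


def pack_segment(files):
    """Pack {name: bytes} into one blob: count, table, then the data."""
    names = sorted(files)
    encoded = [name.encode("utf-8") for name in names]
    # Table entry: 2-byte name length, name, 8-byte offset, 8-byte length.
    header_size = 4 + sum(2 + len(enc) + 16 for enc in encoded)
    parts = [struct.pack(">I", len(names))]
    offset = header_size
    for name, enc in zip(names, encoded):
        size = len(files[name])
        parts.append(struct.pack(">H", len(enc)) + enc
                     + struct.pack(">QQ", offset, size))
        offset += size
    parts.extend(files[name] for name in names)
    return b"".join(parts)


def read_table(blob):
    """Return {name: (offset, length)} parsed from the blob's header."""
    (count,) = struct.unpack_from(">I", blob, 0)
    pos = 4
    table = {}
    for _ in range(count):
        (name_len,) = struct.unpack_from(">H", blob, pos)
        pos += 2
        name = blob[pos:pos + name_len].decode("utf-8")
        pos += name_len
        offset, length = struct.unpack_from(">QQ", blob, pos)
        pos += 16
        table[name] = (offset, length)
    return table


def read_file(blob, name):
    """Extract one logical file from the packed segment."""
    offset, length = read_table(blob)[name]
    return blob[offset:offset + length]
```

With a real file on disk, each logical file would be read by seeking to its offset, so one open handle (or a small pool of 1 to n handles over the same file) could serve the whole segment.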

