lucene-java-user mailing list archives

From Michael Poindexter
Subject NIO2 Directory implementations
Date Sun, 17 Mar 2013 02:13:56 GMT
As part of a project using Lucene I have implemented a trio of Directories
roughly corresponding to the FSDirectory implementations in core.  These
directory implementations use the NIO2 APIs in JDK7 when opening files.
This ensures that on Windows the files are opened in a mode that allows
deletion even if the file is open elsewhere.

1.) JDK7MMapDirectory - Roughly the same as MMapDirectory.  Uses an NIO2
call (instead of RandomAccessFile) to create a FileChannel that then has
map() called on it to create the mapped buffers.
2.) JDK7NIOFSDirectory - Roughly the same as NIOFSDirectory, but uses an
NIO2 call to create the file channel instead of RandomAccessFile.
3.) JDK7AsyncFSDirectory - This one is new and different.  I needed a
replacement for SimpleFSDirectory that was not susceptible to problems if
interrupt()'ed and did not have the synchronization problems that
NIOFSDirectory has on Windows.  This one can be used wherever
SimpleFSDirectory could have been used, but uses an AsynchronousFileChannel
to do its work.  The actual operation is still synchronous, but on Windows
AsynchronousFileChannel uses overlapped IO, and hence does not require
synchronization on the position and should be safe for interrupts.
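
For anyone not familiar with the NIO2 calls involved, here is a minimal sketch of the two opening styles described above.  It is not the code from the directories themselves: the class and method names (Nio2OpenSketch, mapWholeFile, readAt) are made up for illustration, and it assumes FileChannel.open and AsynchronousFileChannel.open are the relevant JDK7 entry points.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.MappedByteBuffer;
import java.nio.channels.AsynchronousFileChannel;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.concurrent.ExecutionException;

// Hypothetical illustration class; not part of Lucene.
public class Nio2OpenSketch {

    // Open the file through NIO2 (no RandomAccessFile) and map it read-only.
    // A single mapping is limited to 2 GB, so this only suits small files.
    static MappedByteBuffer mapWholeFile(Path path) throws IOException {
        try (FileChannel channel = FileChannel.open(path, StandardOpenOption.READ)) {
            // The mapping stays valid after the channel is closed.
            return channel.map(FileChannel.MapMode.READ_ONLY, 0, channel.size());
        }
    }

    // Positional read on an AsynchronousFileChannel, made synchronous by
    // waiting on the returned Future. No shared file position is involved.
    static int readAt(AsynchronousFileChannel channel, ByteBuffer dst, long position)
            throws IOException {
        try {
            return channel.read(dst, position).get();
        } catch (InterruptedException e) {
            // Restore the interrupt flag and report the failed wait.
            Thread.currentThread().interrupt();
            throw new IOException("interrupted while waiting for read", e);
        } catch (ExecutionException e) {
            throw new IOException(e.getCause());
        }
    }

    public static void main(String[] args) throws IOException {
        Path path = Files.createTempFile("nio2-sketch", ".bin");
        path.toFile().deleteOnExit();
        Files.write(path, new byte[] {1, 2, 3, 4, 5});

        MappedByteBuffer mapped = mapWholeFile(path);
        System.out.println("mapped byte at offset 2: " + mapped.get(2));

        try (AsynchronousFileChannel channel =
                 AsynchronousFileChannel.open(path, StandardOpenOption.READ)) {
            ByteBuffer buf = ByteBuffer.allocate(2);
            int n = readAt(channel, buf, 3);
            System.out.println("read " + n + " bytes starting at offset 3");
        }
    }
}
```

Because every read in readAt carries its own position, concurrent readers would not need to synchronize on a shared file pointer, which is the property item 3 relies on.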

A couple of questions:
1.)  Is there any interest in me contributing these to Lucene?  They
require JDK7+, but perhaps they could go in a contrib module?
2.)  While implementing these I noticed the implementation of
FSDirectory.sync seems a little strange:  It just opens a new
RandomAccessFile and forces a sync using it.  The JavaDocs seem to imply
that this would force a sync on the file handle associated with the
RandomAccessFile, but that's not the file handle that was written to as
part of an IndexOutput.  On Windows at least this won't matter, but it
seems theoretically wrong: if I am understanding the JavaDoc correctly, on
some platforms this style of operation could have no effect.  It seems like
maybe it would be better to have a sync() call on an IndexOutput that can
be called before closing it.  Am I missing something here?
3.)  What is the best way to go about benchmarking/testing these new
implementations to compare against the core FSDirectory implementations?
I've seen some references to randomized tests and benchmarks on the
developer pages on the Lucene website, but I didn't see anything that was
along the lines of "Here's how to run the benchmarks"...any pointers would
be much appreciated.
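
To make question 2 concrete, here is a small sketch contrasting the two styles of sync.  It is only an illustration: the class and method names are hypothetical and it does not reproduce FSDirectory's actual code, just the pattern described above (reopening the file and syncing the new handle) next to a sync on the handle that performed the writes.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical illustration class; not part of Lucene.
public class SyncSketch {

    // Style A: write and force on the same channel before closing it.
    static void writeAndForce(Path path, byte[] data) throws IOException {
        try (FileChannel channel = FileChannel.open(path,
                StandardOpenOption.CREATE,
                StandardOpenOption.WRITE,
                StandardOpenOption.TRUNCATE_EXISTING)) {
            ByteBuffer buf = ByteBuffer.wrap(data);
            while (buf.hasRemaining()) {
                channel.write(buf);
            }
            // Ask the OS to flush both content and metadata for this channel.
            channel.force(true);
        }
    }

    // Style B: the writer has already closed its handle; open a fresh
    // RandomAccessFile and sync that new descriptor instead.
    static void reopenAndSync(File file) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(file, "rw")) {
            raf.getFD().sync();
        }
    }

    public static void main(String[] args) throws IOException {
        Path path = Files.createTempFile("sync-sketch", ".bin");
        path.toFile().deleteOnExit();

        writeAndForce(path, new byte[] {1, 2, 3});
        reopenAndSync(path.toFile());

        System.out.println("file size after both syncs: " + Files.size(path));
    }
}
```

Whether Style B actually flushes data written through the earlier, now-closed handle is exactly the point I am unsure about; Style A avoids the question by syncing before close.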

Mike Poindexter
