lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From petite_abeille <>
Subject Re: index: how to store binary data or objects ?
Date Tue, 10 Feb 2004 12:35:00 GMT

On Feb 10, 2004, at 09:32, Andrzej Bialecki wrote:

> Just a comment: for ext2fs and BSD FFS (dunno about NT) scalability 
> issues with this approach can be partially addressed by building a 
> tree of subdirectories, instead of using just one. I.e. a file named 
> "myThesis.pdf" would go into /m/y/t/myThesis.pdf. This way the time 
> needed to list the files in a given directory is reduced (both unixes 
> can already cache the inode numbers for name/inode lookup, so there is 
> no significant time increase to lookup a longer path).

Yes. But you have to watch out for overall path length limit though. An 
alternative strategy is to hash your keys and store that as the 
directory path. This what some browsers do to store their cache.

> FreeBSD also has a special kind of filesystem, which uses inodes in a 
> flat space (no directories). It was specifically designed for storing 
> large numbers of files efficiently. Recent versions of Java on FreeBSD 
> (1.4.2) seem to be very stable and performing well, so that could also 
> be an option.

Yes. The file system can be used to simulate a fairly reasonable 
database. The problem then is not so much the time it takes to look up 
those files, but rather opening, reading, and closing them.

> After all, a filesystem _is_ a kind of very specialized database... ;-)

This is true and works quite well to a certain extend.

But it suffers from one major flaw in my experience: you run out of 
file descriptors very quickly. And lets face it, it's quite slow also 



To unsubscribe, e-mail:
For additional commands, e-mail:

View raw message