lucene-dev mailing list archives

From "Robert Muir (JIRA)" <>
Subject [jira] [Commented] (LUCENE-6241) don't filter subdirectories in listAll()
Date Thu, 12 Feb 2015 15:43:12 GMT


Robert Muir commented on LUCENE-6241:

Another option would be to change the constructor signature from RAMDirectory(Directory other)
to RAMDirectory(FSDirectory other). FSDirectory already has 'Path getDirectory()', so RAMDirectory
could use that path directly to ignore subdirectories.
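
A minimal sketch of that idea, using only java.nio so it runs without Lucene on the classpath. The class and method names (RegularFileLister, listNonDirectories) are hypothetical and not part of Lucene. The assumption is that a RAMDirectory(FSDirectory other) constructor could pass other.getDirectory() to something like this and copy only the names it returns:

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical helper for illustration; not part of Lucene. */
final class RegularFileLister {
  private RegularFileLister() {}

  /**
   * Returns the sorted names of the entries directly inside dir that are
   * not directories. The Path argument stands in for whatever
   * FSDirectory.getDirectory() would return.
   */
  static List<String> listNonDirectories(Path dir) throws IOException {
    List<String> names = new ArrayList<>();
    try (DirectoryStream<Path> stream = Files.newDirectoryStream(dir)) {
      for (Path entry : stream) {
        // One metadata lookup per entry. Only the copying code pays this
        // cost, instead of every caller of listAll().
        if (!Files.isDirectory(entry)) {
          names.add(entry.getFileName().toString());
        }
      }
    }
    Collections.sort(names);
    return names;
  }
}
```

With this shape the filtering lives only in the one caller that needs it, which is the point of narrowing the constructor argument to FSDirectory.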

> don't filter subdirectories in listAll()
> ----------------------------------------
>                 Key: LUCENE-6241
>                 URL:
>             Project: Lucene - Core
>          Issue Type: Bug
>            Reporter: Robert Muir
>         Attachments: LUCENE-6241.patch, LUCENE-6241.patch
> The issue is, today filtering subdirectories means listAll() is always slow, sometimes MUCH slower, because it must do the fstat()-equivalent of each file to check if it's a directory to filter it out.
> When I benchmarked this on a fast filesystem, doing all these filesystem metadata calls only makes listAll() 2.6x slower, but on a non-SSD with slower I/O, it can be more than 60x slower.
> Lucene doesn't make subdirectories, so hiding these for abuse cases just makes real use cases slower.
> To add insult to injury, most code (e.g. all of Lucene except for where RAMDir copies from an FSDir) does not actually care if extraneous files are directories or not.
> Finally, it sucks that the name is listAll() when it is doing anything but that.
> I really hate to add a method here to deal with this abusive stuff, but I'd rather add isDirectory(String) for the rare code that wants to filter them out, than just let stuff always be slow.
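
The trade-off described in the issue can be sketched as follows. This is an illustration under stated assumptions, not the attached patch: FlatDirectoryListing is a hypothetical class, and the isDirectory(String) signature is only guessed from the name mentioned above. listAll() returns every entry without touching per-file metadata, and the metadata lookup happens only when a caller asks about a specific name.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical stand-in for a filesystem-backed directory; not Lucene code. */
final class FlatDirectoryListing {
  private final Path root;

  FlatDirectoryListing(Path root) {
    this.root = root;
  }

  /** Lists every entry name, subdirectories included, with no per-entry stat. */
  String[] listAll() throws IOException {
    List<String> names = new ArrayList<>();
    try (DirectoryStream<Path> stream = Files.newDirectoryStream(root)) {
      for (Path entry : stream) {
        names.add(entry.getFileName().toString());
      }
    }
    String[] result = names.toArray(new String[0]);
    Arrays.sort(result);
    return result;
  }

  /** Opt-in check for the rare caller that wants to filter out directories. */
  boolean isDirectory(String name) {
    return Files.isDirectory(root.resolve(name));
  }
}
```

A caller that needs only regular files would loop over listAll() and skip names for which isDirectory(name) returns true. Everyone else avoids the extra filesystem calls.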

This message was sent by Atlassian JIRA
