commons-issues mailing list archives

From "Vincent Bouscasse (JIRA)" <j...@apache.org>
Subject [jira] Commented: (IO-170) Scalable Iterator for files, better than FileUtils.iterateFiles
Date Mon, 30 Nov 2009 13:23:21 GMT

    [ https://issues.apache.org/jira/browse/IO-170?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12783648#action_12783648 ]

Vincent Bouscasse commented on IO-170:
--------------------------------------

Hi Matthew,

I'm not sure I got the point, but I think Damian had in mind an iterator that allows
results to be processed on the fly, as soon as they are available. The code in your patch does
not allow this: we have to wait until the result files have been found before the first File
object can be returned by iterator.next(). That can take a long time when searching large
directory trees.

I've written a recursive Iterator<File> implementation that returns the first matches
as soon as they are discovered. The next match is computed in the hasNext() method, and the
implementation uses linked lists to store matches and subdirectories. The total iteration time
is the same as with the current implementation, but the first results are available sooner. A
typical use of this iterator is in a producer thread, with the file processing done in a
consumer thread, which speeds up the overall file processing.

I can provide a code sample if it would suit your needs.
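
To make the idea concrete, here is a minimal sketch of such an iterator. It is not the actual
implementation: the class name LazyFileIterator and the use of java.io.FileFilter for matching
are assumptions made for illustration only. It reads one directory per step inside hasNext()
and keeps matches and pending subdirectories in linked lists, as described above.

```java
import java.io.File;
import java.io.FileFilter;
import java.util.Iterator;
import java.util.LinkedList;
import java.util.NoSuchElementException;

/**
 * Illustrative sketch only (hypothetical class name): walks a directory
 * tree lazily, so the first matches are available before the whole tree
 * has been scanned.
 */
class LazyFileIterator implements Iterator<File> {

    private final FileFilter filter;
    // Directories discovered but not yet listed.
    private final LinkedList<File> pendingDirs = new LinkedList<File>();
    // Matching files found but not yet returned by next().
    private final LinkedList<File> matches = new LinkedList<File>();

    LazyFileIterator(File root, FileFilter filter) {
        this.filter = filter;
        this.pendingDirs.add(root);
    }

    /** Lists directories one at a time until a match is buffered or none remain. */
    public boolean hasNext() {
        while (matches.isEmpty() && !pendingDirs.isEmpty()) {
            File dir = pendingDirs.removeFirst();
            File[] children = dir.listFiles();
            if (children == null) {
                continue; // not a directory, or it could not be read
            }
            for (File child : children) {
                if (child.isDirectory()) {
                    pendingDirs.addLast(child);
                } else if (filter.accept(child)) {
                    matches.addLast(child);
                }
            }
        }
        return !matches.isEmpty();
    }

    public File next() {
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        return matches.removeFirst();
    }

    public void remove() {
        throw new UnsupportedOperationException();
    }
}
```

Memory use is bounded by the matches of the directories listed so far plus the queue of
pending subdirectories, rather than by the full result list. In the producer/consumer setup,
the producer thread would simply loop over hasNext()/next() and hand each File to the consumer.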

Best regards.




> Scalable Iterator for files, better than FileUtils.iterateFiles
> ---------------------------------------------------------------
>
>                 Key: IO-170
>                 URL: https://issues.apache.org/jira/browse/IO-170
>             Project: Commons IO
>          Issue Type: Improvement
>          Components: Utilities
>    Affects Versions: 1.4
>         Environment: generic file systems
>            Reporter: Damian Noseda
>            Priority: Minor
>             Fix For: 2.x
>
>         Attachments: real_iterators.patch
>
>   Original Estimate: 5h
>  Remaining Estimate: 5h
>
> Improve the way iterateFiles generates an iterator. The current approach does not scale:
> it adds all files to a list and then returns the iterator of that list. A better way
> would be to create a custom Iterator<File> with a stack of arrays of File to go
> up and down the directory tree.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

