hadoop-common-issues mailing list archives

From GitBox <...@apache.org>
Subject [GitHub] [hadoop] steveloughran commented on pull request #2354: HADOOP-17281 Implement FileSystem.listStatusIterator() in S3AFileSystem
Date Thu, 01 Oct 2020 20:12:28 GMT

steveloughran commented on pull request #2354:
URL: https://github.com/apache/hadoop/pull/2354#issuecomment-702371522


   Looks good. It's annoying that the return types force you to do that wrapping/casting.
Can't you just forcibly cast the return type of the inner iterator? After all, type erasure
means the generic type info will be lost in the actual compiled binary. I'd prefer that, as it will
give you automatic passthrough of the IOStatistics stuff.
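
A minimal sketch of the forced cast suggested above. Everything here is a stand-in: the local `RemoteIterator` interface only mirrors the two methods of `org.apache.hadoop.fs.RemoteIterator`, and `Status`, `S3Status`, `IteratorCasts`, `upcast` and `listing` are hypothetical names used for illustration, not code from the PR. A direct cast between the two parameterizations does not compile, so the sketch goes through a wildcard type.

```java
import java.io.IOException;
import java.util.Arrays;
import java.util.Iterator;

// Stand-in for org.apache.hadoop.fs.RemoteIterator (same two methods).
interface RemoteIterator<E> {
  boolean hasNext() throws IOException;
  E next() throws IOException;
}

// Hypothetical stand-ins for FileStatus and its S3A subclass.
class Status {
  final String path;
  Status(String path) { this.path = path; }
}

class S3Status extends Status {
  S3Status(String path) { super(path); }
}

final class IteratorCasts {
  private IteratorCasts() {}

  // Unchecked cast from an iterator of a subtype to an iterator of T.
  // Tolerable here because callers only read elements from the iterator,
  // and every element really is a T. No wrapper object is created, so
  // any extra interfaces implemented by the source stay reachable.
  @SuppressWarnings("unchecked")
  static <T> RemoteIterator<T> upcast(RemoteIterator<? extends T> source) {
    return (RemoteIterator<T>) source;
  }

  // Hypothetical inner listing that yields the subtype.
  static RemoteIterator<S3Status> listing(String... paths) {
    Iterator<String> it = Arrays.asList(paths).iterator();
    return new RemoteIterator<S3Status>() {
      @Override public boolean hasNext() { return it.hasNext(); }
      @Override public S3Status next() { return new S3Status(it.next()); }
    };
  }
}
```

Usage would look like `RemoteIterator<Status> out = IteratorCasts.upcast(IteratorCasts.listing("a", "b"));`. Since `out` is the very same object as the inner iterator, an `instanceof` check for a statistics interface on it would see whatever the inner iterator implements, which is the passthrough argument made above.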
   
   Add text to filesystem.md, something which: 
   
   * specifies that the result is exactly the same as listStatus, provided no other caller updates
the directory during the list
   * declares that it's not atomic and that performant implementations will page
   * states that if a path isn't there, that fact may not surface until next/hasNext; that is,
we do lazy eval for all file IO
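
To illustrate the lazy-evaluation point in the last bullet, here is a small sketch under stated assumptions: the `Map` is a hypothetical in-memory stand-in for the store, `LazyListing` is an invented name, and the local `RemoteIterator` interface only mirrors `org.apache.hadoop.fs.RemoteIterator`. It is not how S3A implements its listing.

```java
import java.io.FileNotFoundException;
import java.io.IOException;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

// Stand-in for org.apache.hadoop.fs.RemoteIterator (same two methods).
interface RemoteIterator<E> {
  boolean hasNext() throws IOException;
  E next() throws IOException;
}

// Hypothetical lazy listing: the constructor does no IO at all.
final class LazyListing implements RemoteIterator<String> {
  private final Map<String, List<String>> store; // directory -> children
  private final String dir;
  private Iterator<String> page; // null until the first probe

  LazyListing(Map<String, List<String>> store, String dir) {
    this.store = store;
    this.dir = dir;
  }

  // The first call does the lookup, so a missing path surfaces here
  // rather than when the iterator was created.
  private Iterator<String> page() throws IOException {
    if (page == null) {
      List<String> children = store.get(dir);
      if (children == null) {
        throw new FileNotFoundException(dir);
      }
      page = children.iterator();
    }
    return page;
  }

  @Override public boolean hasNext() throws IOException {
    return page().hasNext();
  }

  @Override public String next() throws IOException {
    return page().next();
  }
}
```

With this shape, `new LazyListing(store, "/missing")` succeeds and the `FileNotFoundException` only appears on the first `hasNext()` or `next()` call, which is the behaviour the documentation would need to warn callers about.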
   
   
   We need similar new contract tests in AbstractContractGetFileStatusTest for all to use:
   
   * that in a dir with files and subdirectories, you get both returned in the listing
   * that you can iterate through with next() to failure as well as hasNext/next, and get
the same results
   * listStatusIterator(file) returns the file
   * listStatusIterator("/") gives you a listing of root (put that in AbstractContractRootDirectoryTest)
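
The second check in the list above (iterating with next() to failure versus hasNext/next) could be sketched roughly as follows. This is not the contract test itself: `IterationChecks`, `drainWithHasNext`, `drainWithNextOnly` and `listingOf` are hypothetical helpers, and the local `RemoteIterator` interface stands in for `org.apache.hadoop.fs.RemoteIterator`, whose `next()` is documented to throw `NoSuchElementException` when the listing is exhausted.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

// Stand-in for org.apache.hadoop.fs.RemoteIterator (same two methods).
interface RemoteIterator<E> {
  boolean hasNext() throws IOException;
  E next() throws IOException;
}

final class IterationChecks {
  private IterationChecks() {}

  // Conventional loop: probe with hasNext() before each next().
  static <T> List<T> drainWithHasNext(RemoteIterator<T> it) throws IOException {
    List<T> out = new ArrayList<>();
    while (it.hasNext()) {
      out.add(it.next());
    }
    return out;
  }

  // Never calls hasNext(); keeps calling next() until it fails.
  static <T> List<T> drainWithNextOnly(RemoteIterator<T> it) throws IOException {
    List<T> out = new ArrayList<>();
    try {
      while (true) {
        out.add(it.next());
      }
    } catch (NoSuchElementException expected) {
      // end of listing reached
    }
    return out;
  }

  // Hypothetical in-memory listing used in place of a real filesystem.
  static RemoteIterator<String> listingOf(List<String> entries) {
    Iterator<String> it = entries.iterator();
    return new RemoteIterator<String>() {
      @Override public boolean hasNext() { return it.hasNext(); }
      @Override public String next() { return it.next(); }
    };
  }
}
```

A test would then list the same directory twice, drain one iterator with each helper, and assert that the two resulting lists are equal.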
   
   And two for changes partway through the iteration:
   
   * change the directory during a list to add/delete files
   * delete the actual path
   
   These tests can't assert on what will happen, and with paged IO they aren't likely to pick up
on changes; they're there just to show it can be done and to pick up on any major issues with implementations.
   
   
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: common-issues-help@hadoop.apache.org

