hadoop-common-issues mailing list archives

From "ASF GitHub Bot (Jira)" <j...@apache.org>
Subject [jira] [Work logged] (HADOOP-17281) Implement FileSystem.listStatusIterator() in S3AFileSystem
Date Thu, 01 Oct 2020 06:57:00 GMT

     [ https://issues.apache.org/jira/browse/HADOOP-17281?focusedWorklogId=493316&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-493316 ]

ASF GitHub Bot logged work on HADOOP-17281:
-------------------------------------------

                Author: ASF GitHub Bot
            Created on: 01/Oct/20 06:56
            Start Date: 01/Oct/20 06:56
    Worklog Time Spent: 10m 
      Work Description: mukund-thakur opened a new pull request #2354:
URL: https://github.com/apache/hadoop/pull/2354


   Ran the new test against a bucket in ap-south-1.
   
   Output:
   `(ContractTestUtils.java:end(1847)) - Duration of listing 1000 files using listFiles() api with batch size of 10 including 10ms of processing time for each file: 12,223,848,028 nS
   2020-10-01 12:19:28,811 [JUnit-testMultiPagesListingPerformanceAndCorrectness] INFO  contract.ContractTestUtils (ContractTestUtils.java:end(1847)) - Duration of listing 1000 files using listStatus() api with batch size of 10 including 10ms of processing time for each file: 15,988,037,357 nS
   2020-10-01 12:19:41,050 [JUnit-testMultiPagesListingPerformanceAndCorrectness] INFO  contract.ContractTestUtils (ContractTestUtils.java:end(1847)) - Duration of listing 1000 files using listStatusIterator() api with batch size of 10 including 10ms of processing time for each file: 12,214,813,052 nS`
   
   The logs show that listStatusIterator() and listFiles() take about the same time (roughly 12.2 s each), and both are faster than listStatus() (roughly 16.0 s).
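
   To put the three durations side by side, here is a small worked calculation in Java. It is only a sketch: the constants are copied from the log output above, while the class name `ListingTimings`, the "processing floor" (1000 files x 10 ms, assuming files are processed one after another), and the even split of the saving across pages are assumptions made for illustration and are not stated in the test output.

```java
// Hypothetical helper for reading the logged numbers; not part of the PR.
final class ListingTimings {
    // Durations copied from the log output, in nanoseconds.
    static final long LIST_FILES_NS = 12_223_848_028L;
    static final long LIST_STATUS_NS = 15_988_037_357L;
    static final long LIST_STATUS_ITERATOR_NS = 12_214_813_052L;

    // Test parameters stated in the log lines.
    static final int FILES = 1000;
    static final int BATCH_SIZE = 10;
    static final long PROCESSING_NS_PER_FILE = 10_000_000L; // 10 ms

    /** Assumed lower bound: per-file processing done strictly sequentially. */
    static long processingFloorNs() {
        return FILES * PROCESSING_NS_PER_FILE;
    }

    /** Number of listing pages needed at the given batch size. */
    static int pages() {
        return FILES / BATCH_SIZE;
    }

    /** Time saved by listStatusIterator() relative to listStatus(). */
    static long savedNs() {
        return LIST_STATUS_NS - LIST_STATUS_ITERATOR_NS;
    }

    public static void main(String[] args) {
        System.out.printf("processing floor:        %,d ns%n", processingFloorNs());
        System.out.printf("listStatus() overhead:   %,d ns%n", LIST_STATUS_NS - processingFloorNs());
        System.out.printf("iterator overhead:       %,d ns%n", LIST_STATUS_ITERATOR_NS - processingFloorNs());
        System.out.printf("saved vs listStatus():   %,d ns%n", savedNs());
        // Assumption: the saving is spread evenly over all pages.
        System.out.printf("saved per page (approx): %,d ns%n", savedNs() / pages());
    }
}
```

   Under those assumptions the difference between listStatus() and listStatusIterator() is 3,773,224,305 ns, or about 37.7 ms for each of the 100 pages. That is consistent with listing time being overlapped with per-file processing, although the logs alone do not prove where the saving comes from.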




Issue Time Tracking
-------------------

            Worklog Id:     (was: 493316)
    Remaining Estimate: 0h
            Time Spent: 10m

> Implement FileSystem.listStatusIterator() in S3AFileSystem
> ----------------------------------------------------------
>
>                 Key: HADOOP-17281
>                 URL: https://issues.apache.org/jira/browse/HADOOP-17281
>             Project: Hadoop Common
>          Issue Type: Task
>          Components: fs/s3
>    Affects Versions: 3.3.0
>            Reporter: Mukund Thakur
>            Assignee: Mukund Thakur
>            Priority: Major
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> Currently S3AFileSystem only implements the listStatus() API, which returns an array. Once
> we implement listStatusIterator(), clients can benefit from the recently added async listing
> (https://issues.apache.org/jira/browse/HADOOP-17074) by performing some tasks on files
> while iterating over them.
>  
> CC [~stevel]
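
The idea in the description, doing work on files while the next page of the listing is still being fetched, can be sketched with plain JDK classes. This is not the S3A implementation and uses no Hadoop APIs: `PrefetchingListing`, `fakeStore`, and the page-fetcher function are hypothetical names introduced only to illustrate an iterator that requests the next page in the background while the caller consumes the current one.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.concurrent.CompletableFuture;
import java.util.function.IntFunction;

/** Hypothetical paged iterator that prefetches the next page asynchronously. */
final class PrefetchingListing implements Iterator<String> {
    // Maps a page index to that page's entries; an empty list means "no more pages".
    private final IntFunction<List<String>> pageFetcher;
    private int nextPage = 1;
    private Iterator<String> current;
    private CompletableFuture<List<String>> pending; // null once the listing is exhausted

    PrefetchingListing(IntFunction<List<String>> pageFetcher) {
        this.pageFetcher = pageFetcher;
        List<String> first = pageFetcher.apply(0); // first page is fetched synchronously
        this.current = first.iterator();
        if (!first.isEmpty()) {
            prefetch();
        }
    }

    /** Starts fetching the next page on a background thread. */
    private void prefetch() {
        final int page = nextPage++;
        pending = CompletableFuture.supplyAsync(() -> pageFetcher.apply(page));
    }

    @Override
    public boolean hasNext() {
        // Current page drained: wait for the prefetched page, then start the next fetch.
        while (!current.hasNext() && pending != null) {
            List<String> page = pending.join();
            if (page.isEmpty()) {
                pending = null;
            } else {
                current = page.iterator();
                prefetch();
            }
        }
        return current.hasNext();
    }

    @Override
    public String next() {
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        return current.next();
    }

    /** Demo fetcher: `total` synthetic names served in pages of `pageSize`. */
    static IntFunction<List<String>> fakeStore(int total, int pageSize) {
        return page -> {
            List<String> out = new ArrayList<>();
            int start = page * pageSize;
            for (int i = start; i < Math.min(start + pageSize, total); i++) {
                out.add("file-" + i);
            }
            return out;
        };
    }
}
```

A caller would loop with `hasNext()` and `next()` and do its per-file work inside the loop, so the background fetch of the next page runs while that work is in progress. An array-returning call, by contrast, cannot hand back any entry until every page has been fetched.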


