hadoop-common-issues mailing list archives

From GitBox <...@apache.org>
Subject [GitHub] [hadoop] steveloughran commented on issue #157: HADOOP-13600. S3a rename() to copy files in a directory in parallel
Date Fri, 26 Apr 2019 19:10:14 GMT
steveloughran commented on issue #157: HADOOP-13600. S3a rename() to copy files in a directory
in parallel
URL: https://github.com/apache/hadoop/pull/157#issuecomment-487167798
 
 
   @harshavardhana I'm doing work on parallel renames in HADOOP-15183; the complication there
is that I want to update the S3Guard metastore after each copy and then maybe do the delete
immediately after too, or, as discussed on that JIRA, postpone all the deletes until every
copy completes.
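
   A minimal sketch of the "postpone all the deletes" option described above, not the code in
either patch. `Store`, `Metastore`, `renameDir` and their methods are hypothetical stand-ins
invented for illustration; they are not the S3A or S3Guard APIs. Copies run in parallel, the
metastore is updated after each copy, and source deletes start only once every copy succeeds.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.stream.Collectors;

public class ParallelRenameSketch {

    /** Hypothetical stand-in for the object store; not the real S3A client API. */
    interface Store {
        void copy(String srcKey, String dstKey);
        void delete(String key);
    }

    /** Hypothetical stand-in for the S3Guard metastore. */
    interface Metastore {
        void recordCopied(String srcKey, String dstKey);
        void recordDeleted(String key);
    }

    /**
     * Copies every key under srcPrefix to dstPrefix in parallel, then deletes
     * the sources. If any copy fails, join() throws and no source is deleted.
     */
    static void renameDir(Store store, Metastore metastore, List<String> keys,
                          String srcPrefix, String dstPrefix, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<CompletableFuture<Void>> copies = keys.stream()
                .map(src -> CompletableFuture.runAsync(() -> {
                    String dst = dstPrefix + src.substring(srcPrefix.length());
                    store.copy(src, dst);
                    // Update the metastore as soon as this copy has completed.
                    metastore.recordCopied(src, dst);
                }, pool))
                .collect(Collectors.toList());

            // Wait for every copy; a failed copy surfaces here as a CompletionException.
            CompletableFuture.allOf(copies.toArray(new CompletableFuture[0])).join();

            // Deletes are postponed until all copies have succeeded.
            for (String src : keys) {
                store.delete(src);
                metastore.recordDeleted(src);
            }
        } finally {
            pool.shutdown();
        }
    }
}
```

   The alternative raised in the comment, deleting each source immediately after its own copy,
would move the delete into the per-file task; the trade-off is that a mid-operation failure
then leaves the directory split between source and destination.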
   
   This is driven mostly by the need to have S3Guard fully resilient to failures during the
operation; speedup at scale is also on that todo list, though.
   
   I need to review @sahilTakiar's patch here to see what I can lift from it; it's clear that
Sahil was way ahead of the rest of us in using Java 8 idioms we're only now picking up. I
think the ability to abort ongoing uploads is something we'd want in order to handle a failure,
and some of the progress callbacks here seem to set us up for it.
   
   PR #654 is the work there.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: common-issues-help@hadoop.apache.org

