hadoop-common-issues mailing list archives

From GitBox <...@apache.org>
Subject [GitHub] [hadoop] noslowerdna commented on a change in pull request #666: HADOOP-16221 add option to fail operation on metadata write failure
Date Mon, 01 Apr 2019 21:51:42 GMT
noslowerdna commented on a change in pull request #666: HADOOP-16221 add option to fail operation
on metadata write failure
URL: https://github.com/apache/hadoop/pull/666#discussion_r271064874
 
 

 ##########
 File path: hadoop-tools/hadoop-aws/src/site/markdown/tools/hadoop-aws/s3guard.md
 ##########
 @@ -183,6 +186,40 @@ removed on `S3AFileSystem` level.
 </property>
 ```
 
+#### Fail on Error
+
+By default, S3AFileSystem write operations will still succeed when updates to
+S3Guard metadata fail. S3AFileSystem first writes the file to S3 and then
+updates the metadata in S3Guard. If the metadata write fails, an error is
+logged, but the overall write operation returns successfully. The file in
+S3 **is not** rolled back.
+
+This is somewhat dangerous as it could result in the type of issue S3Guard is
+designed to avoid.  For example, a reader may see an inconsistent listing after
+a recent write since S3Guard may not contain metadata about the recently
+written file due to a metadata write error.
+
+This behavior can be changed by setting the following configuration:
+
+```xml
+<property>
+    <name>fs.s3a.metadatastore.fail.on.write.error</name>
+    <value>true</value>
+</property>
+```
+
+When set to true, a failure to save the metadata will fail the overall write
+operation with `MetadataPersistenceException`. As with the default setting,
+the new/updated file is still in S3 and **is not** rolled back. The S3Guard
+metadata may (is likely to) be out of sync.
+
+The S3Guard metadata for the given file can be corrected with a command like
 
 Review comment:
  Consider prefixing this with something like "If the write operation cannot be programmatically
retried, ...", since retrying the write would probably be the preferred remedy for this exception.
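
  As a rough illustration of what "programmatically retried" could look like on the caller's side, here is a minimal sketch. It assumes the exception is a checked `IOException` subclass; the `MetadataPersistenceException` class below is a local stand-in so the example compiles on its own, not the real Hadoop class, and `writeWithRetry` and `maxAttempts` are hypothetical names. A real caller would wrap the actual S3A write in the `Callable`.

```java
import java.io.IOException;
import java.util.concurrent.Callable;

public class RetryOnMetadataFailure {

    /** Local stand-in for the exception named in the docs; NOT the real Hadoop class. */
    static class MetadataPersistenceException extends IOException {
        MetadataPersistenceException(String message) {
            super(message);
        }
    }

    /**
     * Runs the write up to maxAttempts times. Only a metadata persistence
     * failure triggers a retry; any other exception propagates immediately.
     * (Hypothetical helper, not part of any Hadoop API.)
     */
    static <T> T writeWithRetry(Callable<T> write, int maxAttempts) throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        MetadataPersistenceException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return write.call();
            } catch (MetadataPersistenceException e) {
                // The data may already be in S3; repeating the write gives the
                // metadata update another chance to succeed.
                last = e;
            }
        }
        // Every attempt failed: surface the last failure to the caller.
        throw last;
    }

    public static void main(String[] args) throws Exception {
        int[] calls = {0};
        // Simulated write: the metadata update fails twice, then succeeds.
        String result = writeWithRetry(() -> {
            if (++calls[0] < 3) {
                throw new MetadataPersistenceException("simulated metadata write failure");
            }
            return "written";
        }, 5);
        System.out.println(result + " after " + calls[0] + " attempts");
    }
}
```

  If the retries are exhausted, the caller is back in the situation the doc describes: the file is in S3 but the metadata may be out of sync, so the manual correction step still applies.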

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services
