hadoop-common-issues mailing list archives

From "Gabor Bota (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (HADOOP-16423) S3Guard fsck: Check metadata consistency from S3 to metadatastore (log)
Date Tue, 06 Aug 2019 11:11:00 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-16423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16900897#comment-16900897 ]

Gabor Bota edited comment on HADOOP-16423 at 8/6/19 11:10 AM:
--------------------------------------------------------------

Based on an offline discussion, here is what is coming up:
 * Severity is added to the violation type enums (3 levels, defined in the enum).
 ** E.g. an ETag mismatch is defined as serious, but a missing ETag is defined as a warning.
 * The version ID check is removed: to get the version ID of an object we would need to
issue a HEAD request for that object. We walk the tree using S3 directory listings, which is
far more efficient in terms of the number of requests, since we issue only one request per
directory and only for directories. Checking the version ID would require a request for every
single object, so the check is removed altogether for now.
 * Error messages will stay in the handlers instead of being added to the enums: when writing
the log message we need access to the pair (the FileStatus from both the MS and S3) to log
where the error is, and we need to log parts of the FileStatus to show e.g. a mismatch.
 * Added the AUTHORITATIVE_DIRECTORY_CONTENT_MISMATCH violation.
 * Teardown in itests: close rawfs.
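
The split described above (severity lives in the enum, the message is built in the handler from the S3/MS pair) could look roughly like the sketch below. This is only an illustration under assumptions, not the code in the patch: the class name, the `Status` stand-in for FileStatus, the `INFO` level, the `NO_ETAG` constant name and the severity given to AUTHORITATIVE_DIRECTORY_CONTENT_MISMATCH are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative sketch only; these names are not taken from the actual patch. */
final class FsckViolationSketch {

    /** Three severity levels. SERIOUS and WARN echo the comment; INFO is assumed. */
    enum Severity { SERIOUS, WARN, INFO }

    /** Each violation type carries only its severity, not its message. */
    enum Violation {
        ETAG_MISMATCH(Severity.SERIOUS),
        NO_ETAG(Severity.WARN),
        // The severity of this one is an assumption made for the sketch.
        AUTHORITATIVE_DIRECTORY_CONTENT_MISMATCH(Severity.SERIOUS);

        final Severity severity;

        Violation(Severity severity) {
            this.severity = severity;
        }
    }

    /** Minimal stand-in for a FileStatus: a path and an optional ETag. */
    static final class Status {
        final String path;
        final String etag; // null when no ETag is recorded

        Status(String path, String etag) {
            this.path = path;
            this.etag = etag;
        }
    }

    /** Compares the metadata store entry with the S3 entry for one path. */
    static List<Violation> checkEtag(Status s3, Status ms) {
        List<Violation> found = new ArrayList<>();
        if (ms.etag == null) {
            found.add(Violation.NO_ETAG);
        } else if (!ms.etag.equals(s3.etag)) {
            found.add(Violation.ETAG_MISMATCH);
        }
        return found;
    }

    /** The handler sees both statuses, so it can log the mismatching values. */
    static String handle(Violation v, Status s3, Status ms) {
        switch (v) {
            case ETAG_MISMATCH:
                return String.format("[%s] %s: ETag mismatch, S3=%s MS=%s",
                    v.severity, s3.path, s3.etag, ms.etag);
            case NO_ETAG:
                return String.format("[%s] %s: no ETag in the metadata store",
                    v.severity, ms.path);
            default:
                return String.format("[%s] %s: %s", v.severity, s3.path, v);
        }
    }

    public static void main(String[] args) {
        Status s3 = new Status("/data/part-0000", "abc123");
        Status ms = new Status("/data/part-0000", "def456");
        for (Violation v : checkEtag(s3, ms)) {
            System.out.println(handle(v, s3, ms));
        }
    }
}
```

Keeping the message out of the enum means the enum constant needs no format arguments, while the handler can still print the concrete values from both sides.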


was (Author: gabor.bota):
Based on an offline discussion, here is what is coming up:
 * Severity is added to the violation type enums (3 levels, defined in the enum).
 ** E.g. an ETag mismatch is defined as serious, but a missing ETag is defined as a warning.
 * The version ID check is removed: to get the version ID of an object we would need to
issue a HEAD request for that object. We walk the tree using S3 directory listings, which is
far more efficient in terms of the number of requests, since we issue only one request per
directory and only for directories. Checking the version ID would require a request for every
single object, so the check is removed altogether for now.
 * Error messages are added to the enums instead of the violation handler.
 * Added the AUTHORITATIVE_DIRECTORY_CONTENT_MISMATCH violation.
 * Teardown in itests: close rawfs.

> S3Guard fsck: Check metadata consistency from S3 to metadatastore (log)
> -----------------------------------------------------------------------
>
>                 Key: HADOOP-16423
>                 URL: https://issues.apache.org/jira/browse/HADOOP-16423
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs/s3
>    Affects Versions: 3.3.0
>            Reporter: Gabor Bota
>            Assignee: Gabor Bota
>            Priority: Major
>
> This part is only for logging the inconsistencies.
> This issue only covers the case where the walk is done in S3 and all metadata is compared
> to the MS.
> There will be no part where the walk is done in the MS and compared to S3.
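
As a rough illustration of the one-directional walk described in this issue, the sketch below walks a directory tree from the S3 side, issues one listing per directory, and only logs what it finds in the metadata store. Everything here is a hypothetical stand-in: `FakeS3` replaces the real S3 client, the metadata store is a plain map from path to an is-directory flag, and the log lines are invented for the example.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Illustrative sketch only; none of these types exist in Hadoop. */
final class S3ToMetadataStoreWalkSketch {

    /** One entry of a directory listing. */
    static final class Entry {
        final String path;
        final boolean isDirectory;

        Entry(String path, boolean isDirectory) {
            this.path = path;
            this.isDirectory = isDirectory;
        }
    }

    /** Stand-in for S3: maps a directory to its listing and counts list calls. */
    static final class FakeS3 {
        final Map<String, List<Entry>> listings = new HashMap<>();
        int listRequests = 0;

        List<Entry> list(String dir) {
            listRequests++;
            return listings.getOrDefault(dir, new ArrayList<>());
        }
    }

    /**
     * Walks from S3 to the metadata store (never the other way round) and
     * returns log lines for the inconsistencies; nothing is repaired.
     */
    static List<String> walk(FakeS3 s3, Map<String, Boolean> metadataStore, String root) {
        List<String> log = new ArrayList<>();
        Deque<String> queue = new ArrayDeque<>();
        queue.add(root);
        while (!queue.isEmpty()) {
            String dir = queue.poll();
            // One request per directory, regardless of how many objects it holds.
            for (Entry e : s3.list(dir)) {
                Boolean msIsDir = metadataStore.get(e.path);
                if (msIsDir == null) {
                    log.add("No metadata entry for " + e.path);
                } else if (msIsDir.booleanValue() != e.isDirectory) {
                    log.add("File type mismatch for " + e.path);
                }
                if (e.isDirectory) {
                    queue.add(e.path);
                }
            }
        }
        return log;
    }

    public static void main(String[] args) {
        FakeS3 s3 = new FakeS3();
        s3.listings.put("/", Arrays.asList(new Entry("/a", true), new Entry("/f1", false)));
        s3.listings.put("/a", Arrays.asList(new Entry("/a/f2", false), new Entry("/a/f3", false)));

        Map<String, Boolean> ms = new HashMap<>();
        ms.put("/a", true);
        ms.put("/f1", false);
        ms.put("/a/f2", true); // recorded as a directory in the MS: mismatch
        // "/a/f3" is deliberately missing from the MS

        for (String line : walk(s3, ms, "/")) {
            System.out.println(line);
        }
        System.out.println("list requests: " + s3.listRequests);
    }
}
```

In this toy tree there are two directories and three files, so the walk needs two listing calls, whereas a per-object check would need one call per file.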




--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: common-issues-help@hadoop.apache.org

