hadoop-common-issues mailing list archives

From "Chris Nauroth (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HADOOP-13946) Document how HDFS updates timestamps in the FS spec; compare with object stores
Date Tue, 03 Jan 2017 21:57:59 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-13946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15796303#comment-15796303 ]

Chris Nauroth commented on HADOOP-13946:

Sorry to come late to the review, but I would have liked to see a mention of how HDFS rename
updates the modification time of both the source and the destination folder (though not the
modification time of the renamed file itself).
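
One way to check this kind of rename behavior on a given filesystem is a small probe. The following is a minimal sketch that runs against the local filesystem through Python's standard library, not against HDFS; the function name `rename_mtime_changes` and the `src`, `dst`, and `data.txt` names are hypothetical, chosen only for illustration. It stamps the file and both directories with a fixed old modification time, performs the rename, and reports which modification times moved.

```python
import os
import tempfile


def rename_mtime_changes(old_time=1_000_000_000):
    """Rename a file between two directories; report whose mtime changed.

    Hypothetical probe for a local filesystem (not HDFS). Every path is
    first stamped with a fixed old mtime, so a change caused by the rename
    is visible regardless of the filesystem's timestamp granularity.
    """
    with tempfile.TemporaryDirectory() as root:
        src_dir = os.path.join(root, "src")
        dst_dir = os.path.join(root, "dst")
        os.mkdir(src_dir)
        os.mkdir(dst_dir)

        src_file = os.path.join(src_dir, "data.txt")
        dst_file = os.path.join(dst_dir, "data.txt")
        with open(src_file, "w") as f:
            f.write("payload")

        # Reset all three modification times to the same known value.
        for path in (src_file, src_dir, dst_dir):
            os.utime(path, (old_time, old_time))

        os.rename(src_file, dst_file)

        # True means the rename changed that path's modification time.
        return {
            "source_dir": os.stat(src_dir).st_mtime != old_time,
            "dest_dir": os.stat(dst_dir).st_mtime != old_time,
            "file": os.stat(dst_file).st_mtime != old_time,
        }


if __name__ == "__main__":
    print(rename_mtime_changes())
```

The result only describes the filesystem that backs the temporary directory. Running the equivalent steps through the {{FileSystem}} API against HDFS or an object store may give a different answer.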

Also, regarding this:
{quote}
Object stores have a significantly simpler view of time:
 + A file's modification time is always the same as its creation time.
{quote}

This makes it sound like this section covers all object stores, but the statement about modification
time is not universally true.  For example, on WASB, the {{FileStatus}} on read
is always populated with the last modified time field of the blob as reported by the Azure
Storage service.  I think any kind of modification of the blob will result in a change in
that value.  I specifically tested {{hadoop fs -chmod}} against WASB, and it updated the blob's
modification time, which is different from HDFS.  Out-of-band blob modifications directly
through the Azure Storage service, bypassing the {{FileSystem}} API, could be another source
of perceived changes in the last modification time.
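
The chmod observation can be checked on other filesystems with a similar probe. This is a minimal sketch, assuming a local filesystem reached through Python's standard library rather than WASB or HDFS; the function name `chmod_changes_mtime` and the file name `blob.bin` are hypothetical. It sets a known modification time, changes only the permissions, and reports whether the modification time moved.

```python
import os
import stat
import tempfile


def chmod_changes_mtime(old_time=1_000_000_000):
    """Return True if a permission-only change alters the file's mtime.

    Hypothetical probe for a local filesystem (not WASB or HDFS).
    """
    with tempfile.TemporaryDirectory() as root:
        path = os.path.join(root, "blob.bin")
        with open(path, "wb") as f:
            f.write(b"payload")

        # Stamp a known modification time before touching permissions.
        os.utime(path, (old_time, old_time))

        # Change permissions only; no data is written.
        os.chmod(path, stat.S_IRUSR | stat.S_IWUSR)

        return os.stat(path).st_mtime != old_time


if __name__ == "__main__":
    print(chmod_changes_mtime())
```

A store that reports True here behaves like the WASB case described above, where a permission change is visible as a new modification time.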

I expect this is not consistent across services, and therefore it's unlikely we can make accurate
statements in the file system spec beyond just saying "it's different."  :-)

Please feel free to address this either by reverting and revising or filing a new JIRA to
track an addendum.


> Document how HDFS updates timestamps in the FS spec; compare with object stores
> -------------------------------------------------------------------------------
>                 Key: HADOOP-13946
>                 URL: https://issues.apache.org/jira/browse/HADOOP-13946
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: documentation, fs
>    Affects Versions: 2.7.3
>            Reporter: Steve Loughran
>            Assignee: Steve Loughran
>            Priority: Minor
>             Fix For: 2.8.0, 3.0.0-alpha2
>         Attachments: HADOOP-13946-001.patch
> SPARK-17159 shows that the behavior of when HDFS updates timestamps isn't well documented.
> Document these in the FS spec.
> I'm not going to add tests for this, as it is so very dependent on FS implementations,
> as in "POSIX filesystems may behave differently from HDFS". If someone knows what happens
> there, their contribution is welcome.

