hadoop-common-issues mailing list archives

From "ASF GitHub Bot (Jira)" <j...@apache.org>
Subject [jira] [Work logged] (HADOOP-17597) Add option to downgrade S3A rejection of Syncable to warning
Date Fri, 23 Apr 2021 17:45:00 GMT

     [ https://issues.apache.org/jira/browse/HADOOP-17597?focusedWorklogId=588021&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-588021 ]

ASF GitHub Bot logged work on HADOOP-17597:
-------------------------------------------

                Author: ASF GitHub Bot
            Created on: 23/Apr/21 17:44
            Start Date: 23/Apr/21 17:44
    Worklog Time Spent: 10m 
      Work Description: steveloughran merged pull request #2801:
URL: https://github.com/apache/hadoop/pull/2801


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Issue Time Tracking
-------------------

    Worklog Id:     (was: 588021)
    Time Spent: 2h  (was: 1h 50m)

> Add option to downgrade S3A rejection of Syncable to warning
> ------------------------------------------------------------
>
>                 Key: HADOOP-17597
>                 URL: https://issues.apache.org/jira/browse/HADOOP-17597
>             Project: Hadoop Common
>          Issue Type: Bug
>    Affects Versions: 3.3.1
>            Reporter: Steve Loughran
>            Assignee: Steve Loughran
>            Priority: Minor
>              Labels: pull-request-available
>          Time Spent: 2h
>  Remaining Estimate: 0h
>
> The Hadoop Filesystem Syncable API is intended to meet the requirements laid out in [Stonebraker81]
_Operating System Support for Database Management_:
> bq.  The service required from an OS buffer manager is a selected force out which would
push the intentions list and the commit flag to disk in the proper order. Such a service is
not present in any buffer manager known to us.
> It's an expensive operation, so expensive that {{Syncable.hsync()}} isn't even called
on {{DFSOutputStream.close()}}.
> Even though S3A does not manifest any data until close() is called, applications coming
from HDFS may call Syncable methods and expect them to persist data with the durability
guarantees offered by HDFS.
> Since the output stream hardening of HADOOP-13327, S3A throws UnsupportedOperationException
to indicate that the synchronization semantics of Syncable absolutely cannot be met.
> As a result, applications which have been calling the Syncable APIs are finding the call
failing. In the absence of exception handling to recognise that the durability semantics are
not being met, they fail.
> If the user and the application actually expect data to be persisted, this is the correct
behaviour. The data cannot be persisted this way.
> If, however, they were calling this on HDFS more as a {{flush()}} than the full and expensive
DBMS-class persistence call, then this failure is unwelcome. The applications really need
to catch the UnsupportedOperationException raised by S3A _or any other FS strictly reporting
failures_, report the problem and use some other means of safe data storage.
> Even better, they can use hasPathCapability() on the FS or hasCapability() on the stream
to probe before even opening a file or trying to sync it. The hasCapability() call on a stream
was actually implemented in Hadoop 2.x precisely to allow applications to identify when a
stream could not meet the guarantees (e.g. some of the encrypted streams, file:// before HADOOP-13...)
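The probe-then-sync pattern described above could look roughly like the following. This is a minimal sketch using only the JDK: {{ProbeableStream}} and {{syncIfSupported()}} are hypothetical stand-ins that mirror the shape of a Hadoop output stream (a {{hasCapability(String)}} probe plus {{hsync()}}), not the real Hadoop classes, and the capability name "hsync" is assumed to match the constant used by Hadoop's StreamCapabilities.

```java
import java.io.IOException;

/** Sketch: probe a stream for hsync support before relying on it. */
class SyncProbeSketch {

  /** Hypothetical stand-in for a Hadoop output stream; not the real API. */
  interface ProbeableStream {
    boolean hasCapability(String capability);
    void hsync() throws IOException;
  }

  /** Capability name; assumed to match Hadoop's StreamCapabilities.HSYNC. */
  static final String HSYNC = "hsync";

  /**
   * Try to persist the stream's data durably.
   *
   * @return true if hsync() was supported and completed; false if the
   *         caller must use some other means of safe data storage.
   */
  static boolean syncIfSupported(ProbeableStream out) throws IOException {
    // Probe first: a stream that does not declare the capability is never asked to sync.
    if (!out.hasCapability(HSYNC)) {
      return false;
    }
    try {
      out.hsync();
      return true;
    } catch (UnsupportedOperationException e) {
      // A store that strictly reports failures may still reject the call;
      // treat that the same as "not supported" rather than failing outright.
      return false;
    }
  }
}
```

A caller that gets {{false}} back knows the data has not been made durable and can report the problem or fall back, instead of failing on an unhandled exception.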
> Until they can correct their code, I propose adding an option for S3A to downgrade the
exception to a warning:
> fs.s3a.downgrade.syncable.exceptions
> This will
> * Log once per process at WARN
> * Downgrade the calls to no-ops
> * Increment counters in S3A stats and IO stats of invocations of the Syncable methods.
This will allow for stats gathering to let us identify which applications need fixing in cloud
deployments
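The three behaviours listed above can be illustrated with a small JDK-only sketch. It is not the S3A implementation: the class {{DowngradeSyncableSketch}}, its constructor flag and its counter are all hypothetical names, the flag stands in for a value read from fs.s3a.downgrade.syncable.exceptions, and counting rejected calls as well as downgraded ones is an assumption.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicLong;
import java.util.logging.Logger;

/** Sketch: reject hsync() by default, or downgrade it to a counted no-op. */
class DowngradeSyncableSketch {

  private static final Logger LOG =
      Logger.getLogger(DowngradeSyncableSketch.class.getName());

  /** Process-wide flag so the warning is logged only once. */
  private static final AtomicBoolean WARNED = new AtomicBoolean(false);

  /** Hypothetical: would be set from fs.s3a.downgrade.syncable.exceptions. */
  private final boolean downgrade;

  /** Number of hsync() invocations seen (assumption: all calls are counted). */
  private final AtomicLong hsyncInvocations = new AtomicLong();

  DowngradeSyncableSketch(boolean downgrade) {
    this.downgrade = downgrade;
  }

  void hsync() {
    hsyncInvocations.incrementAndGet();
    if (!downgrade) {
      // Strict mode: the durability guarantee cannot be met, so say so.
      throw new UnsupportedOperationException("hsync() is not supported");
    }
    if (WARNED.compareAndSet(false, true)) {
      LOG.warning("hsync() is not supported; data is not persisted until close()");
    }
    // Downgraded mode: deliberately do nothing.
  }

  long getHsyncInvocations() {
    return hsyncInvocations.get();
  }
}
```

The counter is what makes the downgrade useful operationally: a non-zero value identifies an application that is calling Syncable methods against a store that cannot honour them.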
> Testing: copy the hsync tests but expect exceptions to be swallowed and stats to be collected
> Also: the UnsupportedOperationException text will link to this JIRA



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: common-issues-help@hadoop.apache.org

