hadoop-common-issues mailing list archives

From "ASF GitHub Bot (Jira)" <j...@apache.org>
Subject [jira] [Work logged] (HADOOP-16830) Add public IOStatistics API
Date Thu, 01 Oct 2020 11:32:00 GMT

     [ https://issues.apache.org/jira/browse/HADOOP-16830?focusedWorklogId=493435&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-493435 ]

ASF GitHub Bot logged work on HADOOP-16830:

                Author: ASF GitHub Bot
            Created on: 01/Oct/20 11:31
            Start Date: 01/Oct/20 11:31
    Worklog Time Spent: 10m 
      Work Description: mehakmeet commented on pull request #2323:
URL: https://github.com/apache/hadoop/pull/2323#issuecomment-702072245

   The IOStatisticsBinding class has methods for tracking duration, but I am not able to
wrap them around a normal function.
   There are 3 methods for tracking durations, one each for Callable<B>, CallableRaisingIOE<B>,
and FunctionRaisingIOE<A, B>. We should add one more for a normal function too.
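
A minimal sketch of what such a wrapper for a normal function could look like. Everything here is an assumption for illustration: DurationStats, trackFunctionDuration and the "op_length" key are hypothetical names, and this is not the IOStatisticsBinding code from the pull request. It only shows the general pattern of timing a java.util.function.Function and recording the elapsed time even when the wrapped function throws.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;
import java.util.function.Function;

/** Hypothetical sketch; not the Hadoop IOStatisticsBinding API. */
final class DurationTrackingSketch {

  /** Hypothetical sink: invocation count and total nanoseconds per key. */
  static final class DurationStats {
    private final Map<String, LongAdder> counts = new ConcurrentHashMap<>();
    private final Map<String, LongAdder> nanos = new ConcurrentHashMap<>();

    void record(String key, long elapsedNanos) {
      counts.computeIfAbsent(key, k -> new LongAdder()).increment();
      nanos.computeIfAbsent(key, k -> new LongAdder()).add(elapsedNanos);
    }

    long count(String key) {
      LongAdder adder = counts.get(key);
      return adder == null ? 0L : adder.sum();
    }

    long totalNanos(String key) {
      LongAdder adder = nanos.get(key);
      return adder == null ? 0L : adder.sum();
    }
  }

  /**
   * Wrap a plain Function so that every invocation is timed.
   * The duration is recorded in the finally clause, so calls that
   * throw an unchecked exception are counted as well.
   */
  static <A, B> Function<A, B> trackFunctionDuration(
      DurationStats stats, String key, Function<A, B> inner) {
    return input -> {
      long start = System.nanoTime();
      try {
        return inner.apply(input);
      } finally {
        stats.record(key, System.nanoTime() - start);
      }
    };
  }

  private DurationTrackingSketch() {
  }
}
```

A caller would wrap the function once, for example trackFunctionDuration(stats, "op_length", String::length), and then apply the returned function as usual. Whether failures should be recorded under a separate key is a design choice this sketch leaves open.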

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:

Issue Time Tracking

    Worklog Id:     (was: 493435)
    Time Spent: 6.5h  (was: 6h 20m)

> Add public IOStatistics API
> ---------------------------
>                 Key: HADOOP-16830
>                 URL: https://issues.apache.org/jira/browse/HADOOP-16830
>             Project: Hadoop Common
>          Issue Type: New Feature
>          Components: fs, fs/s3
>    Affects Versions: 3.3.0
>            Reporter: Steve Loughran
>            Assignee: Steve Loughran
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 6.5h
>  Remaining Estimate: 0h
> Applications like to collect statistics on specific operations: gathering exactly those
performed during the execution of FS API calls by their individual worker threads, and
returning them to their job driver.
> * S3A has a statistics API for some streams, but it's a non-standard one; Impala &c
can't use it
> * FileSystem storage statistics are public, but as they aren't cross-thread, they don't
aggregate properly
> Proposed
> # A new IOStatistics interface to serve up statistics
> # S3A to implement
> # other stores to follow
> # Pass-through from the usual wrapper classes (FS data input/output streams)
> It's hard to think about how best to offer an API for operation context stats, and how
to actually implement it.
> ThreadLocal isn't enough because the helper threads need to update the thread-local
value of the instigator.
> My initial PoC doesn't address that issue, but it shows what I'm thinking of.
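
The ThreadLocal problem described above can be illustrated with a small sketch. It is not taken from the PoC or from Hadoop; StatsContext, bindToCaller and demo are hypothetical names. The assumption is that the instigating thread's context is captured at submission time and installed in the helper thread for the duration of the task, which is one possible way to let helper threads update the instigator's statistics.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical sketch of propagating a per-thread statistics context. */
final class ContextPropagationSketch {

  /** Hypothetical per-operation statistics holder. */
  static final class StatsContext {
    final AtomicLong bytesRead = new AtomicLong();
  }

  // Each thread starts with its own, independent context.
  private static final ThreadLocal<StatsContext> CURRENT =
      ThreadLocal.withInitial(StatsContext::new);

  static StatsContext current() {
    return CURRENT.get();
  }

  /** Capture the caller's context now; install it while the task runs. */
  static Runnable bindToCaller(Runnable task) {
    final StatsContext instigator = CURRENT.get(); // read in the submitting thread
    return () -> {
      StatsContext previous = CURRENT.get();
      CURRENT.set(instigator);
      try {
        task.run();
      } finally {
        CURRENT.set(previous); // restore the helper thread's own context
      }
    };
  }

  /** Returns the bytes counted against the calling thread's context. */
  static long demo() {
    ExecutorService pool = Executors.newSingleThreadExecutor();
    try {
      // Unbound task: updates the helper thread's own context, so the
      // instigator never sees these 100 bytes.
      pool.execute(() -> current().bytesRead.addAndGet(100));
      // Bound task: updates the instigator's context.
      pool.execute(bindToCaller(() -> current().bytesRead.addAndGet(42)));
    } finally {
      pool.shutdown();
    }
    try {
      pool.awaitTermination(5, TimeUnit.SECONDS);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException(e);
    }
    return current().bytesRead.get();
  }

  private ContextPropagationSketch() {
  }
}
```

In this sketch only the bound task contributes to the value that demo() returns; the unbound task's update stays in the helper thread's own context, which is the aggregation gap a plain ThreadLocal leaves.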

This message was sent by Atlassian Jira

To unsubscribe, e-mail: common-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: common-issues-help@hadoop.apache.org
