flink-issues mailing list archives

From "ASF GitHub Bot (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (FLINK-7689) Instrument the Flink JDBC sink
Date Tue, 26 Sep 2017 08:31:00 GMT

    [ https://issues.apache.org/jira/browse/FLINK-7689?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16180462#comment-16180462 ]

ASF GitHub Bot commented on FLINK-7689:

GitHub user asicoe opened a pull request:


    [FLINK-7689] [Streaming Connectors] Added metrics to JDBCOutputFormat in order to be able
to m…

    Added metrics to JDBCOutputFormat in order to be able to measure jdbc batch flush rate,
latency and size.
    ## Contribution Checklist
      - Make sure that the pull request corresponds to a [JIRA issue](https://issues.apache.org/jira/projects/FLINK/issues).
Exceptions are made for typos in JavaDoc or documentation files, which need no JIRA issue.
      - Name the pull request in the form "[FLINK-XXXX] [component] Title of the pull request",
where *FLINK-XXXX* should be replaced by the actual issue number. Skip *component* if you
are unsure about which is the best component.
      Typo fixes that have no associated JIRA issue should be named following this pattern:
`[hotfix] [docs] Fix typo in event time introduction` or `[hotfix] [javadocs] Expand JavaDoc
for PuncuatedWatermarkGenerator`.
      - Fill out the template below to describe the changes contributed by the pull request.
That will give reviewers the context they need to do the review.
      - Make sure that the change passes the automated tests, i.e., `mvn clean verify` passes.
You can set up Travis CI to do that following [this guide](http://flink.apache.org/contribute-code.html#best-practices).
      - Each pull request should address only one issue, not mix up code from multiple issues.
      - Each commit in the pull request has a meaningful commit message (including the JIRA
      - Once all items of the checklist are addressed, remove the above text and this checklist,
leaving only the filled out template below.
    **(The sections below can be removed for hotfixes of typos)**
    ## What is the purpose of the change
    This pull request adds some useful metrics to the flink-jdbc sink.
    ## Brief change log
    - Added metrics to JDBCOutputFormat in order to be able to measure jdbc batch flush rate,
latency and size.
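A minimal, dependency-free sketch of the kind of instrumentation this change log describes: counting flushes and recording the size and latency of each batch. The names `BatchFlushStats`, `recordFlush` and `timeFlush` are hypothetical and are not taken from the pull request; the actual change presumably registers meters and histograms with Flink's metric system rather than keeping values in memory like this.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the sink metrics; not the code from the pull request.
final class BatchFlushStats {
    private final List<Integer> batchSizes = new ArrayList<>();
    private final List<Long> latenciesNanos = new ArrayList<>();

    // Record one flush: number of rows in the batch and time spent executing it.
    void recordFlush(int batchSize, long latencyNanos) {
        batchSizes.add(batchSize);
        latenciesNanos.add(latencyNanos);
    }

    // Time an arbitrary flush action, e.g. a wrapper around PreparedStatement.executeBatch().
    void timeFlush(int batchSize, Runnable flush) {
        long start = System.nanoTime();
        flush.run();
        recordFlush(batchSize, System.nanoTime() - start);
    }

    // Number of flushes seen so far (a meter would derive a rate from this).
    int flushCount() {
        return batchSizes.size();
    }

    // Largest batch flushed so far; 0 if nothing has been flushed.
    int maxBatchSize() {
        int max = 0;
        for (int size : batchSizes) {
            max = Math.max(max, size);
        }
        return max;
    }

    // Mean batch execution latency in milliseconds; 0.0 if nothing has been flushed.
    double meanLatencyMillis() {
        if (latenciesNanos.isEmpty()) {
            return 0.0;
        }
        long total = 0L;
        for (long nanos : latenciesNanos) {
            total += nanos;
        }
        return total / 1_000_000.0 / latenciesNanos.size();
    }
}
```

In a real sink, a call like `timeFlush` would wrap the batch execution inside the flush path, so the per-record write path stays untouched.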
    ## Verifying this change
    This change added tests and can be verified as follows:
    Check the Task Metrics section of the Flink UI for a job that uses the flink-jdbc and flink-table
1.4-SNAPSHOT dependencies, creates a JDBCAppendTableSink instance, and uses its emitDataStream
method to write a stream of messages to a JDBC-enabled destination.
    These metrics have been used successfully in a dev environment running a Flink streaming
app for the past month, in conjunction with statsd/Graphite/Grafana.
    ## Does this pull request potentially affect one of the following parts:
      - Dependencies (does it add or upgrade a dependency): no
      - The public API, i.e., is any changed class annotated with `@Public(Evolving)`: no
      - The serializers: no
      - The runtime per-record code paths (performance sensitive): no
      - Anything that affects deployment or recovery: JobManager (and its components), Checkpointing,
Yarn/Mesos, ZooKeeper: no
    ## Documentation
      - Does this pull request introduce a new feature? no
      - If yes, how is the feature documented? not applicable

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/asicoe/flink FLINK-7689_instrument_jdbc_sink

Alternatively you can review and apply these changes as the patch at:


To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #4725

commit c5baa1d4fb8947383c601817f84dc2e2cfb7f26a
Author: Alex Sicoe <alex.sicoe@intechww.com>
Date:   2017-09-26T08:01:40Z

    FLINK-7689 Added metrics to JDBCOutputFormat in order to be able to measure jdbc batch
flush rate, latency and size.


> Instrument the Flink JDBC sink
> ------------------------------
>                 Key: FLINK-7689
>                 URL: https://issues.apache.org/jira/browse/FLINK-7689
>             Project: Flink
>          Issue Type: Improvement
>          Components: Streaming Connectors
>    Affects Versions: 1.4.0
>            Reporter: Martin Eden
>            Priority: Minor
>              Labels: jdbc, metrics
>             Fix For: 1.4.0
>   Original Estimate: 24h
>  Remaining Estimate: 24h
> As confirmed by the Flink community in the following mailing list [message|http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/metrics-for-Flink-sinks-td15200.html],
off-the-shelf Flink sinks such as the JDBC, Redis and Cassandra sinks do not expose any
sink-specific metrics.
> The purpose of this ticket is to add some relevant metrics to the JDBCOutputFormat:
> - Meters for when a flush is made.
> - Histograms for the jdbc batch count and batch execution latency.
> These would allow a deeper understanding of the runtime behaviour of performance-critical
jobs writing to external databases using this generic interface.

This message was sent by Atlassian JIRA
