flink-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From aljoscha <...@git.apache.org>
Subject [GitHub] flink pull request #3191: [FLINK-5529] [FLINK-4752] [docs] Improve / extends...
Date Wed, 25 Jan 2017 17:32:13 GMT
Github user aljoscha commented on a diff in the pull request:

    --- Diff: docs/dev/windows.md ---
    @@ -622,133 +690,138 @@ input
    -## Dealing with Late Data
    +## Triggers
    -When working with event-time windowing it can happen that elements arrive late, i.e the
    -watermark that Flink uses to keep track of the progress of event-time is already past
    -end timestamp of a window to which an element belongs. Please
    -see [event time](./event_time.html) and especially
    -[late elements](./event_time.html#late-elements) for a more thorough discussion of
    -how Flink deals with event time.
    +A `Trigger` determines when a window (as formed by the `WindowAssigner`) is ready to
    +processed by the *window function*. Each `WindowAssigner` comes with a default `Trigger`.

    +If the default trigger does not fit your needs, you can specify a custom trigger using
    -You can specify how a windowed transformation should deal with late elements and how
much lateness
    -is allowed. The parameter for this is called *allowed lateness*. This specifies by how
much time
    -elements can be late. Elements that arrive within the allowed lateness are still put
into windows
    -and are considered when computing window results. If elements arrive after the allowed
lateness they
    -will be dropped. Flink will also make sure that any state held by the windowing operation
is garbage
    -collected once the watermark passes the end of a window plus the allowed lateness.
    +The trigger interface provides five methods that react to different events: 
    -<span class="label label-info">Default</span> By default, the allowed lateness
is set to
    -`0`. That is, elements that arrive behind the watermark will be dropped.
    +* The `onElement()` method is called for each element that is added to a window. 
    +* The `onEventTime()` method is called when  a registered event-time timer fires. 
    +* The `onProcessingTime()` method is called when a registered processing-time timer fires.

    +* The `onMerge()` method is relevant for stateful triggers and merges the states of two
triggers when their corresponding windows merge, *e.g.* when using session windows. 
    +* Finally the `clear()` method performs any action needed upon removal of the corresponding
    -You can specify an allowed lateness like this:
    +Any of these methods can be used to register processing- or event-time timers for future
    -<div class="codetabs" markdown="1">
    -<div data-lang="java" markdown="1">
    -{% highlight java %}
    -DataStream<T> input = ...;
    +### Fire and Purge
    -    .keyBy(<key selector>)
    -    .window(<window assigner>)
    -    .allowedLateness(<time>)
    -    .<windowed transformation>(<window function>);
    -{% endhighlight %}
    +Once a trigger determines that a window is ready for processing, it fires. This is the
signal for the window operator to emit the result of the current window. Given a window with
a `WindowFunction` 
    +all elements are passed to the `WindowFunction` (possibly after passing them to an evictor).

    +Windows with `ReduceFunction` of `FoldFunction` simply emit their eagerly aggregated
    -<div data-lang="scala" markdown="1">
    -{% highlight scala %}
    -val input: DataStream[T] = ...
    +When a trigger fires, it can either `FIRE` or `FIRE_AND_PURGE`. While `FIRE` keeps the
contents of the window, `FIRE_AND_PURGE` removes its content.
    +By default, the pre-implemented triggers simply `FIRE` without purging the window state.
    -    .keyBy(<key selector>)
    -    .window(<window assigner>)
    -    .allowedLateness(<time>)
    -    .<windowed transformation>(<window function>)
    -{% endhighlight %}
    +<span class="label label-danger">Attention</span> When purging, only the
contents of the window are cleared. The window itself is not removed and accepts new elements.
    --- End diff --
    This is a bit tricky because for non-merging windows there is nothing that could be removed
except the elements. Maybe write that PURGING will simply remove the contents of the window
and will leave any eventual meta information intact and will also leave the Trigger state

If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.

View raw message