beam-commits mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Amit Sela (JIRA)" <j...@apache.org>
Subject [jira] [Created] (BEAM-735) PAssertStreaming should make sure the assertion happened.
Date Sun, 09 Oct 2016 14:25:20 GMT
Amit Sela created BEAM-735:
------------------------------

             Summary: PAssertStreaming should make sure the assertion happened.
                 Key: BEAM-735
                 URL: https://issues.apache.org/jira/browse/BEAM-735
             Project: Beam
          Issue Type: Bug
          Components: runner-spark
            Reporter: Amit Sela
            Assignee: Amit Sela


The Spark runner currently runs PAsserts via `PAssertStreaming` which groups into a single
key and asserts the values on the worker (part of the "Lambda" in the Spark lingo).
This could be a problem since Spark won't run if there's nothing to process - so that if for
some reason the input is missed, say reading from Kafka latest or simply an empty topic, the
assertion will be skipped and so we'll never fail (we would like to fail if there was no input,
while we expected one).

This might change once Spark provide a better support for the Beam model in streaming, but
until then, it's best that our tests will consider this case as well.

I'll add an aggregator and increment for assertion, at the end I'll make sure the aggregator
is not 0, so that at least one assertion took place (if for some reason Spark kept on for
a couple of more intervals it might execute the same assertion more then once, if the input
is repeated).  



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Mime
View raw message