beam-commits mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Amit Sela (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (BEAM-735) PAssertStreaming should make sure the assertion happened.
Date Sun, 09 Oct 2016 23:10:20 GMT

    [ https://issues.apache.org/jira/browse/BEAM-735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15560797#comment-15560797
] 

Amit Sela commented on BEAM-735:
--------------------------------

I took a look at {{PAssert}} and it seems that there's nothing I can use as it is right now.
In the current state of the runner, streaming support is very limited, and while I hope we're
getting there soon and we'll be able to rely on ROS tests (as we plan to do with batch now),
I opened this ticket because the runner's test suite wasn't reliable for streaming as it was
and it seemed urgent enough.
However, I'd be happy to hear of how we can make this as "Beamish" as possible even in this
current state.
Thanks!

> PAssertStreaming should make sure the assertion happened.
> ---------------------------------------------------------
>
>                 Key: BEAM-735
>                 URL: https://issues.apache.org/jira/browse/BEAM-735
>             Project: Beam
>          Issue Type: Bug
>          Components: runner-spark
>            Reporter: Amit Sela
>            Assignee: Amit Sela
>
> The Spark runner currently runs PAsserts via `PAssertStreaming` which groups into a single
key and asserts the values on the worker (part of the "Lambda" in the Spark lingo).
> This could be a problem since Spark won't run if there's nothing to process - so that
if for some reason the input is missed, say reading from Kafka latest or simply an empty topic,
the assertion will be skipped and so we'll never fail (we would like to fail if there was
no input, while we expected one).
> This might change once Spark provide a better support for the Beam model in streaming,
but until then, it's best that our tests will consider this case as well.
> I'll add an aggregator and increment for assertion, at the end I'll make sure the aggregator
is not 0, so that at least one assertion took place (if for some reason Spark kept on for
a couple of more intervals it might execute the same assertion more then once, if the input
is repeated).  



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Mime
View raw message