beam-commits mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From GitBox <>
Subject [GitHub] [beam] KevinGG commented on issue #11174: [BEAM-7923] Pop failed transform when error is raised
Date Mon, 23 Mar 2020 20:57:55 GMT
KevinGG commented on issue #11174: [BEAM-7923] Pop failed transform when error is raised
   Fixed a failed test.
   TL;DR, the test was wrong and passed in the past because it got lucky.
   The test used to pass and started failing with the try-append-finally-pop change, because:
   1. The test tried to append two no name / label transforms into a same pipeline, both of
them will generate the same label and raise errors while the tests asserted for errors, thus
allowing pipeline construction to continue even if it ran into fatal errors;
   2. In the past, the pipeline was in a broken state where the current_transform was not
popped when the 1st error was raised. The full label generated was `WriteToBigQuery`;
   3. In the past, the second transform appending wrongly added parts into the broken current_transform
`WriteToBigQuery`, causing the transform node to be appended after a wrong failed parent node
and luckily got a unique transform full label `WriteToBigQuery/WriteToBigQuery`, thus the
test didn't fail due to duplicated labels. But the pipeline was constructed as `WriteToBigQuery`->`WriteToBigQuery`
which was completely messed up;
   4. Still the test didn't fail because it was testing for errors even though it built a
broken pipeline with broken usages.
   Once the try-append-finally-pop change is applied, the test functions as:
   1. First transform still fails to be appended and has side-effects including an applied
transform label, but it would not leave the pipeline in a broken state where current transform
is still the right node;
   2. Second transform still fails but now appended to the correct node;
   3. As long as the 2 transforms have different labels (or executed in different cells if
in an interactive environment), the test still passes, pipeline is not broken and can be used
for future development, side-effects are ruled out when data-centric APIs such as `show` and
`collect` are invoked.
   The reason we keep the applied label is that we never know what side effects are when the
error is raised.

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:

With regards,
Apache Git Services

View raw message