spark-user mailing list archives

From Steve Loughran <>
Subject Re: unit testing in spark
Date Tue, 11 Apr 2017 10:53:47 GMT

(sorry sent an empty reply by accident)

Unit testing is one of the easiest ways to isolate problems in an internal class, the things
you can get wrong. But: time spent writing unit tests is time *not* spent writing integration
tests, which biases me towards integration testing.

What I do find is good is writing unit tests to debug things: if something is playing
up and you can write a unit test to replicate it, then not only can you isolate the problem, you
can verify it is fixed and stays fixed. And as unit tests are fast & often runnable in parallel,
they are easy to run repetitively.
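To make that concrete, here is a minimal sketch of the "replicate, fix, and pin down" pattern. Everything here is invented for illustration (`parseBucket`, the trailing-slash bug); the point is that once a bug is captured as a fast, in-VM assertion, the regression stays fixed:

```scala
// Hypothetical example: a path-parsing bug replicated as a unit test.
// parseBucket and its failure mode are invented for illustration.
object PathUtils {
  /** Extract the bucket name from an s3a URI; imagine an earlier
   *  version mishandled URIs with a trailing slash. */
  def parseBucket(uri: String): String =
    uri.stripPrefix("s3a://").takeWhile(_ != '/')
}

object PathUtilsRegression extends App {
  // fast, repeatable, parallelisable: the fix stays fixed
  assert(PathUtils.parseBucket("s3a://mybucket/data") == "mybucket")
  assert(PathUtils.parseBucket("s3a://mybucket/") == "mybucket")
  println("path regression tests passed")
}
```

In a real codebase this would live in a scalatest suite rather than an `App`, but the shape is the same: the failing input becomes a permanent assertion.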

But: tests have a maintenance cost, especially if the tests go into the internals, making
them very brittle to change. Mocking is the real trouble spot here. It's good to be able to
simulate failures, but given the choice between "integration test against real code" and "something
using mocks which produces 'impossible' stack traces and, after a code rework, fails so badly
you can't tell if it's a regression or just the tests being obsolete", I'd go for the real code,
even if it runs up some bills.
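A small sketch of why mock-heavy tests go brittle (all names invented; a hand-rolled stub stands in for a mocking framework). The assertion pins down *how* the collaborator is called rather than *what* the result is, so a harmless rework of the call pattern breaks the test without any real regression:

```scala
// Invented example: a test coupled to internal call order.
trait BlockStore { def read(key: String): Array[Byte] }

// Hand-rolled "mock" that records every call it sees.
class RecordingStore extends BlockStore {
  var calls: List[String] = Nil
  def read(key: String): Array[Byte] = { calls ::= key; Array.emptyByteArray }
}

object MockDriftDemo extends App {
  val store = new RecordingStore
  // code under test happens to read "a" then "b" today...
  store.read("a"); store.read("b")
  // ...and the test asserts that exact sequence. Reorder or batch the
  // reads during a refactor and this fails with no behavioural change.
  assert(store.calls.reverse == List("a", "b"))
  println("call-order assertion passed (until the next refactor)")
}
```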

I really liked Lars' slides; they gave me some ideas. One thing I've been exploring is using system
metrics in testing, adding more metrics to help note what is happening.

Strengths: it encourages me to write metrics, which can be used in in-VM tests and collected from
a distributed SUT in integration tests, both for asserts and for logging. Weaknesses: 1. it exposes internal
state which, again, can be brittle; 2. in integration tests the results can vary a lot, so
you can't really make assertions on them. Better there to collect the values and use them in test reports.
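A minimal sketch of that split, with all names (`RetryMetrics`, `fetchWithRetry`) invented: the same counter is asserted on in a deterministic in-VM test, but only collected and reported in a distributed run where the value varies:

```scala
import java.util.concurrent.atomic.AtomicLong

// Invented metric: count retries performed by the code under test.
object RetryMetrics {
  val retries = new AtomicLong(0)
}

object Client {
  // pretend the first (attempts - 1) calls fail and are retried
  def fetchWithRetry(attempts: Int): Unit =
    (1 until attempts).foreach(_ => RetryMetrics.retries.incrementAndGet())
}

object MetricsInTests extends App {
  Client.fetchWithRetry(attempts = 3)
  // in-VM test: the value is deterministic, so assert on it
  assert(RetryMetrics.retries.get() == 2)
  // integration run: values vary, so just attach them to the report
  println(s"retries=${RetryMetrics.retries.get()}")
}
```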

Which brings me to a real issue with integration tests, which isn't a fault of the apps or
the tests, but of today's test runners: log capture and reporting date from the era when
we were running unit tests, so they think about the reporting problems of that era: standard out and
error for a single process; no standard log format, so naive stream capture instead of structured
log entries; test runners which don't report much on a failure but the stack trace, or, with
scalatest, half the stack trace (*), missing out on those of the remote systems. Systems which,
if you are playing with cloud infra, may not be there when you get to analyse the test results.
You are left trying to compare 9 logs across 3 destroyed VMs to work out why the test runner
threw an assertion failure.

This is tractable, and indeed, the Kafka people have been advocating "use Kafka as the collector
of test results" to address it: the logs, metrics, events raised by the SUT, etc., and then
somehow correlate them into test reports, or at least provide the ordering of events and state
across parts of the system so that you can work back from a test failure. Yes, that means
moving way beyond the usual ant-JUnit XML report everything creates, but like I said: that
was written for a different era. It's time to move on, generating the XML report as one of
the outputs if you want, but not the one you use for diagnosing why a test fails.
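The correlation step above can be sketched in a few lines. This is not the Kafka proposal itself, just its core idea with invented names: each process emits timestamped structured events to a collector, and the report merges them into one ordered timeline instead of N raw stdout captures:

```scala
// Invented types: structured test events from multiple hosts,
// merged into a single ordered timeline for the test report.
case class TestEvent(timeMs: Long, host: String, level: String, msg: String)

object EventTimeline {
  // In the real proposal the streams would arrive via Kafka topics;
  // here they are just in-memory sequences.
  def merged(streams: Seq[Seq[TestEvent]]): Seq[TestEvent] =
    streams.flatten.sortBy(_.timeMs)
}

object TimelineDemo extends App {
  val vm1 = Seq(
    TestEvent(100, "vm1", "INFO", "executor started"),
    TestEvent(220, "vm1", "ERROR", "task lost"))
  val vm2 = Seq(
    TestEvent(150, "vm2", "INFO", "driver scheduled stage 3"))

  val timeline = EventTimeline.merged(Seq(vm1, vm2))
  // events from different hosts interleave into one ordered view
  assert(timeline.map(_.host) == Seq("vm1", "vm2", "vm1"))
  timeline.foreach(e => println(s"${e.timeMs} ${e.host} ${e.level} ${e.msg}"))
}
```

The payoff is that the timeline survives even after the VMs that produced it are destroyed, which is exactly the failure mode described above.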

I'd love to see what people have been up to in that area. If anyone has insights there, it'd
be topic for a hangout.


(*) Scalatest opinions:
