spark-dev mailing list archives

From Imran Rashid
Subject python test infrastructure
Date Wed, 05 Sep 2018 18:31:59 GMT
Hi all,

More pyspark noob questions from me.  I find it really hard to figure out
what versions of python I should be testing and what is tested upstream.
While I'd like to just know the answers to those questions, more
importantly I'd like to make sure that info is visible somewhere so all
devs can figure it out themselves.  I think we should have:

1. All of the output in target/test-reports & python/unit-tests.log should
be included in the Jenkins archived artifacts.

2. That test output needs to be separated by python executable.  It seems
to me that right now if you run python/run-tests with multiple
python-executables, you get separate test output (because each output file
includes a timestamp), but you can't tell which python version was used.

3. The test output should be incorporated into the Jenkins test output, so
it's easier to see which test is failing, which tests are run, test trends,
etc.  Along with the above, that means the tests should be prefixed (or
something) with the python executable in the reports so you can track test
results for each executable.  (It seems this was done at one point by
SPARK-11295 but, for whatever reason, it doesn't seem to work anymore.)
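
To make point 3 concrete, here is a rough sketch of the kind of prefixing I mean.  It is not what python/run-tests or SPARK-11295 actually does; the function name prefix_report and the sample report are made up for illustration, and it assumes the reports are JUnit-style XML with testsuite/testcase elements.

```python
import xml.etree.ElementTree as ET


def prefix_report(xml_text, python_exec):
    """Return a JUnit-style XML report with every suite name and test
    classname prefixed by the python executable that ran it.

    Hypothetical helper, for illustration only.
    """
    root = ET.fromstring(xml_text)
    # iter() also visits the root element, so a bare <testsuite> root
    # and a <testsuites> wrapper are both handled.
    for suite in root.iter("testsuite"):
        suite.set("name", f"{python_exec}.{suite.get('name', '')}")
    for case in root.iter("testcase"):
        case.set("classname", f"{python_exec}.{case.get('classname', '')}")
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Made-up report contents, just to show the shape of the input.
    sample = (
        '<testsuite name="pyspark.tests">'
        '<testcase classname="pyspark.tests.RDDTests" name="test_id"/>'
        "</testsuite>"
    )
    print(prefix_report(sample, "python3"))
```

With something like this applied once per executable, the same test run under two interpreters would show up as two distinct entries in the report instead of overwriting or blending into each other.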

If we had these features as part of the regular testing infrastructure, I
think it would make it easier for everyone to understand what is happening
in the current pyspark tests and to compare their own local tests with them.

Thoughts?  Is this covered somewhere that I don't know about?

