More pyspark noob questions from me. I find it really hard to figure out which versions of Python I should be testing against and which ones are tested upstream. While I'd like to just know the answers to those questions, more importantly I'd like that info to be visible somewhere so all devs can figure it out themselves. I think we should have:
1. All of the output in target/test-reports & python/unit-tests.log should be included in the Jenkins archived artifacts.
2. That test output needs to be separated by Python executable. It seems to me that right now, if you run python/run-tests with multiple python-executables, you get separate test output (because each output file includes a timestamp), but you can't tell which Python version produced which file.
3. The test output should be incorporated into the Jenkins test output, so it's easier to see which test is failing, which tests are run, test trends, etc. Along with the above, that means the tests should be prefixed (or something) with the Python executable in the reports so you can track test results for each executable. (It seems this was done at one point by SPARK-11295 but, for whatever reason, doesn't seem to work anymore.)
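To make point 3 a bit more concrete, here is a rough sketch of what I mean by prefixing. It assumes the files under target/test-reports are JUnit-style XML with `testsuite` and `testcase` elements, which I haven't verified for every report. The function name `prefix_report` is made up for illustration and is not anything that exists in python/run-tests today.

```python
import xml.etree.ElementTree as ET


def prefix_report(xml_text, python_exec):
    """Return JUnit-style XML with suite and class names prefixed by python_exec.

    Hypothetical helper: assumes <testsuite name=...> and
    <testcase classname=...> attributes, as in typical JUnit XML.
    """
    root = ET.fromstring(xml_text)
    # iter() also visits the root, so this handles both a bare
    # <testsuite> and a <testsuites> wrapper.
    for suite in root.iter("testsuite"):
        name = suite.get("name")
        if name is not None:
            suite.set("name", f"{python_exec}.{name}")
    for case in root.iter("testcase"):
        classname = case.get("classname")
        if classname is not None:
            case.set("classname", f"{python_exec}.{classname}")
    return ET.tostring(root, encoding="unicode")
```

With something like this applied per executable, a test class would show up once per Python version in the report instead of being indistinguishable across runs. Whether a dotted prefix is the right grouping for the Jenkins UI is a separate question; a different separator might work better.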
If we had these features as part of the regular testing infrastructure, I think it would be easier for everyone to understand what is happening in the current pyspark tests and to compare their own local test runs against them.
Thoughts? Is this covered somewhere that I don't know about?