spark-dev mailing list archives

From Imran Rashid <iras...@cloudera.com.INVALID>
Subject Re: python test infrastructure
Date Thu, 06 Sep 2018 15:41:40 GMT
On Wed, Sep 5, 2018 at 11:59 PM Hyukjin Kwon <gurwls223@gmail.com> wrote:

> >

> > > 1. all of the output in target/test-reports & python/unit-tests.log
> > > should be included in the jenkins archived artifacts.

> >

> > Hmmm, I thought they were already archived (
> > https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/95734/artifact/target/unit-tests.log
> > ).

> > FWIW, unit-tests.log is pretty messy, and it is currently shown when
> > specific tests are broken.

ah I guess I was looking in the wrong place for the unit-tests.log.  Agreed
it's messy; could we do something like adding the headers in SparkFunSuite?

And there is still the target/test-reports output that is not getting
archived.

> > 2. That test output needs to be separated by python executable.  It
> > seems to me that right now if you run python/run-tests with multiple
> > python executables, you get separate test output (because each output file
> > includes a timestamp), but you can't tell which python version was used.

> >

> > It wouldn't be difficult. I can make the changes if they are necessary;
> > however, I still think it's rather minor, since logs are shown when some
> > tests are broken.

I think it's useful even when things are successful.  I use builds on
jenkins all the time so I can compare my runs with a known successful run.
It would be great if on jenkins I could find test-reports for the exact
python version I am testing against locally.
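For example (just a hypothetical sketch, not the current run-tests.py code), the report file name could embed the executable name alongside the timestamp, so reports from different python versions can be told apart:

```python
# Hypothetical sketch: build a test-report path that embeds the python
# executable, so runs under e.g. python2.7 and python3.5 produce
# distinguishable report files.  Names here are illustrative only.
import os
import time

def report_path(base_dir, python_exec, module_name):
    # Sanitize the executable path ("/usr/bin/python3.5" -> "python3.5")
    exec_tag = os.path.basename(python_exec)
    timestamp = time.strftime("%Y%m%d-%H%M%S")
    return os.path.join(
        base_dir, "test-reports",
        "%s__%s__%s.xml" % (exec_tag, module_name, timestamp))
```

Something along those lines would let you grab exactly the report matching the version you're running locally.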

> > 3. the test output should be incorporated into jenkins test output, so
> > it's easier to see which test is failing, which tests are run, test trends,
> > etc.  Along with the above, that means the tests should be prefixed (or
> > something) with the python executable in the reports so you can track test
> > results for each executable.  (it seems this was done at one point by
> > SPARK-11295, but for whatever reason, doesn't seem to work anymore.)

> >

> > Yea, I have taken a look at organising the logs before (for instance
> > https://github.com/apache/spark/pull/21107), but not at this idea itself.
> > I agree with this idea in general.

https://issues.apache.org/jira/browse/SPARK-25359
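One way this could work (a sketch under assumptions -- not necessarily what SPARK-11295 did) is post-processing the JUnit XML reports so each testcase's classname carries the executable:

```python
# Assumed sketch: rewrite a JUnit-style XML report so every testcase's
# classname is prefixed with the python executable, letting jenkins
# track results per executable.  Function name is illustrative.
import xml.etree.ElementTree as ET

def prefix_testcases(report_file, exec_tag):
    tree = ET.parse(report_file)
    for case in tree.iter("testcase"):
        case.set("classname", "%s.%s" % (exec_tag, case.get("classname")))
    tree.write(report_file)
```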


> On Thu, Sep 6, 2018 at 5:41 AM Imran Rashid <irashid@cloudera.com.invalid>
> wrote:
>
>> one more: seems like python/run-tests should have an option at least to
>> not bail at the first failure:
>> https://github.com/apache/spark/blob/master/python/run-tests.py#L113-L132
>>
>> this is particularly annoying with flaky tests -- since the rest of the
>> tests aren't run, you don't know whether you *only* had a failure in that
>> flaky test, or if there was some other real failure as well.
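For what it's worth, a continue-past-failures mode could be as simple as collecting nonzero return codes instead of exiting at the first one (illustrative names, not the real run-tests.py internals):

```python
# Sketch of a run loop that keeps going after a failure and reports all
# failing tests at the end, instead of bailing on the first nonzero
# return code.  run_one is whatever launches a single test module.
def run_all(tests, run_one):
    failures = []
    for test in tests:
        retcode = run_one(test)
        if retcode != 0:
            failures.append((test, retcode))
    return failures
```

With something like this, a single flaky test wouldn't hide other real failures in the rest of the run.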
>>
>> On Wed, Sep 5, 2018 at 1:31 PM Imran Rashid <irashid@cloudera.com> wrote:
>>
>>> Hi all,
>>>
>>> More pyspark noob questions from me.  I find it really hard to figure
>>> out what versions of python I should be testing and what is tested
>>> upstream.  While I'd like to just know the answers to those questions, more
>>> importantly I'd like to make sure that info is visible somewhere so all
>>> devs can figure it out themselves.  I think we should have:
>>>
>>> 1. all of the output in target/test-reports & python/unit-tests.log
>>> should be included in the jenkins archived artifacts.
>>>
>>> 2. That test output needs to be separated by python executable.  It
>>> seems to me that right now if you run python/run-tests with multiple
>>> python-executables, you get separate test output (because each output file
>>> includes a timestamp), but you can't tell which python version was used.
>>>
>>> 3. the test output should be incorporated into jenkins test output, so
>>> it's easier to see which test is failing, which tests are run, test trends,
>>> etc.  Along with the above, that means the tests should be prefixed (or
>>> something) with the python executable in the reports so you can track test
>>> results for each executable.  (it seems this was done at one point by
>>> SPARK-11295, but for whatever reason, doesn't seem to work anymore.)
>>>
>>> if we had these features as part of the regular testing infrastructure,
>>> I think it would make it easier for everyone to understand what was
>>> happening in the current pyspark tests and to compare their own local tests
>>> with them.
>>>
>>> thoughts?  is this covered somewhere that I don't know about?
>>>
>>> thanks,
>>> Imran
>>>
>>
