spark-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Hyukjin Kwon <>
Subject Re: Adding JIRA ID as the prefix for the test case name
Date Tue, 19 Nov 2019 00:22:29 GMT
Let me document as below in few days:

1. For Python and Java, write a single comment that starts with JIRA ID and
short description, e.g. (SPARK-XXXXX: test blah blah)
2. For R, use JIRA ID as a prefix for its test name.

assuming everybody is happy.

2019년 11월 18일 (월) 오전 11:36, Hyukjin Kwon <>님이 작성:

> Actually there are not so many Java test cases in Spark (because Scala
> runs on JVM as everybody knows)[1].
> Given that, I think we can avoid to put some efforts on this for now .. I
> don't mind if somebody wants to give a shot since it looks good anyway but
> to me I wouldn't spend so much time on this ..
> Let me just go ahead as I suggested if you don't mind. Anyone can give a
> shot for Display Name - I'm willing to actively review and help.
> [1]
> git ls-files '*' | wc -l
>      172
> git ls-files '*Suite.scala' | wc -l
>     1161
> 2019년 11월 18일 (월) 오전 3:27, Steve Loughran <>님이
>> Test reporters do often contain some assumptions about the characters in
>> the test methods. Historically JUnit XML reporters have never sanitised the
>> method names so XML injection attacks have been fairly trivial. Haven't
>> tried this for a while.
>> That whole JUnit XML report "standard" was actually put together in the
>> Ant project with <Junitreport> doing the postprocessing of the JUnit run.
>> It was driven by the team's XSL skills than any overreaching strategic goal
>> about how to present test results of tests which could run for hours and
>> whose output you would really want to aggregate the locks from multiple
>> machines and processes and present in awake you can actually navigate. With
>> hindsight, a key failing is that we chose to store the test summaries (test
>> count, failure count...) as attributes on the root XML mode. Which is why
>> the whole DOM gets built up in the JUnit runner. Which is why when that
>> JUnit process crashes, you get no report at all.
>> It'd be straightforward to fix -except too much relies on that file
>> now...important things will break. And the maven runner has historically
>> never supported custom reporters, to let you experiment with it.
>> Maybe this is an opportunity to change things.
>> On Sun, Nov 17, 2019 at 1:42 AM Hyukjin Kwon <> wrote:
>>> DisplayName looks good in general but actually here I would like first
>>> to find a existing pattern to document in guidelines given the actual
>>> existing practice we all are used to. I'm trying to be very conservative
>>> since this guidelines affect everybody.
>>> I think it might be better to discuss separately if we want to change
>>> what we have been used to.
>>> Also, using arbitrary names might not be actually free due to such bug
>>> like . It will need some
>>> more efforts to investigate as well.
>>> On Fri, 15 Nov 2019, 20:56 Steve Loughran, <>
>>> wrote:
>>>>  Junit5: Display names.
>>>> Goes all the way to the XML.
>>>> On Thu, Nov 14, 2019 at 6:13 PM Shixiong(Ryan) Zhu <
>>>>> wrote:
>>>>> Should we also add a guideline for non Scala tests? Other languages
>>>>> (Java, Python, R) don't support using string as a test name.
>>>>> Best Regards,
>>>>> Ryan
>>>>> On Thu, Nov 14, 2019 at 4:04 AM Hyukjin Kwon <>
>>>>> wrote:
>>>>>> I opened a PR -
>>>>>> 2019년 11월 13일 (수) 오전 10:43, Hyukjin Kwon <>님이
>>>>>>> > In general a test should be self descriptive and I don't
think we
>>>>>>> should be adding JIRA ticket references wholesale. Any action
that the
>>>>>>> reader has to take to understand why a test was introduced is
one too many.
>>>>>>> However in some cases the thing we are trying to test is very
subtle and in
>>>>>>> that case a reference to a JIRA ticket might be useful, I do
still feel
>>>>>>> that this should be a backstop and that properly documenting
your tests is
>>>>>>> a much better way of dealing with this.
>>>>>>> Yeah, the test should be self-descriptive. I don't think adding
>>>>>>> JIRA prefix harms this point. Probably I should add this sentence
in the
>>>>>>> guidelines as well.
>>>>>>> Adding a JIRA prefix just adds one extra hint to track down details.
>>>>>>> I think it's fine to stick to this practice and make it simpler
and clear
>>>>>>> to follow.
>>>>>>> > 1. what if multiple JIRA IDs relating to the same test?
we just
>>>>>>> take the very first JIRA ID?
>>>>>>> Ideally one JIRA should describe one issue and one PR should
fix one
>>>>>>> JIRA with a dedicated test.
>>>>>>> Yeah, I think I would take the very first JIRA ID.
>>>>>>> > 2. are we going to have a full scan of all existing tests
>>>>>>> attach a JIRA ID to it?
>>>>>>> Yea, let's don't do this.
>>>>>>> > It's a nice-to-have, not super essential, just because ...
>>>>>>> It's been asked multiple times and each committer seems having
>>>>>>> different understanding on this.
>>>>>>> It's not a biggie but wanted to make it clear and conclude this.
>>>>>>> > I'd add this only when a test specifically targets a certain
>>>>>>> Yes, so this one I am not sure. From what I heard, people adds
>>>>>>> JIRA in cases below:
>>>>>>> - Whenever the JIRA type is a bug
>>>>>>> - When a PR adds a couple of tests
>>>>>>> - Only when a test specifically targets a certain issue.
>>>>>>> - ...
>>>>>>> Which one do we prefer and simpler to follow?
>>>>>>> Or I can combine as below (im gonna reword when I actually document
>>>>>>> this):
>>>>>>> 1. In general, we should add a JIRA ID as prefix of a test when
a PR
>>>>>>> targets to fix a specific issue.
>>>>>>>     In practice, it usually happens when a JIRA type is a bug
or a
>>>>>>> PR adds a couple of tests.
>>>>>>> 2. Uses "SPARK-XXXX: test name" format
>>>>>>> If we have no objection with ^, let me go with this.
>>>>>>> 2019년 11월 13일 (수) 오전 8:14, Sean Owen <>님이
>>>>>>>> Let's suggest "SPARK-12345:" but not go back and change a
bunch of
>>>>>>>> test cases.
>>>>>>>> I'd add this only when a test specifically targets a certain
>>>>>>>> It's a nice-to-have, not super essential, just because in
the rare
>>>>>>>> case you need to understand why a test asserts something,
you can go
>>>>>>>> back and find what added it in the git history without much
>>>>>>>> On Mon, Nov 11, 2019 at 10:46 AM Hyukjin Kwon <>
>>>>>>>> wrote:
>>>>>>>> >
>>>>>>>> > Hi all,
>>>>>>>> >
>>>>>>>> > Maybe it's not a big deal but it brought some confusions
time to
>>>>>>>> time into Spark dev and community. I think it's time to discuss
>>>>>>>> when/which format to add a JIRA ID as a prefix for the test
case name in
>>>>>>>> Scala test cases.
>>>>>>>> >
>>>>>>>> > Currently we have many test case names with prefixes
as below:
>>>>>>>> >
>>>>>>>> > test("SPARK-XXXXX blah blah")
>>>>>>>> > test("SPARK-XXXXX: blah blah")
>>>>>>>> > test("SPARK-XXXXX - blah blah")
>>>>>>>> > test("[SPARK-XXXXX] blah blah")
>>>>>>>> > …
>>>>>>>> >
>>>>>>>> > It is a good practice to have the JIRA ID in general
because, for
>>>>>>>> instance,
>>>>>>>> > it makes us put less efforts to track commit histories
(or even
>>>>>>>> when the files
>>>>>>>> > are totally moved), or to track related information
of tests
>>>>>>>> failed.
>>>>>>>> > Considering Spark's getting big, I think it's good to
>>>>>>>> >
>>>>>>>> > I would like to suggest this and document it in our
>>>>>>>> >
>>>>>>>> > 1. Add a prefix into a test name when a PR adds a couple
of tests.
>>>>>>>> > 2. Uses "SPARK-XXXX: test name" format which is used
in our code
>>>>>>>> base most
>>>>>>>> >       often[1].
>>>>>>>> >
>>>>>>>> > We should make it simple and clear but closer to the
>>>>>>>> practice. So, I would like to listen to what other people
think. I would
>>>>>>>> appreciate if you guys give some feedback about when to add
the JIRA
>>>>>>>> prefix. One alternative is that, we only add the prefix when
the JIRA's
>>>>>>>> type is bug.
>>>>>>>> >
>>>>>>>> > [1]
>>>>>>>> > git grep -E 'test\("\SPARK-([0-9]+):' | wc -l
>>>>>>>> >      923
>>>>>>>> > git grep -E 'test\("\SPARK-([0-9]+) ' | wc -l
>>>>>>>> >      477
>>>>>>>> > git grep -E 'test\("\[SPARK-([0-9]+)\]' | wc -l
>>>>>>>> >       16
>>>>>>>> > git grep -E 'test\("\SPARK-([0-9]+) -' | wc -l
>>>>>>>> >       13
>>>>>>>> >
>>>>>>>> >
>>>>>>>> >

View raw message