I am not an official answer person, but IMO, the first question is:  “Is the source for TestSerDe.jar ‘open source’ under an ALv2-compatible license?”


If “yes”, then supply the source in the source release and not the JAR.  One of the reasons for “no compiled code in a source release” is that it is very difficult to verify that compiled code is “correct” and not corrupted, infected with a virus, etc.


If “no”, then treat it as a 3rd-party dependency, which may mean you can’t use it, or that you need to treat it as optional or as a runtime dependency.


The related question is:  How do folks modify this JAR?  If it were a JPEG, there are plenty of JPEG modification tools.  There really aren’t JAR modification tools that modify a JAR’s internal .class files; you really should use the source files.  I am still surprised/puzzled by the answer in the thread you linked to.  It still seems in both cases that a “binary” is being supplied for “convenience”.  IMO, there should be very few, if any, things in an Apache source repo that are “unmodifiable”.


The “workaround” of renaming the .jar or .class files to something else so they aren’t seen as executable code still doesn’t seem to fully meet the spirit of an open source release either, but it is better than shipping executable code in a source package.


On the other hand, I would not hold up a release for an issue like this.  Fix it in some future release.


My 2 cents,



From: Sean Owen <srowen@apache.org>
Reply-To: "legal-discuss@apache.org" <legal-discuss@apache.org>
Date: Monday, June 25, 2018 at 7:34 AM
To: "legal-discuss@apache.org" <legal-discuss@apache.org>
Cc: "justin@classsoftware.com" <justin@classsoftware.com>, "dev@spark.apache.org" <dev@spark.apache.org>
Subject: Re: LICENSE and NOTICE file content


@legal-discuss, brief recap:


In Spark's test source code and release, there are some JAR files which exist to test handling of JAR files. Example: TestSerDe.jar in https://github.com/apache/spark/tree/master/sql/hive/src/test/resources/data/files 


Justin raises the legitimate question: these don't belong in a source release, do they?


My operating theory had been that they are more like binary blobs w.r.t. Spark, like a test JPEG or data file, and are not the compiled version of any test code in Spark. They need to exist in order to run the tests from a source release. So it's not quite a case of shipping compiled Spark code in a source release.


I can imagine three opinions:


1) It's OK.

2) It's OK, but you need to include the source code for even those test JAR files somewhere.

3) It's not fine, and the toolchain has to build these from source automatically first.


I found https://markmail.org/thread/nf3lsdy5m3c3ovbr on legal-discuss previously, which seems to incline towards 2.


I'm also inclined towards 2, as 3 is probably relatively tricky in practice even though that's a nice-to-have.


I'd welcome opinions on this one.





On Sat, Jun 23, 2018 at 7:34 PM Justin Mclean <justin@classsoftware.com> wrote:

> It's not test code; test code would indeed have to be distributed as source as well. They are binary blobs, if you like, needed by test code, that happen to be JARs here and not JPEGs or .docx files or something. These help test handling of JAR files.

Which IMO is still not allowed in a source release, but as I said, it would be best for you to check on legal-discuss.