spark-dev mailing list archives

From Xiao Li <gatorsm...@gmail.com>
Subject Re: [VOTE] Release Spark 3.1.1 (RC3)
Date Fri, 26 Feb 2021 07:30:31 GMT
I confirmed that Q17 and Q39a/b return matching results between Spark 3.0 and
3.1 once spark.sql.legacy.statisticalAggregate is enabled, so the result
changes are expected. For more details, see the PR:
https://github.com/apache/spark/pull/29983/. The result of Q18 is additionally
affected by Spark's overflow checking; that issue exists in all the releases.
We will continue to improve our ANSI mode and fix these in the upcoming
releases.
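For readers unfamiliar with the flag: per the Spark 3.1 migration notes, the statistical aggregates (STDDEV_SAMP, VAR_SAMP, etc.) now return NULL instead of NaN when the divisor n - 1 is zero (e.g. a single-row group), and setting spark.sql.legacy.statisticalAggregate=true restores the old NaN result. A minimal plain-Python sketch of that semantic difference (the function here is illustrative, not Spark's implementation):

```python
import math

def stddev_samp(values, legacy=False):
    # Sample standard deviation with divisor n - 1. With fewer than two
    # rows the divisor is zero: Spark 3.1 returns NULL (None here), while
    # spark.sql.legacy.statisticalAggregate=true restores the pre-3.1 NaN.
    n = len(values)
    if n < 2:
        return float("nan") if legacy else None
    mean = sum(values) / n
    variance = sum((v - mean) ** 2 for v in values) / (n - 1)
    return math.sqrt(variance)

print(stddev_samp([42.0]))               # 3.1 default: None (SQL NULL)
print(stddev_samp([42.0], legacy=True))  # with the legacy flag: nan
```

Queries whose aggregates never hit the single-row case (like `stddev_samp([1.0, 3.0])`) are unaffected, which is why only a handful of TPC-DS queries changed.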

Thus, I change my vote from -1 to +1.

As Ismaël suggested, we can add some GitHub Actions jobs to validate the
TPC-DS and TPC-H results on small-scale datasets.
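Such a check could be wired up as a scheduled GitHub Actions workflow. The sketch below is purely illustrative: the workflow name, trigger, and suite selection are assumptions, not an existing workflow in the Spark repo, and TPCDSQuerySuite is used only as a placeholder for whichever suite ends up verifying query outputs.

```yaml
name: tpcds-correctness
on:
  schedule:
    - cron: "0 4 * * *"   # nightly; adjust as needed
jobs:
  tpcds:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
      - uses: actions/setup-java@v1
        with:
          java-version: '8'
      # Run the TPC-DS queries on a small-scale dataset; a failing
      # query result would fail the job and flag a correctness issue.
      - name: Run TPC-DS query suite
        run: |
          ./build/mvn -pl sql/core -Dtest=none \
            -DwildcardSuites=org.apache.spark.sql.TPCDSQuerySuite test
```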

Cheers,

Xiao



Ismaël Mejía <iemejia@gmail.com> wrote on Thu, Feb 25, 2021 at 12:16 PM:

> Since the TPC-DS performance tests are one of the main validation sources
> for regressions in Spark releases, maybe it is time to automate validation
> of the query outputs so we find correctness issues eagerly (it would also
> be nice to validate performance regressions, but correctness >>>
> performance).
>
> This has been a long-standing open issue [1] that is probably worth
> addressing, and it seems that automating this via GitHub Actions could be
> relatively straightforward.
>
> [1] https://github.com/databricks/spark-sql-perf/issues/184
>
>
> On Wed, Feb 24, 2021 at 8:15 PM Reynold Xin <rxin@databricks.com> wrote:
>
>> +1 Correctness issues are serious!
>>
>>
>> On Wed, Feb 24, 2021 at 11:08 AM, Mridul Muralidharan <mridul@gmail.com>
>> wrote:
>>
>>> That is indeed cause for concern.
>>> +1 on extending the voting deadline until we finish investigation of
>>> this.
>>>
>>> Regards,
>>> Mridul
>>>
>>>
>>> On Wed, Feb 24, 2021 at 12:55 PM Xiao Li <gatorsmile@gmail.com> wrote:
>>>
>>>> -1 Could we extend the voting deadline?
>>>>
>>>> A few TPC-DS queries (q17, q18, q39a, q39b) are returning different
>>>> results between Spark 3.0 and Spark 3.1. We need a few more days to
>>>> understand whether these changes are expected.
>>>>
>>>> Xiao
>>>>
>>>>
>>>> Mridul Muralidharan <mridul@gmail.com> wrote on Wed, Feb 24, 2021 at 10:41 AM:
>>>>
>>>>>
>>>>> Sounds good, thanks for clarifying Hyukjin !
>>>>> +1 on release.
>>>>>
>>>>> Regards,
>>>>> Mridul
>>>>>
>>>>>
>>>>> On Wed, Feb 24, 2021 at 2:46 AM Hyukjin Kwon <gurwls223@gmail.com>
>>>>> wrote:
>>>>>
>>>>>> I remember HiveExternalCatalogVersionsSuite was flaky for a while
>>>>>> which is fixed in
>>>>>> https://github.com/apache/spark/commit/0d5d248bdc4cdc71627162a3d20c42ad19f24ef4
>>>>>> and .. KafkaDelegationTokenSuite is flaky (
>>>>>> https://issues.apache.org/jira/browse/SPARK-31250).
>>>>>>
>>>>>> On Wed, Feb 24, 2021 at 5:19 PM, Mridul Muralidharan <mridul@gmail.com>
>>>>>> wrote:
>>>>>>
>>>>>>>
>>>>>>> Signatures, digests, etc check out fine.
>>>>>>> Checked out tag and build/tested with -Pyarn -Phadoop-2.7 -Phive
>>>>>>> -Phive-thriftserver -Pmesos -Pkubernetes
>>>>>>>
>>>>>>> I keep getting test failures with
>>>>>>> * org.apache.spark.sql.hive.HiveExternalCatalogVersionsSuite
>>>>>>> * org.apache.spark.sql.kafka010.KafkaDelegationTokenSuite
>>>>>>> (Note: I remove the $HOME/.m2 and $HOME/.ivy2 paths before building)
>>>>>>>
>>>>>>> Removing these suites gets the build through, though - does anyone
>>>>>>> have suggestions on how to fix this? I did not face this with RC1.
>>>>>>>
>>>>>>> Regards,
>>>>>>> Mridul
>>>>>>>
>>>>>>>
>>>>>>> On Mon, Feb 22, 2021 at 12:57 AM Hyukjin Kwon <gurwls223@gmail.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Please vote on releasing the following candidate as Apache Spark
>>>>>>>> version 3.1.1.
>>>>>>>>
>>>>>>>> The vote is open until February 24th 11PM PST and passes if a
>>>>>>>> majority of +1 PMC votes are cast, with a minimum of 3 +1 votes.
>>>>>>>>
>>>>>>>> [ ] +1 Release this package as Apache Spark 3.1.1
>>>>>>>> [ ] -1 Do not release this package because ...
>>>>>>>>
>>>>>>>> To learn more about Apache Spark, please see
>>>>>>>> http://spark.apache.org/
>>>>>>>>
>>>>>>>> The tag to be voted on is v3.1.1-rc3 (commit
>>>>>>>> 1d550c4e90275ab418b9161925049239227f3dc9):
>>>>>>>> https://github.com/apache/spark/tree/v3.1.1-rc3
>>>>>>>>
>>>>>>>> The release files, including signatures, digests, etc. can be
>>>>>>>> found at:
>>>>>>>> https://dist.apache.org/repos/dist/dev/spark/v3.1.1-rc3-bin/
>>>>>>>>
>>>>>>>> Signatures used for Spark RCs can be found in this file:
>>>>>>>> https://dist.apache.org/repos/dist/dev/spark/KEYS
>>>>>>>>
>>>>>>>> The staging repository for this release can be found at:
>>>>>>>>
>>>>>>>> https://repository.apache.org/content/repositories/orgapachespark-1367
>>>>>>>>
>>>>>>>> The documentation corresponding to this release can be found at:
>>>>>>>> https://dist.apache.org/repos/dist/dev/spark/v3.1.1-rc3-docs/
>>>>>>>>
>>>>>>>> The list of bug fixes going into 3.1.1 can be found at the
>>>>>>>> following URL:
>>>>>>>> https://s.apache.org/41kf2
>>>>>>>>
>>>>>>>> This release is using the release script of the tag v3.1.1-rc3.
>>>>>>>>
>>>>>>>> FAQ
>>>>>>>>
>>>>>>>> ===================
>>>>>>>> What happened to 3.1.0?
>>>>>>>> ===================
>>>>>>>>
>>>>>>>> There was a technical issue during Apache Spark 3.1.0 preparation,
>>>>>>>> and it was discussed and decided to skip 3.1.0.
>>>>>>>> Please see
>>>>>>>> https://spark.apache.org/news/next-official-release-spark-3.1.1.html
>>>>>>>> for more details.
>>>>>>>>
>>>>>>>> =========================
>>>>>>>> How can I help test this release?
>>>>>>>> =========================
>>>>>>>>
>>>>>>>> If you are a Spark user, you can help us test this release by
>>>>>>>> taking an existing Spark workload, running it on this release
>>>>>>>> candidate, and then reporting any regressions.
>>>>>>>>
>>>>>>>> If you're working in PySpark you can set up a virtual env and
>>>>>>>> install the current RC via "pip install
>>>>>>>> https://dist.apache.org/repos/dist/dev/spark/v3.1.1-rc3-bin/pyspark-3.1.1.tar.gz"
>>>>>>>> and see if anything important breaks.
>>>>>>>> In Java/Scala, you can add the staging repository to your
>>>>>>>> project's resolvers and test with the RC (make sure to clean up
>>>>>>>> the artifact cache before/after so you don't end up building with
>>>>>>>> an out-of-date RC going forward).
>>>>>>>>
>>>>>>>> ===========================================
>>>>>>>> What should happen to JIRA tickets still targeting 3.1.1?
>>>>>>>> ===========================================
>>>>>>>>
>>>>>>>> The current list of open tickets targeted at 3.1.1 can be found
>>>>>>>> at: https://issues.apache.org/jira/projects/SPARK and search for
>>>>>>>> "Target Version/s" = 3.1.1
>>>>>>>>
>>>>>>>> Committers should look at those and triage. Extremely important
>>>>>>>> bug fixes, documentation, and API tweaks that impact compatibility
>>>>>>>> should be worked on immediately. Everything else, please retarget
>>>>>>>> to an appropriate release.
>>>>>>>>
>>>>>>>> ==================
>>>>>>>> But my bug isn't fixed?
>>>>>>>> ==================
>>>>>>>>
>>>>>>>> In order to make timely releases, we will typically not hold the
>>>>>>>> release unless the bug in question is a regression from the
>>>>>>>> previous release. That being said, if there is something which is
>>>>>>>> a regression that has not been correctly targeted, please ping me
>>>>>>>> or a committer to help target the issue.
>>>>>>>>
>>>>>>>
>>
