spark-dev mailing list archives

From Hyukjin Kwon <gurwls...@gmail.com>
Subject Re: [VOTE] Release Spark 3.1.1 (RC3)
Date Fri, 26 Feb 2021 07:31:32 GMT
Thanks, Xiao. I will close this vote within a couple of hours.

On Fri, Feb 26, 2021 at 4:30 PM, Xiao Li <gatorsmile@gmail.com> wrote:

> I confirmed that Q17 and Q39a/b have matching results between Spark 3.0
> and 3.1 after enabling spark.sql.legacy.statisticalAggregate, so those
> result changes are expected. For more details, see the PR:
> https://github.com/apache/spark/pull/29983/
> The result of Q18 is affected by the overflow checking in Spark. These
> issues exist in all releases. We will continue to improve our ANSI mode
> and fix them in upcoming releases.
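
A minimal sketch of the kind of result change discussed above, assuming the behavior described in the Spark 3.1 migration notes: a sample statistic over a single row divides by zero, which yields NaN under the legacy setting and NULL otherwise. This is plain Python for illustration, not Spark's implementation; the function name `stddev_samp` and the `legacy` flag are stand-ins for the SQL function and for spark.sql.legacy.statisticalAggregate, and `None` stands in for SQL NULL.

```python
import math
from typing import Optional, Sequence


def stddev_samp(values: Sequence[float], legacy: bool = False) -> Optional[float]:
    """Sample standard deviation with a configurable divide-by-zero result.

    `legacy` is an illustrative stand-in for
    spark.sql.legacy.statisticalAggregate; None stands in for SQL NULL.
    """
    n = len(values)
    if n == 0:
        # No input rows: treated as NULL in both modes (assumption of this sketch).
        return None
    if n == 1:
        # The divisor n - 1 is zero: NaN in legacy mode, NULL otherwise.
        return math.nan if legacy else None
    mean = sum(values) / n
    squared_deviations = sum((v - mean) ** 2 for v in values)
    return math.sqrt(squared_deviations / (n - 1))
```

A query that groups down to single-row groups would therefore show NaN on one version and NULL on the other unless the legacy flag is set, which is consistent with the matching results reported above.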
>
> Thus, I change my vote from -1 to +1.
>
> As Ismael suggested, we can add some GitHub Actions to validate the TPC-DS
> and TPC-H results for small-scale datasets.
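
One way such a result check could look, sketched in plain Python under stated assumptions: query outputs are already loaded as ordered lists of rows, a stored "golden" result exists for each query, and floating-point columns are compared with a relative tolerance. The names `cells_match` and `diff_results` are hypothetical and do not come from Spark or spark-sql-perf.

```python
import math
from typing import List, Optional, Sequence, Tuple, Union

Cell = Union[None, int, float, str]
Row = Sequence[Cell]


def cells_match(a: Cell, b: Cell, rel_tol: float = 1e-9) -> bool:
    """Compare two cells. NaN matches NaN, but None (NULL) never matches NaN."""
    if a is None or b is None:
        return a is None and b is None
    if isinstance(a, float) or isinstance(b, float):
        if isinstance(a, str) or isinstance(b, str):
            return False
        fa, fb = float(a), float(b)
        if math.isnan(fa) or math.isnan(fb):
            return math.isnan(fa) and math.isnan(fb)
        return math.isclose(fa, fb, rel_tol=rel_tol)
    return a == b


def diff_results(
    expected: Sequence[Row], actual: Sequence[Row]
) -> List[Tuple[int, Optional[Row], Optional[Row]]]:
    """Return (row_index, expected_row, actual_row) for every mismatching row.

    Rows are compared positionally, so this assumes a deterministic ordering.
    """
    mismatches = []
    for i in range(max(len(expected), len(actual))):
        e = expected[i] if i < len(expected) else None
        a = actual[i] if i < len(actual) else None
        same = (
            e is not None
            and a is not None
            and len(e) == len(a)
            and all(cells_match(x, y) for x, y in zip(e, a))
        )
        if not same:
            mismatches.append((i, e, a))
    return mismatches
```

Treating NULL and NaN as different values is a deliberate choice here: it is the distinction that would surface a NaN-versus-NULL change in a statistical aggregate instead of hiding it.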
>
> Cheers,
>
> Xiao
>
>
>
> On Thu, Feb 25, 2021 at 12:16 PM Ismaël Mejía <iemejia@gmail.com> wrote:
>
>> Since the TPC-DS performance tests are one of the main validation sources
>> for regressions in Spark releases, maybe it is time to automate validation
>> of the query outputs so that correctness issues are found early. (It would
>> also be nice to validate performance regressions, but correctness matters
>> far more than performance.)
>>
>> This has been a long-standing open issue [1] that is probably worth
>> addressing, and automating this via GitHub Actions seems relatively
>> straightforward.
>>
>> [1] https://github.com/databricks/spark-sql-perf/issues/184
>>
>>
>> On Wed, Feb 24, 2021 at 8:15 PM Reynold Xin <rxin@databricks.com> wrote:
>>
>>> +1 Correctness issues are serious!
>>>
>>>
>>> On Wed, Feb 24, 2021 at 11:08 AM, Mridul Muralidharan <mridul@gmail.com>
>>> wrote:
>>>
>>>> That is indeed cause for concern.
>>>> +1 on extending the voting deadline until we finish investigation of
>>>> this.
>>>>
>>>> Regards,
>>>> Mridul
>>>>
>>>>
>>>> On Wed, Feb 24, 2021 at 12:55 PM Xiao Li <gatorsmile@gmail.com> wrote:
>>>>
>>>>> -1 Could we extend the voting deadline?
>>>>>
>>>>> A few TPC-DS queries (q17, q18, q39a, q39b) are returning different
>>>>> results between Spark 3.0 and Spark 3.1. We need a few more days to
>>>>> understand whether these changes are expected.
>>>>>
>>>>> Xiao
>>>>>
>>>>>
>>>>> On Wed, Feb 24, 2021 at 10:41 AM Mridul Muralidharan <mridul@gmail.com> wrote:
>>>>>
>>>>>>
>>>>>> Sounds good, thanks for clarifying Hyukjin !
>>>>>> +1 on release.
>>>>>>
>>>>>> Regards,
>>>>>> Mridul
>>>>>>
>>>>>>
>>>>>> On Wed, Feb 24, 2021 at 2:46 AM Hyukjin Kwon <gurwls223@gmail.com>
>>>>>> wrote:
>>>>>>
>>>>>>> I remember HiveExternalCatalogVersionsSuite was flaky for a while;
>>>>>>> that was fixed in
>>>>>>> https://github.com/apache/spark/commit/0d5d248bdc4cdc71627162a3d20c42ad19f24ef4
>>>>>>> KafkaDelegationTokenSuite is also flaky (
>>>>>>> https://issues.apache.org/jira/browse/SPARK-31250).
>>>>>>>
>>>>>>> On Wed, Feb 24, 2021 at 5:19 PM, Mridul Muralidharan <mridul@gmail.com> wrote:
>>>>>>>
>>>>>>>>
>>>>>>>> Signatures, digests, etc. check out fine.
>>>>>>>> Checked out the tag and built/tested with -Pyarn -Phadoop-2.7 -Phive
>>>>>>>> -Phive-thriftserver -Pmesos -Pkubernetes
>>>>>>>>
>>>>>>>> I keep getting test failures with
>>>>>>>> * org.apache.spark.sql.hive.HiveExternalCatalogVersionsSuite
>>>>>>>> * org.apache.spark.sql.kafka010.KafkaDelegationTokenSuite
>>>>>>>> (Note: I remove the $HOME/.m2 and $HOME/.ivy2 paths before the build.)
>>>>>>>>
>>>>>>>> Removing these suites gets the build through, though. Does anyone
>>>>>>>> have suggestions on how to fix it? I did not face this with RC1.
>>>>>>>>
>>>>>>>> Regards,
>>>>>>>> Mridul
>>>>>>>>
>>>>>>>>
>>>>>>>> On Mon, Feb 22, 2021 at 12:57 AM Hyukjin Kwon <gurwls223@gmail.com>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> Please vote on releasing the following candidate as Apache Spark
>>>>>>>>> version 3.1.1.
>>>>>>>>>
>>>>>>>>> The vote is open until February 24th 11PM PST and passes if a
>>>>>>>>> majority +1 PMC votes are cast, with a minimum of 3 +1 votes.
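
The passing rule can be read as a small predicate. The sketch below is one interpretation, assuming that only PMC votes are binding, that at least three binding +1 votes are needed, and that binding +1 votes must outnumber binding -1 votes. The function name and the `(value, is_pmc)` input shape are hypothetical.

```python
from typing import Iterable, Tuple


def vote_passes(votes: Iterable[Tuple[int, bool]], min_plus_one: int = 3) -> bool:
    """Decide a release vote from (value, is_pmc) pairs.

    value is +1, 0, or -1; is_pmc marks a binding vote.
    Non-binding votes are ignored by this sketch.
    """
    binding = [value for value, is_pmc in votes if is_pmc]
    plus = sum(1 for value in binding if value == 1)
    minus = sum(1 for value in binding if value == -1)
    return plus >= min_plus_one and plus > minus
```

For example, three binding +1 votes pass regardless of how many non-binding -1 votes are cast, while two binding +1 votes do not pass even with many non-binding +1 votes.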
>>>>>>>>>
>>>>>>>>> [ ] +1 Release this package as Apache Spark 3.1.1
>>>>>>>>> [ ] -1 Do not release this package because ...
>>>>>>>>>
>>>>>>>>> To learn more about Apache Spark, please see
>>>>>>>>> http://spark.apache.org/
>>>>>>>>>
>>>>>>>>> The tag to be voted on is v3.1.1-rc3 (commit
>>>>>>>>> 1d550c4e90275ab418b9161925049239227f3dc9):
>>>>>>>>> https://github.com/apache/spark/tree/v3.1.1-rc3
>>>>>>>>>
>>>>>>>>> The release files, including signatures, digests, etc. can be
>>>>>>>>> found at:
>>>>>>>>> https://dist.apache.org/repos/dist/dev/spark/v3.1.1-rc3-bin/
>>>>>>>>>
>>>>>>>>> Signatures used for Spark RCs can be found in this file:
>>>>>>>>> https://dist.apache.org/repos/dist/dev/spark/KEYS
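
A minimal sketch of the digest half of that check, assuming the published digest is SHA-512 and that its hex string has already been extracted from the digest file by hand. It does not verify the GPG signature against the KEYS file, which requires a separate tool such as gpg. The function names are hypothetical.

```python
import hashlib


def sha512_hex(path: str, chunk_size: int = 1 << 20) -> str:
    """Compute the SHA-512 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha512()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def digest_matches(path: str, expected_hex: str) -> bool:
    """Compare a file's SHA-512 digest with an expected hex string.

    The expected value is normalized first, since published digests may be
    upper-case or broken up by whitespace.
    """
    normalized = "".join(expected_hex.split()).lower()
    return sha512_hex(path) == normalized
```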
>>>>>>>>>
>>>>>>>>> The staging repository for this release can be found at:
>>>>>>>>> https://repository.apache.org/content/repositories/orgapachespark-1367
>>>>>>>>>
>>>>>>>>> The documentation corresponding to this release can be found at:
>>>>>>>>> https://dist.apache.org/repos/dist/dev/spark/v3.1.1-rc3-docs/
>>>>>>>>>
>>>>>>>>> The list of bug fixes going into 3.1.1 can be found at the
>>>>>>>>> following URL:
>>>>>>>>> https://s.apache.org/41kf2
>>>>>>>>>
>>>>>>>>> This release is using the release script of the tag v3.1.1-rc3.
>>>>>>>>>
>>>>>>>>> FAQ
>>>>>>>>>
>>>>>>>>> ===================
>>>>>>>>> What happened to 3.1.0?
>>>>>>>>> ===================
>>>>>>>>>
>>>>>>>>> There was a technical issue during Apache Spark 3.1.0 preparation,
>>>>>>>>> and it was discussed and decided to skip 3.1.0.
>>>>>>>>> Please see
>>>>>>>>> https://spark.apache.org/news/next-official-release-spark-3.1.1.html
>>>>>>>>> for more details.
>>>>>>>>>
>>>>>>>>> =========================
>>>>>>>>> How can I help test this release?
>>>>>>>>> =========================
>>>>>>>>>
>>>>>>>>> If you are a Spark user, you can help us test this release by taking
>>>>>>>>> an existing Spark workload and running it on this release candidate,
>>>>>>>>> then reporting any regressions.
>>>>>>>>>
>>>>>>>>> If you're working in PySpark, you can set up a virtual env and install
>>>>>>>>> the current RC via "pip install
>>>>>>>>> https://dist.apache.org/repos/dist/dev/spark/v3.1.1-rc3-bin/pyspark-3.1.1.tar.gz"
>>>>>>>>> and see if anything important breaks.
>>>>>>>>> In Java/Scala, you can add the staging repository to your project's
>>>>>>>>> resolvers and test with the RC (make sure to clean up the artifact
>>>>>>>>> cache before/after so you don't end up building with an out-of-date
>>>>>>>>> RC going forward).
>>>>>>>>>
>>>>>>>>> ===========================================
>>>>>>>>> What should happen to JIRA tickets still targeting 3.1.1?
>>>>>>>>> ===========================================
>>>>>>>>>
>>>>>>>>> The current list of open tickets targeted at 3.1.1 can be found by
>>>>>>>>> going to https://issues.apache.org/jira/projects/SPARK and searching
>>>>>>>>> for "Target Version/s" = 3.1.1
>>>>>>>>>
>>>>>>>>> Committers should look at those and triage. Extremely important bug
>>>>>>>>> fixes, documentation, and API tweaks that impact compatibility should
>>>>>>>>> be worked on immediately. Everything else please retarget to an
>>>>>>>>> appropriate release.
>>>>>>>>>
>>>>>>>>> ==================
>>>>>>>>> But my bug isn't fixed?
>>>>>>>>> ==================
>>>>>>>>>
>>>>>>>>> In order to make timely releases, we will typically not hold the
>>>>>>>>> release unless the bug in question is a regression from the previous
>>>>>>>>> release. That being said, if there is something which is a regression
>>>>>>>>> that has not been correctly targeted, please ping me or a committer
>>>>>>>>> to help target the issue.
>>>>>>>>>
>>>>>>>>
>>>
