We have successfully completed the Apache Spark 3.1.1 and 3.0.2 releases and have already started the 3.2.0 discussion.
Let's talk about `branch-2.4`, because there have been some discussions on JIRA and GitHub about skipping backports to 2.4.
Since `branch-2.4` has been well maintained as an LTS branch, I'd like to suggest making Apache Spark 2.4.8 the official EOL release of the 2.4 line so that we can focus more on 3.x from now on. Please note that `branch-2.4` will be officially frozen, like `branch-2.3`, after the EOL release.
- Apache Spark 2.4.0 was released on November 2, 2018.
- Apache Spark 2.4.7 was released on September 12, 2020.
- Since the v2.4.7 tag, `branch-2.4` has gained 134 commits, including the following 12 correctness issues.
## CORRECTNESS ISSUES
SPARK-30201 HiveOutputWriter standardOI should use ObjectInspectorCopyOption.DEFAULT
SPARK-30228 Update zstd-jni to 1.4.4-3
SPARK-30894 The nullability of Size function should not depend on SQLConf.get
SPARK-32635 When pyspark.sql.functions.lit() function is used with dataframe cache, it returns wrong result
SPARK-32908 percentile_approx() returns incorrect results
SPARK-33183 Bug in optimizer rule EliminateSorts
SPARK-33290 REFRESH TABLE should invalidate cache even though the table itself may not be cached
SPARK-33593 Vector reader got incorrect data with binary partition value
SPARK-33726 Duplicate field names causes wrong answers during aggregation
SPARK-34187 Use available offset range obtained during polling when checking offset validation
SPARK-34212 For parquet table, after changing the precision and scale of decimal type in hive, spark reads incorrect value
SPARK-34229 Avro should read decimal values with the file schema
## SECURITY ISSUES
SPARK-33333 Upgrade Jetty to 9.4.28.v20200408
SPARK-33831 Update to jetty 9.4.34
SPARK-34449 Upgrade Jetty to fix CVE-2020-27218
What do you think about this?