spark-user mailing list archives

From Gourav Sengupta <gourav.sengu...@gmail.com>
Subject Re: [EXTERNAL] [Marketing Mail] Reading SPARK 3.1.x generated parquet in SPARK 2.4.x
Date Thu, 12 Aug 2021 15:49:27 GMT
Hi Saurabh,

a very big note of thanks from Gourav :)

Regards,
Gourav Sengupta

On Thu, Aug 12, 2021 at 4:16 PM Saurabh Gulati
<saurabh.gulati@fedex.com.invalid> wrote:

> We had issues with this migration, mainly because Spark 3 changed its date
> calendar (it now uses the proleptic Gregorian calendar). See
> <https://www.waitingforcode.com/apache-spark-sql/whats-new-apache-spark-3-proleptic-calendar-date-time-management/read>
> We got this working by setting the following parameters:
>
> ("spark.sql.legacy.parquet.datetimeRebaseModeInRead", "LEGACY"),
> ("spark.sql.legacy.parquet.datetimeRebaseModeInWrite", "CORRECTED"),
> ("spark.sql.legacy.parquet.int96RebaseModeInRead", "LEGACY"),
> ("spark.sql.legacy.parquet.int96RebaseModeInWrite", "CORRECTED")
>
>
>
> But otherwise, it's a change for the better. Performance seems better.
> Also, there were bugs in 3.0.1 which have been addressed in 3.1.1.
> ------------------------------
> *From:* Gourav Sengupta <gourav.sengupta.developer@gmail.com>
> *Sent:* 05 August 2021 10:17
> *To:* user @spark <user@spark.apache.org>
> *Subject:* [EXTERNAL] [Marketing Mail] Reading SPARK 3.1.x generated
> parquet in SPARK 2.4.x
>
> Hi,
>
> we are trying to migrate some of the data lake pipelines to run in SPARK
> 3.x, whereas the dependent pipelines using those tables will still be
> running in SPARK 2.4.x for some time to come.
>
> Does anyone know of any issues that can happen:
> 1. when reading Parquet files written in 3.1.x in SPARK 2.4
> 2. when in the data lake some partitions have parquet files written in
> SPARK 2.4.x and some are in SPARK 3.1.x.
>
> Please note that there are no changes in schema, but later on we might end
> up adding or removing some columns.
>
> I will be really grateful for your kind help on this.
>
> Regards,
> Gourav Sengupta
>
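
For readers who want to try the settings from Saurabh's reply, here is a minimal PySpark sketch. It assumes the Spark 3.1.x key names exactly as quoted in the reply; the names `REBASE_CONF`, `apply_rebase_conf`, and `build_session` are hypothetical and introduced only for illustration. Whether these values suit a given pipeline depends on the data, so treat this as a starting point rather than a recommended configuration.

```python
# Rebase settings copied verbatim from the reply above (Spark 3.1.x key names).
REBASE_CONF = {
    "spark.sql.legacy.parquet.datetimeRebaseModeInRead": "LEGACY",
    "spark.sql.legacy.parquet.datetimeRebaseModeInWrite": "CORRECTED",
    "spark.sql.legacy.parquet.int96RebaseModeInRead": "LEGACY",
    "spark.sql.legacy.parquet.int96RebaseModeInWrite": "CORRECTED",
}


def apply_rebase_conf(builder, conf=REBASE_CONF):
    """Apply each setting to a SparkSession.Builder-like object.

    Only relies on `builder.config(key, value)` returning a builder,
    which is how SparkSession.builder behaves.
    """
    for key, value in conf.items():
        builder = builder.config(key, value)
    return builder


def build_session(app_name="parquet-rebase-example"):
    """Create a SparkSession with the rebase settings applied.

    pyspark is imported lazily so the helper above can be used
    (and tested) without a Spark installation. The app name is a
    placeholder, not something taken from the thread.
    """
    from pyspark.sql import SparkSession

    builder = SparkSession.builder.appName(app_name)
    return apply_rebase_conf(builder).getOrCreate()
```

Calling `build_session()` in the Spark 3.1.x job would then return a session that reads and writes Parquet with the modes listed in the reply.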
