spark-dev mailing list archives

From Taoufik Dachraoui <dachraoui.taou...@gmail.com>
Subject Re: Requesting a Plan for Avro-typed Datasets
Date Thu, 16 May 2019 20:12:02 GMT
Hi

Please also consider the two other alternative PRs for statically typed
Datasets of Avro objects:

https://github.com/apache/spark/pull/24299
and https://github.com/apache/spark/pull/24367

Kind regards,

-Taoufik


On Thu, May 16, 2019 at 9:59 PM Aleksander Eskilson <aleksanderesk@gmail.com>
wrote:

> Hi all,
>
> There's been longstanding demand for statically typed Datasets of Avro.
> Functionality from the now-deprecated Databricks Spark-Avro project was
> folded into Spark, but it can still only provide DataFrames over Avro data.
> As is discussed in the PR below, there are still drawbacks to not having
> fully statically typed Datasets of Avro.
>
> There's an open PR adding a first-class Encoder for statically typed
> Datasets of Avro:
>
> https://github.com/apache/spark/pull/22878
> JIRA: https://issues.apache.org/jira/browse/SPARK-25789
> (originally in Databricks/spark-avro:
> https://github.com/databricks/spark-avro/pull/217,
> https://github.com/databricks/spark-avro/issues/169)
>
> We've tested the content of this PR extensively on complex, deeply nested
> Avro structures. It seems ready for a last review and nearly ready to
> merge.
>
> Alek Eskilson
> github : bdrillard
>


-- 
Taoufik Dachraoui
