spark-user mailing list archives

From Boris Litvak <boris.lit...@skf.com>
Subject Data Lakes using Spark
Date Wed, 07 Apr 2021 06:32:58 GMT
Hi Friends,

I’d like to publish a document to Medium about data lakes using Spark.
Its later sections cover material that is not widely known unless you have hands-on experience
with data lakes.

https://github.com/borislitvak/datalake-article/blob/initial_comments/Building%20a%20Real%20Life%20Data%20Lake%20in%C2%A0AWS.md
I hope it’s OK to ask you to review the draft.

You can respond here or contact me directly.
If there are topics I should add (for example, the effect of compaction on downstream reads using
Structured Streaming), or if there are errors, please point them out before it is published.
Also, if any points are unclear or misleading, please say so.

Thanks,

Boris Litvak