spark-user mailing list archives

From Nicolas Paris <nicolas.pa...@riseup.net>
Subject Re: Announcing Delta Lake 0.3.0
Date Tue, 06 Aug 2019 21:37:39 GMT
>   • Scala/Java APIs for DML commands - You can now modify data in Delta Lake
>     tables using programmatic APIs for Delete, Update and Merge. These APIs
>     mirror the syntax and semantics of their corresponding SQL commands and are
>     great for many workloads, e.g., Slowly Changing Dimension (SCD) operations,
>     merging change data for replication, and upserts from streaming queries.
>     See the documentation for more details.

just tested the merge feature on a large table: awesome
- fast to build
- fast to query afterward
- robust (version history is an incredible feature)
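
For anyone who has not tried these features yet, here is a rough picture of the behaviour behind merge, version history and vacuum. It is a toy in-memory model in plain Java and NOT the Delta Lake API: `ToyVersionedTable` and all of its methods are hypothetical names made up for illustration. The real entry point is `io.delta.tables.DeltaTable`, described in the documentation linked from the announcement. The toy also assumes a count-based vacuum ("keep the newest N versions"), which is a simplification chosen only to keep the sketch short.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical toy model of a versioned table. This is NOT Delta Lake code.
final class ToyVersionedTable {
    // version number -> read-only snapshot (key -> value) written by that commit
    private final TreeMap<Integer, Map<String, String>> snapshots = new TreeMap<>();
    // one entry per commit: the name of the operation that produced that version
    private final List<String> history = new ArrayList<>();

    ToyVersionedTable() {
        commit("CREATE", new LinkedHashMap<>()); // version 0 is an empty table
    }

    // Every change is recorded as a new version; older snapshots are left untouched.
    private void commit(String operation, Map<String, String> rows) {
        snapshots.put(history.size(), Collections.unmodifiableMap(rows));
        history.add(operation);
    }

    int latestVersion() {
        return history.size() - 1;
    }

    // Upsert: keys already present are updated, new keys are inserted.
    void merge(Map<String, String> updates) {
        Map<String, String> next = new LinkedHashMap<>(read(latestVersion()));
        next.putAll(updates);
        commit("MERGE", next);
    }

    // Removes one key (if present) and records the change as a new version.
    void delete(String key) {
        Map<String, String> next = new LinkedHashMap<>(read(latestVersion()));
        next.remove(key);
        commit("DELETE", next);
    }

    // "Time travel": read the table as it was at the given version.
    Map<String, String> read(int version) {
        Map<String, String> snapshot = snapshots.get(version);
        if (snapshot == null) {
            throw new IllegalStateException(
                "version " + version + " is not available (never written, or vacuumed)");
        }
        return snapshot;
    }

    // The list of operations, oldest first; index i produced version i.
    List<String> history() {
        return Collections.unmodifiableList(history);
    }

    // Drops every snapshot except the newest keepLatest ones.
    // In this toy the operation log itself is kept.
    void vacuum(int keepLatest) {
        if (keepLatest < 1) {
            throw new IllegalArgumentException("keepLatest must be at least 1");
        }
        snapshots.headMap(latestVersion() - keepLatest + 1).clear();
    }
}
```

In this model, merging {a=1} and then {a=2, b=1} leaves three versions; read(1) still returns the old value of "a" until vacuum(1) drops the stale snapshots, after which only the latest version can be read.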


thanks


On Thu, Aug 01, 2019 at 06:44:30PM -0700, Tathagata Das wrote:
> Hello everyone, 
> 
> We are excited to announce the availability of Delta Lake 0.3.0 which
> introduces new programmatic APIs for manipulating and managing data in Delta
> Lake tables.
> 
> 
> Here are the main features: 
> 
> 
>   • Scala/Java APIs for DML commands - You can now modify data in Delta Lake
>     tables using programmatic APIs for Delete, Update and Merge. These APIs
>     mirror the syntax and semantics of their corresponding SQL commands and are
>     great for many workloads, e.g., Slowly Changing Dimension (SCD) operations,
>     merging change data for replication, and upserts from streaming queries.
>     See the documentation for more details.
> 
> 
>   • Scala/Java APIs for query commit history - You can now query a table’s
>     commit history to see what operations modified the table. This enables you
>     to audit data changes, run time travel queries on specific versions, debug and
>     recover data from accidental deletions, etc. See the documentation for more
>     details.
> 
> 
>   • Scala/Java APIs for vacuuming old files - Delta Lake uses MVCC to enable
>     snapshot isolation and time travel. However, keeping all versions of a
>     table forever can be prohibitively expensive. Stale snapshots (as well as
>     other uncommitted files from aborted transactions) can be garbage collected
>     by vacuuming the table. See the documentation for more details.
> 
> 
> To try out Delta Lake 0.3.0, please follow the Delta Lake Quickstart:
> https://docs.delta.io/0.3.0/quick-start.html
> 
> To view the release notes:
> https://github.com/delta-io/delta/releases/tag/v0.3.0
> 
> We would like to thank all the community members for contributing to this
> release.
> 
> TD

-- 
nicolas
