spark-dev mailing list archives

From Reynold Xin <r...@databricks.com>
Subject Re: DataSourceWriter V2 Api questions
Date Mon, 10 Sep 2018 16:05:28 GMT
Typically people do it via transactions or staging tables.
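
The staging-table approach can be sketched as follows. This is an illustrative, self-contained simulation of the pattern, not Spark or MongoDB code: each executor-side task writes its rows to a task-private staging table (the `DataWriter.commit()` step), and only the driver-side commit moves the staged data into the visible table (the `DataSourceWriter.commit()` step). All class and method names here (`StagingCommitSketch`, `writeTask`, `driverCommit`) are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch of the "staging table" pattern: data written by a
// task stays hidden in a staging table until the driver's final commit.
public class StagingCommitSketch {
    // Simulated database: the visible table plus per-task staging tables.
    static final Map<String, List<String>> tables = new ConcurrentHashMap<>();

    // Executor side, analogous to DataWriter.commit(): write rows to a
    // task-private staging table and return its name as the commit message.
    static String writeTask(int taskId, List<String> rows) {
        String staging = "staging_" + taskId;
        tables.put(staging, new ArrayList<>(rows));
        return staging; // plays the role of a WriterCommitMessage
    }

    // Driver side, analogous to DataSourceWriter.commit(messages): move
    // each staged batch into the visible table and drop the staging tables.
    static void driverCommit(String finalTable, List<String> messages) {
        List<String> dest = tables.computeIfAbsent(finalTable, k -> new ArrayList<>());
        for (String staging : messages) {
            dest.addAll(tables.remove(staging));
        }
    }

    public static void main(String[] args) {
        String m1 = writeTask(1, List.of("a", "b"));
        String m2 = writeTask(2, List.of("c"));
        // Tasks have committed, but nothing is visible in "events" yet.
        System.out.println(tables.containsKey("events")); // prints "false"
        driverCommit("events", List.of(m1, m2));
        System.out.println(tables.get("events")); // prints "[a, b, c]"
    }
}
```

In a real connector the "move" would be a server-side rename or copy (or, with transactions, a single transactional insert committed at driver time); the key point is that task output is only discoverable after the driver's global commit, which is what the Javadoc contract asks for.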


On Mon, Sep 10, 2018 at 2:07 AM Ross Lawley <ross.lawley@gmail.com> wrote:

> Hi all,
>
> I've been prototyping an implementation of the DataSource V2 writer for
> the MongoDB Spark Connector, and I have a couple of questions about how it's
> intended to be used with database systems. According to the Javadoc for
> DataWriter.commit():
>
>
> *"this method should still "hide" the written data and ask the
> DataSourceWriter at driver side to do the final commit via
> WriterCommitMessage"*
>
> Although MongoDB now has transactions, it doesn't have a way to "hide"
> the data once it has been written. So as soon as the DataWriter has
> committed the data, it has been inserted/updated in the collection and is
> discoverable, thereby breaking the documented contract.
>
> I was wondering how other database systems plan to implement this API and
> meet the contract as described in the Javadoc.
>
> Many thanks
>
> Ross
>
