spark-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Wenchen Fan <>
Subject Re: [DISCUSS] Add close() on DataWriter interface
Date Wed, 11 Dec 2019 07:41:37 GMT
PartitionReader extends Closable, seems reasonable to me to do the same
for DataWriter.

On Wed, Dec 11, 2019 at 1:35 PM Jungtaek Lim <>

> Hi devs,
> I'd like to propose to add close() on DataWriter explicitly, which is the
> place for resource cleanup.
> The rationalization of the proposal is due to the lifecycle of DataWriter.
> If the scaladoc of DataWriter is correct, the lifecycle of DataWriter
> instance ends at either commit() or abort(). That makes datasource
> implementors to feel they can place resource cleanup in both sides, but
> abort() can be called when commit() fails; so they have to ensure they
> don't do double-cleanup if cleanup is not idempotent.
> I've checked some callers to see whether they can apply
> "try-catch-finally" to ensure close() is called at the end of lifecycle for
> DataWriter, and they look like so, but I might be missing something.
> What do you think? It would bring backward incompatible change, but given
> the interface is marked as Evolving and we're making backward incompatible
> changes in Spark 3.0, so I feel it may not matter.
> Would love to hear your thoughts.
> Thanks in advance,
> Jungtaek Lim (HeartSaVioR)

View raw message