spark-user mailing list archives

From Stephen Boesch <>
Subject Setup/Cleanup for RDD closures?
Date Fri, 03 Oct 2014 04:46:39 GMT
Suppose each row processed within a single worker thread needs to access or
mutate a connection or some other external resource. That connection should
be opened only once, before the first row is accessed, and closed only once,
after the last row is completed.
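
To make the desired lifecycle concrete, here is a minimal sketch in plain
Python of a function that receives all rows of one partition as an iterator.
FakeConnection and process_partition are hypothetical names used only for
illustration; they are not Spark APIs. A function of this shape could be
passed to rdd.mapPartitions in PySpark, on the assumption that per-partition
setup/cleanup is close enough to the per-thread behavior described above.

```python
class FakeConnection:
    """Hypothetical stand-in for an external resource (not a Spark API)."""

    def __init__(self):
        self.is_open = False
        self.events = []  # records the order of calls, for inspection

    def open(self):
        self.is_open = True
        self.events.append("open")

    def write(self, row):
        if not self.is_open:
            raise RuntimeError("connection is closed")
        self.events.append(("write", row))
        return row

    def close(self):
        self.is_open = False
        self.events.append("close")


def process_partition(rows, connect=FakeConnection):
    """Open one connection, pass every row through it, then close it."""
    conn = connect()
    conn.open()  # setup: runs once, before the first row
    try:
        for row in rows:
            yield conn.write(row)
    finally:
        conn.close()  # cleanup: runs once, after the last row or on error


# Assumed usage with PySpark (not executed here):
#   result = rdd.mapPartitions(process_partition)
```

Because process_partition is a generator, the connection is opened only when
the partition is first consumed, and the finally clause closes it once the
rows are exhausted or an error interrupts the loop.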

It is my understanding that work is presently underway (by Reynold Xin and
others) on defining an external resources API to address this. What is the
recommended approach in the meantime?
