spark-user mailing list archives

From Mayur Rustagi <mayur.rust...@gmail.com>
Subject Re: Setup/Cleanup for RDD closures?
Date Fri, 03 Oct 2014 08:58:06 GMT
The current approach is to use mapPartitions: initialize the connection at
the beginning, iterate through the data, and close the connection at the end.
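
A minimal sketch of that pattern, written as a function you could pass to
PySpark's rdd.mapPartitions. FakeConnection and process_partition are
hypothetical names used only for illustration; swap in your real client. The
generator matters here: mapPartitions consumes the result lazily, so the
connection has to stay open until the last row has been pulled through.

```python
class FakeConnection:
    """Hypothetical stand-in for a real external resource (DB client, socket, ...)."""

    opened = 0  # how many connections were created (for illustration only)
    closed = 0  # how many connections were closed (for illustration only)

    def __init__(self):
        FakeConnection.opened += 1
        self.is_open = True

    def send(self, row):
        # Pretend to use the external resource for a single row.
        if not self.is_open:
            raise RuntimeError("connection used after close")
        return row * 2

    def close(self):
        self.is_open = False
        FakeConnection.closed += 1


def process_partition(rows):
    """Open one connection per partition, use it for every row, then close it."""
    conn = FakeConnection()      # setup: runs once, before the first row
    try:
        for row in rows:         # rows is the iterator over this partition
            yield conn.send(row)
    finally:
        conn.close()             # cleanup: runs once, after the last row


# With Spark (assumption: an existing RDD named rdd) this would be used as:
#     result = rdd.mapPartitions(process_partition)
```

Outside Spark you can exercise it with list(process_partition(iter([1, 2, 3]))),
which opens and closes exactly one connection for the whole iterator.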


Mayur Rustagi
Ph: +1 (760) 203 3257
http://www.sigmoidanalytics.com
@mayur_rustagi <https://twitter.com/mayur_rustagi>


On Fri, Oct 3, 2014 at 10:16 AM, Stephen Boesch <javadba@gmail.com> wrote:

>
> Consider there is some connection / external resource allocation required
> to be accessed/mutated by each of the rows from within a single worker
> thread. That connection should only be opened/closed before the first row
> is accessed / after the last row is completed.
>
> It is my understanding that there is work presently underway (Reynold Xin
> and others) on defining an external resources API to address this. What is
> the recommended approach in the meanwhile?
>
