spark-user mailing list archives

From Huiliang Zhang
Subject Resource manage inside map function
Date Sat, 31 Mar 2018 00:54:15 GMT

I have a Spark job that needs to access HBase inside a mapToPair function. The
problem is that I do not want to open and close an HBase connection for
every record.

As I understand it, PairFunction is not designed to manage resources with
setup() and close() methods the way a Hadoop reader or writer does.

Does Spark support this kind of resource management? Your help is appreciated!

By the way, the reason I do not want to use a writer is that I want to return
some metric values after processing. The returned metric values will be
further processed. Basically, it is not desirable to use HDFS as transfer
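
For illustration, one common way to get setup/close behaviour is to work per
partition instead of per record. Spark's Java API has mapPartitionsToPair, which
passes the function an iterator over a whole partition, so a connection can be
opened once, reused for every record, and closed at the end. The sketch below is
plain Java with no Spark or HBase dependency, so it only shows the shape of the
function body. FakeConnection, lookup() and processPartition() are hypothetical
names, not HBase or Spark APIs, and Map.Entry stands in for Spark's Tuple2.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for an HBase connection (NOT the real HBase client API).
// The counters only exist so the sketch can show how often it is opened/closed.
class FakeConnection implements AutoCloseable {
    static int opened = 0;
    static int closed = 0;

    FakeConnection() { opened++; }

    // Placeholder for a real HBase read; here the "metric" is just the key length.
    int lookup(String key) { return key.length(); }

    @Override
    public void close() { closed++; }
}

class PartitionResourceSketch {
    // Shape of a body one might pass to mapPartitionsToPair (assumption: in Spark
    // the pairs would be Tuple2 instances rather than Map.Entry).
    static Iterator<Map.Entry<String, Integer>> processPartition(Iterator<String> rows) {
        List<Map.Entry<String, Integer>> metrics = new ArrayList<>();
        // "setup": one connection for the whole partition.
        try (FakeConnection conn = new FakeConnection()) {
            while (rows.hasNext()) {
                String row = rows.next();
                metrics.add(new SimpleEntry<>(row, conn.lookup(row)));
            }
        } // "close": runs once, after the last record of the partition.
        return metrics.iterator();
    }

    public static void main(String[] args) {
        Iterator<Map.Entry<String, Integer>> out =
                processPartition(Arrays.asList("a", "bb", "ccc").iterator());
        while (out.hasNext()) {
            Map.Entry<String, Integer> e = out.next();
            System.out.println(e.getKey() + " -> " + e.getValue());
        }
        System.out.println("connections opened: " + FakeConnection.opened);
    }
}
```

Note that this sketch buffers the metrics for a partition in memory, because the
connection is closed before the iterator is returned. If partitions are large,
a lazy iterator that closes the connection when it is exhausted may be a better
fit. The returned pairs stay in an RDD, so they can be processed further without
going through HDFS.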

