Hi 

I'm accessing multiple regions (~5k) of an HBase table using Spark's newAPIHadoopRDD, but the driver is trying to calculate the region sizes of all the regions.
It is not even reusing the hconnection: it creates a new connection for every request (see the log below), and each connect/calculate/close cycle takes about 5 seconds, so this adds up to a lot of time.

Is there a better approach to do this?


[18 Nov 2016 22:25:22,759] [INFO Driver] RecoverableZooKeeper: Process identifier=hconnection-0x1e7824af connecting to ZooKeeper ensemble=hbase19.cloud.com:2181,hbase24.cloud.com:2181,hbase28.cloud.com:2181
[18 Nov 2016 22:25:22,759] [INFO Driver] ZooKeeper: Initiating client connection, connectString=hbase19.cloud.com:2181,hbase24.cloud.com:2181,hbase28.cloud.com:2181 sessionTimeout=60000 watcher=hconnection-0x1e7824af0x0, quorum=hbase19.cloud.com:2181,hbase24.cloud.com:2181,hbase28.cloud.com:2181, baseZNode=/hbase
[18 Nov 2016 22:25:22,761] [INFO Driver-SendThread(hbase24.cloud.com:2181)] ClientCnxn: Opening socket connection to server hbase24.cloud.com/10.193.150.217:2181. Will not attempt to authenticate using SASL (unknown error)
[18 Nov 2016 22:25:22,763] [INFO Driver-SendThread(hbase24.cloud.com:2181)] ClientCnxn: Socket connection established, initiating session, client: /10.193.138.145:47891, server: hbase24.cloud.com/10.193.150.217:2181
[18 Nov 2016 22:25:22,766] [INFO Driver-SendThread(hbase24.cloud.com:2181)] ClientCnxn: Session establishment complete on server hbase24.cloud.com/10.193.150.217:2181, sessionid = 0x2564f6f013e0e95, negotiated timeout = 60000
[18 Nov 2016 22:25:22,766] [INFO Driver] RegionSizeCalculator: Calculating region sizes for table "message".
[18 Nov 2016 22:25:27,867] [INFO Driver] ConnectionManager$HConnectionImplementation: Closing master protocol: MasterService
[18 Nov 2016 22:25:27,868] [INFO Driver] ConnectionManager$HConnectionImplementation: Closing zookeeper sessionid=0x2564f6f013e0e95
[18 Nov 2016 22:25:27,869] [INFO Driver] ZooKeeper: Session: 0x2564f6f013e0e95 closed
[18 Nov 2016 22:25:27,869] [INFO Driver-EventThread] ClientCnxn: EventThread shut down
[18 Nov 2016 22:25:27,880] [INFO Driver] RecoverableZooKeeper: Process identifier=hconnection-0x6a8a1efa connecting to ZooKeeper ensemble=hbase19.cloud.com:2181,hbase24.cloud.com:2181,hbase28.cloud.com:2181
[18 Nov 2016 22:25:27,880] [INFO Driver] ZooKeeper: Initiating client connection, connectString=hbase19.cloud.com:2181,hbase24.cloud.com:2181,hbase28.cloud.com:2181 sessionTimeout=60000 watcher=hconnection-0x6a8a1efa0x0, quorum=hbase19.cloud.com:2181,hbase24.cloud.com:2181,hbase28.cloud.com:2181, baseZNode=/hbase
[18 Nov 2016 22:25:27,883] [INFO Driver-SendThread(hbase24.cloud.com:2181)] ClientCnxn: Opening socket connection to server hbase24.cloud.com/10.193.150.217:2181. Will not attempt to authenticate using SASL (unknown error)
[18 Nov 2016 22:25:27,885] [INFO Driver-SendThread(hbase24.cloud.com:2181)] ClientCnxn: Socket connection established, initiating session, client: /10.193.138.145:47894, server: hbase24.cloud.com/10.193.150.217:2181
[18 Nov 2016 22:25:27,887] [INFO Driver-SendThread(hbase24.cloud.com:2181)] ClientCnxn: Session establishment complete on server hbase24.cloud.com/10.193.150.217:2181, sessionid = 0x2564f6f013e0e97, negotiated timeout = 60000
[18 Nov 2016 22:25:27,888] [INFO Driver] RegionSizeCalculator: Calculating region sizes for table "message".
....

-- 
Thanks & Regards,