hbase-user mailing list archives

From Sachin Jain <sachinjain...@gmail.com>
Subject How to get size of Hbase Table
Date Thu, 21 Jul 2016 06:28:14 GMT
*Context*
I am using Spark (1.5.1) with HBase (1.1.2) to dump the output of Spark
jobs into HBase, where it is then available for lookups from the HBase
table. Our BaseRelation extends HadoopFsRelation and is used to read from
and write to HBase via Spark's Data Source API.

*Use Case*
Now, whenever I perform a join, Spark builds a logical plan and decides
which type of join to execute. Per SparkStrategies [0], it checks the size
of the HBase table (the relation's sizeInBytes statistic): if it is below a
threshold (spark.sql.autoBroadcastJoinThreshold, 10 MB by default), Spark
selects a broadcast hash join; otherwise it falls back to a sort-merge join.
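The selection rule in [0] reduces to a size comparison against the configured threshold. A minimal sketch of that decision (a hypothetical helper for illustration, not Spark's actual code):

```java
public class JoinStrategy {
    // 10 MB: Spark's default for spark.sql.autoBroadcastJoinThreshold.
    static final long AUTO_BROADCAST_JOIN_THRESHOLD = 10L * 1024 * 1024;

    /**
     * Mirrors the check in SparkStrategies: broadcast only when the plan's
     * size statistic is known (non-negative) and at or below the threshold.
     */
    static boolean canBroadcast(long sizeInBytes) {
        return sizeInBytes >= 0 && sizeInBytes <= AUTO_BROADCAST_JOIN_THRESHOLD;
    }

    public static void main(String[] args) {
        System.out.println(canBroadcast(5L * 1024 * 1024));   // small lookup table
        System.out.println(canBroadcast(50L * 1024 * 1024));  // too large to broadcast
    }
}
```

This is why the table-size question matters: whatever sizeInBytes the relation reports is fed straight into this comparison, so an inaccurate estimate silently flips the join strategy.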

*Problem Statement*
I want to know if there is an API or some approach to calculate the size of
an HBase table.

[0]:
https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/execution/SparkStrategies.scala#L118
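One approach (a sketch, not a definitive answer): the HBase 1.x client exposes per-region storefile sizes through the Admin API (Admin#getClusterStatus, ServerLoad#getRegionsLoad, RegionLoad#getStorefileSizeMB), and region names are prefixed with "tableName,", so summing the sizes of matching regions estimates the table's on-disk size. The aggregation below uses stand-in numbers so it runs without a cluster; the region names and sizes are illustrative only:

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class TableSizeEstimator {
    /**
     * Sum per-region storefile sizes (MB) for the regions of one table.
     * Region names begin with "tableName,", so a prefix test selects them.
     */
    static long tableSizeMb(Map<String, Long> regionSizeMb, String tableName) {
        long total = 0;
        for (Map.Entry<String, Long> e : regionSizeMb.entrySet()) {
            if (e.getKey().startsWith(tableName + ",")) {
                total += e.getValue();
            }
        }
        return total;
    }

    public static void main(String[] args) {
        // On a live cluster these values would come from the Admin API:
        //   ClusterStatus status = admin.getClusterStatus();
        //   for each ServerName: status.getLoad(server).getRegionsLoad(),
        //   then RegionLoad.getStorefileSizeMB() per region.
        // Stand-in values below just demonstrate the aggregation.
        Map<String, Long> regions = new LinkedHashMap<>();
        regions.put("lookup_table,,1469083694000.abc.", 7L);
        regions.put("lookup_table,row500,1469083694000.def.", 4L);
        regions.put("other_table,,1469083694000.ghi.", 99L);
        System.out.println(tableSizeMb(regions, "lookup_table")); // prints 11
    }
}
```

Note that storefile sizes exclude data still buffered in the memstore (RegionLoad#getMemStoreSizeMB can be added if that matters). A quicker out-of-band check, assuming the default HBase 1.x root directory layout, is to run `hdfs dfs -du -s /hbase/data/default/<table>` against the filesystem directly.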

Thanks
-Sachin
