hbase-user mailing list archives

From Ted Yu <yuzhih...@gmail.com>
Subject Re: How to get size of Hbase Table
Date Thu, 21 Jul 2016 22:54:42 GMT
Please take a look at the following methods:

From HBaseAdmin:

  public List<HRegionInfo> getTableRegions(final TableName tableName)

From HRegion:

  public static HDFSBlocksDistribution computeHDFSBlocksDistribution(
      final Configuration conf, final HTableDescriptor tableDescriptor,
      final HRegionInfo regionInfo) throws IOException
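Putting the two methods together, a minimal sketch of the aggregation step looks like the following. Note the assumptions: on a live cluster you would iterate `admin.getTableRegions(tableName)` and call `HRegion.computeHDFSBlocksDistribution(...)` plus `HDFSBlocksDistribution.getUniqueBlocksTotalWeight()` per region; here those calls are only shown in comments, and the per-region byte counts are stubbed with hypothetical values so the summation logic itself is runnable.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class TableSizeSketch {

    // On a real cluster (sketch, not runnable without HBase jars):
    //   List<HRegionInfo> regions = admin.getTableRegions(tableName);
    //   for (HRegionInfo region : regions) {
    //       HDFSBlocksDistribution dist = HRegion.computeHDFSBlocksDistribution(
    //           conf, tableDescriptor, region);
    //       total += dist.getUniqueBlocksTotalWeight();  // bytes for this region
    //   }
    // The helper below is that loop's aggregation over stand-in values.
    static long tableSizeBytes(Map<String, Long> regionWeights) {
        long total = 0L;
        for (long bytes : regionWeights.values()) {
            total += bytes;
        }
        return total;
    }

    public static void main(String[] args) {
        // Hypothetical per-region HFile sizes in bytes.
        Map<String, Long> regions = new LinkedHashMap<>();
        regions.put("region-a", 7_340_032L);  // ~7 MB
        regions.put("region-b", 4_194_304L);  // 4 MB

        long total = tableSizeBytes(regions);
        System.out.println(total);  // 11534336

        // Compare against Spark's 10 MB broadcast threshold.
        System.out.println(total < 10L * 1024 * 1024);  // false
    }
}
```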


On Wed, Jul 20, 2016 at 11:28 PM, Sachin Jain <sachinjain024@gmail.com> wrote:

> *Context*
> I am using Spark (1.5.1) with HBase (1.1.2) to dump the output of Spark
> jobs into HBase, which is then available for lookups from the HBase
> table. BaseRelation extends HadoopFsRelation and is used to read from and
> write to HBase through the Spark Data Source API.
> *Use Case*
> Now, whenever I perform a join operation, Spark creates a logical plan and
> decides which type of join to execute. As per SparkStrategies [0], it
> checks the size of the HBase table: if it is below a threshold (10 MB by
> default) it selects a broadcast hash join, otherwise a sort-merge join.
> *Problem Statement*
> I want to know if there is an API or some approach to calculate the size of
> an HBase table.
> [0]:
> https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/execution/SparkStrategies.scala#L118
> Thanks
> -Sachin
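The strategy selection described in the quoted question can be sketched as a plain decision on the estimated size. The helper name `chooseJoin` is hypothetical; the comparison mirrors Spark's `canBroadcast` check, which broadcasts when the plan's estimated size is at or below `spark.sql.autoBroadcastJoinThreshold` (default 10 MB = 10485760 bytes).

```java
public class JoinStrategySketch {

    // Hypothetical stand-in for Spark's canBroadcast(plan) check:
    // broadcast-hash join when the estimated size is at or below the
    // configured threshold, sort-merge join otherwise.
    static String chooseJoin(long estimatedSizeBytes, long thresholdBytes) {
        return estimatedSizeBytes <= thresholdBytes
                ? "BroadcastHashJoin"
                : "SortMergeJoin";
    }

    public static void main(String[] args) {
        long threshold = 10L * 1024 * 1024;  // Spark's default: 10 MB
        System.out.println(chooseJoin(5_000_000L, threshold));   // BroadcastHashJoin
        System.out.println(chooseJoin(50_000_000L, threshold));  // SortMergeJoin
    }
}
```

This is why an accurate table-size estimate matters: overestimating pushes small tables into the slower sort-merge path, while underestimating can broadcast a table too large to fit in executor memory.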
