I will create a JIRA ticket for exposing these APIs in a standard way.


On Fri, Oct 2, 2015
Yes, you could calculate an estimate for this information, but it isn't currently exposed through JDBC or through the explain plan (which would be a good place for it to live). You'd need to dip down to the implementation to get it. Something like this:

PhoenixStatement statement = connection.createStatement().unwrap(PhoenixStatement.class);
ResultSet rs = statement.executeQuery("EXPLAIN SELECT ...");
QueryPlan plan = statement.getQueryPlan();
List<KeyRange> ranges = plan.getSplits();

Each KeyRange in ranges covers a configurable number of bytes (determined by phoenix.stats.guidepost.width and/or phoenix.stats.guidepost.per.region), so a simple worst-case estimate of the bytes scanned is ranges.size() multiplied by this config value (the default is QueryServicesOptions.DEFAULT_STATS_GUIDEPOST_WIDTH_BYTES, or 300MB). If the query is a point lookup (which you can check with plan.getContext().getScanRanges().isPointLookup()), then the cost would instead be ranges.size() * average_row_size.
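
To make the arithmetic concrete, here is a minimal sketch of that estimate as a standalone helper. It deliberately takes plain numbers rather than Phoenix objects, so the class name ScanCostEstimate, the method estimateBytes, and the averageRowSizeBytes input are all hypothetical. You would feed it ranges.size(), the guidepost width from your configuration, and the isPointLookup() result from the plan above, and supply an average row size from your own measurements.

```java
// Hypothetical helper, not part of Phoenix: it only encodes the
// back-of-the-envelope estimate described above.
final class ScanCostEstimate {
    private ScanCostEstimate() {}

    /**
     * Worst-case estimate of bytes scanned.
     *
     * @param rangeCount          number of KeyRanges, e.g. plan.getSplits().size()
     * @param guidepostWidthBytes configured guidepost width in bytes
     * @param pointLookup         result of the isPointLookup() check on the plan
     * @param averageRowSizeBytes caller-supplied average row size (an assumption;
     *                            nothing in the snippet above provides it)
     */
    static long estimateBytes(int rangeCount, long guidepostWidthBytes,
                              boolean pointLookup, long averageRowSizeBytes) {
        if (rangeCount < 0 || guidepostWidthBytes < 0 || averageRowSizeBytes < 0) {
            throw new IllegalArgumentException("inputs must be non-negative");
        }
        // Point lookups are costed per row; everything else per guidepost chunk.
        long perRange = pointLookup ? averageRowSizeBytes : guidepostWidthBytes;
        // multiplyExact throws ArithmeticException instead of silently overflowing.
        return Math.multiplyExact((long) rangeCount, perRange);
    }

    public static void main(String[] args) {
        // 300MB is the default width quoted in this thread; check your own config.
        long width = 300L * 1024 * 1024;
        System.out.println("range scan:   " + estimateBytes(12, width, false, 0L));
        System.out.println("point lookup: " + estimateBytes(12, width, true, 200L));
    }
}
```

This is an upper bound on bytes, not a row or cell count. Turning it into rows would again mean dividing by an average row size that you would have to measure yourself.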

Since these aren't exposed APIs, they're subject to change. Please file a JIRA if you're interested in helping figure out what the "official" APIs for this should be.

On Fri, Oct 2, 2015
Is there a way to figure out how many rows/cells were scanned in HBase to perform a Phoenix query? I tried using the EXPLAIN command, but it is not clear how to estimate the number of rows touched by looking at the explain plan. Essentially, I want to be able to report back to users the "cost" of performing a Phoenix query, where "cost" is some function of rows/cells scanned.