Thanks James.

I will create a jira ticket for exposing the APIs in a standard way.

Alok


On Fri, Oct 2, 2015 at 6:20 PM, James Taylor <jamestaylor@apache.org> wrote:
Hi Alok,
Yes, you could calculate an estimate for this information, but it isn't currently exposed through JDBC or through the explain plan (which would be a good place for it to live). You'd need to dip down to the implementation to get it. Something like this:

PhoenixStatement statement = connection.createStatement().unwrap(PhoenixStatement.class);
ResultSet rs = statement.executeQuery("EXPLAIN SELECT ...");
QueryPlan plan = statement.getQueryPlan();
List<KeyRange> ranges = plan.getSplits();

Each KeyRange in ranges will be going over a configurable amount of bytes (determined by phoenix.stats.guidepost.width and/or phoenix.stats.guidepost.per.region), so a simple worst case estimate would be to multiply the ranges.size() by this config value (using a default of QueryServicesOptions.DEFAULT_STATS_GUIDEPOST_WIDTH_BYTES or 300MB). If the query is a point lookup (which you can check with plan.getContext().getScanRanges().isPointLookup()), then the cost would be ranges.size() * average_row_size.

Since these aren't exposed APIs, they're subject to change. Please file a JIRA if you're interested in helping figure out what the "official" APIs for this should be.

HTH. Thanks,

    James

On Fri, Oct 2, 2015 at 5:35 PM, Alok Singh <alok@cloudability.com> wrote:
Is there a way to figure out how many rows/cells were scanned in hbase perform a phoenix query? I tried using the explain command, but, it is not clear how to estimate the number of rows touched by looking at the explain plan. Essentially, I want to be able to report back to users the "cost" of performing a phoenix query, where "cost" is some function of rows/cells scanned.