spark-user mailing list archives

From Jason Heo <jason.heo....@gmail.com>
Subject [Spark SQL] How to run a custom meta query for `ANALYZE TABLE`
Date Wed, 03 Jan 2018 04:17:15 GMT
Hi,

I'm working on integrating Spark and a custom data source.

Most things are going well, thanks to Spark's well-designed Data Source
APIs.

But one thing I haven't been able to resolve is how to execute a custom
meta query for `ANALYZE TABLE`.

The custom data source I'm currently working on supports a meta query, so
we can get MIN/MAX/cardinality without a full scan.

What I want is for `ANALYZE TABLE`, when it is executed against the custom
data source, to run the custom meta query rather than a full scan.
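
To make the idea concrete, here is a minimal Python sketch of the two
ways the same statistics could be obtained. It is only an illustration:
`CustomSource`, `meta_query`, `analyze_full_scan` and `analyze_with_meta`
are hypothetical names, not part of Spark or of the data source's real
API, and the in-memory rows merely stand in for the real storage.

```python
from dataclasses import dataclass
from typing import Dict, Iterator, List


@dataclass(frozen=True)
class ColumnStats:
    min_value: int
    max_value: int
    cardinality: int  # number of distinct values


class CustomSource:
    """Hypothetical stand-in for the custom data source (assumes non-empty rows)."""

    def __init__(self, rows: List[Dict[str, int]]) -> None:
        self._rows = rows
        self.rows_scanned = 0
        # Assumption: the real source maintains this metadata itself.
        # It is precomputed here once only to imitate that behaviour.
        self._meta: Dict[str, ColumnStats] = {}
        for column in rows[0]:
            values = [row[column] for row in rows]
            self._meta[column] = ColumnStats(min(values), max(values), len(set(values)))

    def scan(self) -> Iterator[Dict[str, int]]:
        # Full scan: every row is read and counted.
        for row in self._rows:
            self.rows_scanned += 1
            yield row

    def meta_query(self, column: str) -> ColumnStats:
        # Meta query: answered from metadata, no rows are read.
        return self._meta[column]


def analyze_full_scan(source: CustomSource, column: str) -> ColumnStats:
    # Roughly what a scan-based analyze has to do: read every row.
    values = [row[column] for row in source.scan()]
    return ColumnStats(min(values), max(values), len(set(values)))


def analyze_with_meta(source: CustomSource, column: str) -> ColumnStats:
    # What I would like to happen instead: delegate to the meta query.
    return source.meta_query(column)


source = CustomSource([
    {"id": 1, "price": 30},
    {"id": 2, "price": 10},
    {"id": 3, "price": 30},
])
meta_stats = analyze_with_meta(source, "price")
scans_after_meta = source.rows_scanned  # no rows read so far
scan_stats = analyze_full_scan(source, "price")
```

Both paths yield the same statistics; the difference is only in how many
rows the source has to read to produce them.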

If this is not possible, I'm considering inserting the stats into
metastore_db manually. Is there any API exposed for handling metastore_db
(e.g. inserting or deleting entries in the meta db)?

Regards,

Jason
