spark-user mailing list archives

From: Jörn Franke
Subject: Re: [Spark SQL] How to run a custom meta query for `ANALYZE TABLE`
Date: Wed, 03 Jan 2018 06:22:32 GMT

No, this is not possible with the current data source API. However, a new data source
API v2 is on its way; maybe it will support this.

Alternatively, your data source could offer a config option to calculate the metadata after an insert.
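
To make that idea concrete, here is a rough sketch of what such an option could look like. It is plain Python and not Spark API: the class names and the option name `compute_stats_after_insert` are made up for illustration, and an exact set stands in for whatever cardinality estimate the real data source would use.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional, Set


@dataclass
class ColumnStats:
    """Running MIN/MAX/cardinality for one column (hypothetical structure)."""
    min_value: Optional[Any] = None
    max_value: Optional[Any] = None
    distinct: Set[Any] = field(default_factory=set)

    def update(self, value: Any) -> None:
        # Nulls do not contribute to MIN/MAX/cardinality.
        if value is None:
            return
        self.min_value = value if self.min_value is None else min(self.min_value, value)
        self.max_value = value if self.max_value is None else max(self.max_value, value)
        # Exact distinct set; a real source would likely use an approximate sketch.
        self.distinct.add(value)


class StatsTrackingWriter:
    """Hypothetical writer that refreshes stats on insert when the option is on."""

    def __init__(self, options: Dict[str, str]) -> None:
        # "compute_stats_after_insert" is an invented option name.
        self.enabled = options.get("compute_stats_after_insert", "false").lower() == "true"
        self.rows: List[Dict[str, Any]] = []
        self.stats: Dict[str, ColumnStats] = {}

    def insert(self, batch: List[Dict[str, Any]]) -> None:
        self.rows.extend(batch)
        if not self.enabled:
            return
        for row in batch:
            for column, value in row.items():
                self.stats.setdefault(column, ColumnStats()).update(value)


writer = StatsTrackingWriter({"compute_stats_after_insert": "true"})
writer.insert([{"id": 3}, {"id": 1}, {"id": 3}])
```

With the option left off, `insert` only stores the rows and `stats` stays empty, so the extra cost is paid only by users who ask for it.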

Could you also explain which database your data source is for and when this meta
query should be executed?

> On 3. Jan 2018, at 05:17, Jason Heo wrote:
> Hi,
> I'm working on integrating Spark with a custom data source.
> Most things go well with the nice Spark Data Source APIs (thanks to the well-designed APIs).
> But one thing I couldn't resolve is how to execute a custom meta query for `ANALYZE TABLE`.
> The custom data source I'm currently working on has a meta query, so we can get MIN/MAX/Cardinality
> without a full scan.
> What I want is that when `ANALYZE TABLE` is executed over the custom data source,
> the custom meta query is executed rather than a full scan.
> If this is not possible, I'm considering inserting stats into metastore_db manually.
> Is there any API exposed to handle metastore_db (e.g. insert/delete meta db)?
> Regards,
> Jason
