spark-dev mailing list archives

From Ryan Blue <b...@apache.org>
Subject [DISCUSS] SPIP: FunctionCatalog
Date Mon, 08 Feb 2021 18:45:16 GMT
Hi everyone,

I'd like to start a discussion for adding a FunctionCatalog interface to
catalog plugins. This will allow catalogs to expose functions to Spark,
similar to how the TableCatalog interface allows a catalog to expose
tables. The proposal doc is available here:
https://docs.google.com/document/d/1PLBieHIlxZjmoUB0ERF-VozCRJ0xw2j3qKvUNWpWA2U/edit

Here's a high-level summary of some of the main design choices:
* Adds the ability to list and load functions, but not to create or modify
them in an external catalog
* Supports scalar, aggregate, and partial aggregate functions
* Uses load and bind steps for better error messages and simpler
implementations
* Like the DSv2 table read and write APIs, it uses InternalRow to pass data
* Can be extended using mix-in interfaces to add vectorization, codegen,
and other future features
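
To make the load and bind steps above concrete, here is a rough, self-contained sketch in Java. It is not the API from the proposal doc or the PR: every name in it (FnCatalog, UnboundFn, BoundFn, ScalarFn, AggregateFn, InMemoryFnCatalog, StrLen, LongSum) is hypothetical, and a plain Object[] stands in for InternalRow so the example runs without Spark on the classpath. It only illustrates the general shape: a catalog lists and loads functions, a loaded function is bound to concrete input types (where type errors can be reported early), and the bound function is either scalar or an aggregate whose partial states can be merged.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical stand-ins for illustration only; not the proposed Spark API.
// A function that has been bound to concrete input types.
interface BoundFn {}

// Scalar function: one result per input row (Object[] stands in for InternalRow).
interface ScalarFn extends BoundFn {
    Object produceResult(Object[] row);
}

// Aggregate function: merge() combines partial aggregation states.
interface AggregateFn<S> extends BoundFn {
    S newState();
    S update(S state, Object[] row);
    S merge(S left, S right);
    Object produceResult(S state);
}

// A loaded but not yet bound function; bind() validates the input types.
interface UnboundFn {
    BoundFn bind(Class<?>[] inputTypes);
}

// Read-only catalog: functions can be listed and loaded, not created or modified.
interface FnCatalog {
    List<String> listFunctions();
    UnboundFn loadFunction(String name);
}

final class StrLen implements UnboundFn {
    public BoundFn bind(Class<?>[] inputTypes) {
        // Rejecting bad input at bind time gives a clear error before any rows are processed.
        if (inputTypes.length != 1 || inputTypes[0] != String.class) {
            throw new IllegalArgumentException("strlen expects one String argument");
        }
        ScalarFn fn = row -> ((String) row[0]).length();
        return fn;
    }
}

final class LongSum implements UnboundFn {
    public BoundFn bind(Class<?>[] inputTypes) {
        if (inputTypes.length != 1 || inputTypes[0] != Long.class) {
            throw new IllegalArgumentException("sum expects one Long argument");
        }
        return new AggregateFn<Long>() {
            public Long newState() { return 0L; }
            public Long update(Long state, Object[] row) { return state + (Long) row[0]; }
            public Long merge(Long left, Long right) { return left + right; }
            public Object produceResult(Long state) { return state; }
        };
    }
}

final class InMemoryFnCatalog implements FnCatalog {
    // TreeMap keeps function names sorted so listings are deterministic.
    private final Map<String, UnboundFn> functions = new TreeMap<>();

    InMemoryFnCatalog() {
        functions.put("strlen", new StrLen());
        functions.put("sum", new LongSum());
    }

    public List<String> listFunctions() {
        return new ArrayList<>(functions.keySet());
    }

    public UnboundFn loadFunction(String name) {
        UnboundFn fn = functions.get(name);
        if (fn == null) {
            throw new IllegalArgumentException("No such function: " + name);
        }
        return fn;
    }
}
```

In this sketch a caller would look up "strlen" with loadFunction, call bind with the argument types, and then invoke produceResult once per row. Binding "strlen" to a Long argument fails in bind rather than at evaluation time, which is the kind of earlier, clearer error the separate bind step is meant to allow.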

There is also a PR with the proposed API:
https://github.com/apache/spark/pull/24559/files

Let's discuss the proposal here rather than on that PR, to get better
visibility. Also, please take the time to read the proposal first. That
really helps clear up misconceptions.



-- 
Ryan Blue
