spark-dev mailing list archives

From Holden Karau <>
Subject Re: [DISCUSS] SPIP: FunctionCatalog
Date Tue, 09 Feb 2021 18:21:07 GMT
I think this proposal is a good set of trade-offs, and the need for it has existed in the
community for a long time. I especially appreciate that the design
focuses on a minimal useful component: future optimizations are
considered only to make sure the API stays flexible, while the actual
concrete decisions are left for later, once we see how the API is used. I
think if we try to optimize everything right out of the gate, we'll
quickly get stuck (again) and not make any progress.

On Mon, Feb 8, 2021 at 10:46 AM Ryan Blue <> wrote:

> Hi everyone,
> I'd like to start a discussion for adding a FunctionCatalog interface to
> catalog plugins. This will allow catalogs to expose functions to Spark,
> similar to how the TableCatalog interface allows a catalog to expose
> tables. The proposal doc is available here:
> Here's a high-level summary of some of the main design choices:
> * Adds the ability to list and load functions, not to create or modify
> them in an external catalog
> * Supports scalar, aggregate, and partial aggregate functions
> * Uses load and bind steps for better error messages and simpler
> implementations
> * Like the DSv2 table read and write APIs, it uses InternalRow to pass data
> * Can be extended using mix-in interfaces to add vectorization, codegen,
> and other future features
> There is also a PR with the proposed API:
> Let's discuss the proposal here rather than on that PR, to get better
> visibility. Also, please take the time to read the proposal first. That
> really helps clear up misconceptions.
> --
> Ryan Blue
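[The load-then-bind flow summarized in the bullets above can be sketched roughly as
follows. This is a hypothetical, simplified illustration: the names (SimpleCatalog,
UnboundFunction, BoundFunction) follow the SPIP's vocabulary, but the signatures are
illustrative and are not Spark's actual proposed interfaces, which use Identifier,
StructType, and InternalRow.]

```java
// Hypothetical sketch of the two-step load/bind design: the catalog loads a
// function by name, then bind() type-checks the arguments, so a mismatch
// produces a clear error before any data is processed.
import java.util.HashMap;
import java.util.Map;

public class FunctionCatalogSketch {
    // A function as returned by the catalog's load step; not yet type-checked.
    interface UnboundFunction {
        // Validates argument types and returns an executable function, or
        // throws with a descriptive message if the types don't match.
        BoundFunction bind(Class<?>... argTypes);
    }

    // A function whose argument types have been validated and is ready to run.
    interface BoundFunction {
        Object invoke(Object... args);
    }

    // A catalog that exposes functions by name, analogous to how a
    // TableCatalog exposes tables; it only lists and loads, never creates.
    static class SimpleCatalog {
        private final Map<String, UnboundFunction> functions = new HashMap<>();

        void register(String name, UnboundFunction fn) {
            functions.put(name, fn);
        }

        UnboundFunction loadFunction(String name) {
            UnboundFunction fn = functions.get(name);
            if (fn == null) {
                throw new IllegalArgumentException("Unknown function: " + name);
            }
            return fn;
        }
    }

    public static void main(String[] args) {
        SimpleCatalog catalog = new SimpleCatalog();
        // A scalar "plus" function whose bind step rejects non-integer inputs.
        catalog.register("plus", types -> {
            for (Class<?> t : types) {
                if (t != Integer.class) {
                    throw new IllegalArgumentException(
                        "plus expects integer arguments, got " + t);
                }
            }
            return fnArgs -> (Integer) fnArgs[0] + (Integer) fnArgs[1];
        });

        BoundFunction plus =
            catalog.loadFunction("plus").bind(Integer.class, Integer.class);
        System.out.println(plus.invoke(1, 2)); // prints 3
    }
}
```

The separation also keeps implementations simple: a catalog can return one
unbound function per name and let bind() resolve overloads by argument type.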

Books (Learning Spark, High Performance Spark, etc.):  <>
YouTube Live Streams:
