spark-dev mailing list archives

From Alessandro Solimando <alessandro.solima...@gmail.com>
Subject Re: Array indexing functions
Date Tue, 20 Nov 2018 10:34:41 GMT
Hi Petar,
I have implemented similar functions a few times through ad-hoc UDFs in the
past, so +1 from me.

Can you elaborate a bit more on how you implemented those functions in
practice? Are they UDFs or "native" functions like those in the
sql.functions package?

I am asking because I wonder if/how Catalyst can take those functions into
account to produce more optimized plans. Maybe you or someone else on
the list can clarify this.

Best regards,
Alessandro

On Tue, 20 Nov 2018 at 11:11, Petar Zečević <petar.zecevic@gmail.com> wrote:

>
> Hi,
> I implemented two array functions that are useful to us and I wonder if
> you think it would be useful to add them to the distribution. The functions
> are used for filtering arrays based on indexes:
>
> array_allpositions (named after array_position) - takes a column and a
> value and returns an array of the column's indexes corresponding to
> elements equal to the provided value
>
> array_select - takes an array column and an array of indexes and returns a
> subset of the array based on the provided indexes.
>
> If you agree with this addition I can create a JIRA ticket and a pull
> request.
>
> --
> Petar Zečević
>
> ---------------------------------------------------------------------
> To unsubscribe e-mail: dev-unsubscribe@spark.apache.org
>
>
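
To make the two functions described in the quoted message concrete, here is a minimal plain-Python sketch of their semantics on ordinary lists. It is not the proposed implementation. It assumes 1-based indexes (following the convention of Spark's array_position) and assumes that out-of-range indexes are silently skipped; the quoted message specifies neither.

```python
from typing import Any, List, Sequence


def array_allpositions(arr: Sequence[Any], value: Any) -> List[int]:
    """Return the positions of every element of `arr` equal to `value`.

    Assumption: positions are 1-based, mirroring Spark's array_position.
    """
    return [i for i, x in enumerate(arr, start=1) if x == value]


def array_select(arr: Sequence[Any], indexes: Sequence[int]) -> List[Any]:
    """Return the elements of `arr` found at the given positions.

    Assumptions: positions are 1-based, and positions outside
    1..len(arr) are skipped rather than raising an error.
    """
    return [arr[i - 1] for i in indexes if 1 <= i <= len(arr)]


# Usage: find every position of a value, then select by position.
positions = array_allpositions([1, 2, 1, 3, 1], 1)
selected = array_select(["a", "b", "c", "d"], [1, 3])
```

Logic like this wrapped in a UDF is opaque to Catalyst, which is why the UDF versus native question above matters for plan optimization.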
