spark-dev mailing list archives

From Petar Zečević <petar.zece...@gmail.com>
Subject Re: Array indexing functions
Date Thu, 07 Feb 2019 08:19:48 GMT

Hi,
as far as I know these are not standard functions.

Writing UDFs is easy, but only in Java and Scala are they as efficient as built-in functions.
With Python, data still has to be moved and converted to and from Arrow, and that makes
a difference in performance. That was the motivation behind these two functions.

I'd object to the rule against implementing functions that aren't found anywhere else, but
there seems to be a consensus around it, so I'll just close the JIRA.

Thanks,
Petar


Sean Owen <srowen@gmail.com> writes:

> Is it standard SQL or implemented in Hive? Because UDFs are so relatively easy in Spark
> we don't need tons of builtins like an RDBMS does.
>
> On Tue, Feb 5, 2019, 7:43 AM Petar Zečević <petar.zecevic@gmail.com> wrote:
>
>  Hi everybody,
>  I finally created the JIRA ticket and the pull request for the two array indexing functions:
>  https://issues.apache.org/jira/browse/SPARK-26826
>
>  Can any of the committers please check it out?
>
>  Thanks,
>  Petar
>
>  Petar Zečević <petar.zecevic@gmail.com> writes:
>
>  > Hi,
>  > I implemented two array functions that are useful to us and I wonder if you think
>  > it would be useful to add them to the distribution. The functions are used for
>  > filtering arrays based on indexes:
>  >
>  > array_allpositions (named after array_position) - takes a column and a value and
>  > returns an array of the column's indexes corresponding to elements equal to the
>  > provided value
>  >
>  > array_select - takes an array column and an array of indexes and returns a subset
>  > of the array based on the provided indexes.
>  >
>  > If you agree with this addition I can create a JIRA ticket and a pull request.
>
>  ---------------------------------------------------------------------
>  To unsubscribe e-mail: dev-unsubscribe@spark.apache.org
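
For readers of the archive, the behaviour of the two functions described in the quoted
proposal can be sketched in plain Python. This is only an illustration of the semantics as
worded in the thread, not the code from the pull request. Two things are assumptions: that
positions are 1-based (mirroring array_position in Spark SQL, which is 1-based), and that an
out-of-range position raises an error. The actual patch may behave differently on both points.

```python
from typing import Any, List, Sequence


def array_allpositions(values: Sequence[Any], target: Any) -> List[int]:
    """Return the positions of every element equal to `target`.

    Assumption: positions are 1-based, as with Spark SQL's array_position.
    """
    return [pos for pos, item in enumerate(values, start=1) if item == target]


def array_select(values: Sequence[Any], positions: Sequence[int]) -> List[Any]:
    """Return the elements of `values` found at the given positions.

    Assumption: positions are 1-based and an out-of-range position is an error.
    """
    selected = []
    for pos in positions:
        if not 1 <= pos <= len(values):
            raise IndexError(f"position {pos} is outside 1..{len(values)}")
        selected.append(values[pos - 1])
    return selected


if __name__ == "__main__":
    flags = ["a", "b", "a", "c"]
    scores = [10, 20, 30, 40]
    # Find where flags equals "a", then pick the matching entries of scores.
    hits = array_allpositions(flags, "a")
    print(hits, array_select(scores, hits))
```

The two compose naturally: the positions found in one array can be used to filter a second,
parallel array, which appears to be the use case behind the proposal.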



