spark-dev mailing list archives

From DB Tsai <>
Subject Re: Will higher order functions in spark SQL be pushed upstream?
Date Tue, 10 Oct 2017 07:33:12 GMT

On Netflix's algorithm team, we work a lot on ranking problems, where
we naturally deal with datasets containing nested lists of structs. We
built Scala APIs such as map, filter, drop, and withColumn that work on
the nested list of structs efficiently using SQL expression with

Here is what we propose for how the APIs will look, and we would like
to socialize it with the community to get more feedback!

It would be great to share some building blocks with Databricks's
higher-order function feature.
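
As a rough illustration of the operations named above (filter, drop,
and withColumn over a nested list of structs), here is a minimal sketch
using plain Scala collections rather than Spark. It is not the API
proposed in this thread: Struct, Record, and the helper names are all
hypothetical, and a struct is modeled as a simple Map for brevity.

```scala
// Illustrative only: plain Scala collections, no Spark dependency.
// All names here are hypothetical and are NOT the API discussed in this thread.
object NestedSketch {
  // Stand-in for a struct: field name -> numeric value.
  type Struct = Map[String, Double]

  // One top-level row holding a nested list of structs.
  final case class Record(user: String, items: List[Struct])

  // "filter": keep only the nested structs that satisfy a predicate.
  def filterItems(r: Record)(p: Struct => Boolean): Record =
    r.copy(items = r.items.filter(p))

  // "withColumn": add (or overwrite) a field on every nested struct.
  def withField(r: Record, name: String)(f: Struct => Double): Record =
    r.copy(items = r.items.map(s => s + (name -> f(s))))

  // "drop": remove a field from every nested struct.
  def dropField(r: Record, name: String): Record =
    r.copy(items = r.items.map(s => s - name))

  val record: Record = Record("u1", List(
    Map("id" -> 1.0, "score" -> 0.9),
    Map("id" -> 2.0, "score" -> 0.2)))

  // Keep well-scored items, derive a new field, then drop the original one.
  val kept: Record = filterItems(record)(s => s("score") >= 0.5)
  val boosted: Record = withField(kept, "boosted")(s => s("score") * 2)
  val slim: Record = dropField(boosted, "score")
}
```

Each helper returns a new Record and leaves the outer row untouched, so
the nested list is transformed in place without first being flattened
into separate rows.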


On Fri, Jun 9, 2017 at 5:04 PM, Antoine HOM <> wrote:
> Good news :) Thx Sameer.
> On Friday, June 9, 2017, Sameer Agarwal <> wrote:
>>> * As a heavy user of complex data types I was wondering if there was
>>> any plan to push those changes upstream?
>> Yes, we intend to contribute this to open source.
>>> * In addition, I was wondering if as part of this change it also tries
>>> to solve the column pruning / filter pushdown issues with complex
>>> datatypes?
>> For parquet, this effort is primarily tracked via SPARK-4502 (see
>> and is currently targeted for
>> 2.3.


DB Tsai
PGP Key ID: 0x5CED8B896A6BDFA0
