datafu-dev mailing list archives

From Russell Jurney
Subject DataFu and Spark
Date Wed, 15 Feb 2017 01:08:14 GMT
I think DataFu needs to support Spark, to include additions/UDFs for Spark,
to continue to thrive as a project. Pig has been abandoned by Hortonworks
and some others, and I'm not sure it will continue to thrive in the future.

Personally, I work in PySpark and will try to come up with a list of five
additions I wish Spark had that aren't likely to be accepted as direct
additions to the API. I can think of utils for joining nested/complex RDDs
that I would like to see, in particular. I'll think of some others.

Can any Scala Spark users do the same for Scala Spark? Make up a list of
five additions you would like to see DataFu make to Scala.

If anyone has thoughts on DataFu and Spark, please let's hear them. How
would Spark in DataFu work?

Russell Jurney @rjurney
