spark-user mailing list archives

From "Fiske, Danny" <Danny.Fi...@ext.ons.gov.uk>
Subject [PySpark] [SparkR] Is it possible to invoke a PySpark function with a SparkR DataFrame?
Date Mon, 15 Jul 2019 13:58:32 GMT
Hi all,

Forgive this naïveté, I'm looking for reassurance from some experts!

In the past we built a tailored Spark library for our organisation, implementing the core functions
in Scala with Python and R "wrappers" on top. The focus on Scala has alienated our analysts/statisticians/data
scientists, though, and collaboration is important to us (yeah... we're aware that the APIs are very
similar across languages... :/ ). We'd like to see whether we can drop the Scala layer and present
the source code in a language more familiar to our users and internal contributors.

We'd ideally write our functions with PySpark and potentially create a SparkR "wrapper" over
the top, leading to the question:

Given a function written with PySpark that accepts a DataFrame parameter, is there a way to
invoke this function using a SparkR DataFrame?

Is there any reason to pursue this? Is it even possible?
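To make the question concrete, here's a rough sketch of the kind of thing we're imagining. The function
and view names are just placeholders, and it assumes the SparkR and PySpark front ends are attached to
the same underlying Spark session (so they share one session catalogue), which we realise may not hold
in general:

    # Hypothetical PySpark function from our library (the name is a placeholder)
    from pyspark.sql import DataFrame, SparkSession
    from pyspark.sql import functions as F

    def add_row_count(df: DataFrame) -> DataFrame:
        """Append a constant column holding the total row count."""
        return df.withColumn("n_rows", F.lit(df.count()))

    # On the R side we imagine something along these lines:
    #   SparkR::createOrReplaceTempView(sdf, "shared_input")
    # Assuming both front ends share the same session, the Python side could
    # then pick the data up by name from the session catalogue:
    spark = SparkSession.builder.getOrCreate()
    result = add_row_count(spark.table("shared_input"))

If sharing a session like that isn't realistic, we'd be glad to hear about any other route (writing to
a shared table, for instance), or to be told the whole idea is misguided.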

Many thanks,

Danny

