spark-user mailing list archives
From: Felix Cheung
Subject: Re: [PySpark] [SparkR] Is it possible to invoke a PySpark function with a SparkR DataFrame?
Date: Tue, 16 Jul 2019 16:11:41 GMT

Not currently in Spark.

However, there are systems built on top of Spark that can share a DataFrame between
languages. That is not calling the Python UDF directly, but you can pass the DataFrame
to Python and then apply the function there, e.g. with .map(UDF).
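
To illustrate the kind of hand-off described above, here is a minimal sketch. It assumes an environment where the R and Python interpreters are attached to the same Spark session (some notebook setups arrange this; plain SparkR and PySpark started separately do not), so that a temp view registered from R is visible from Python. The function name run_on_shared_view, the view names and the example function keep_adults are all hypothetical and only there for illustration.

```python
def run_on_shared_view(spark, in_view, func, out_view):
    """Apply a PySpark function to a DataFrame published as a temp view.

    spark    -- an active pyspark.sql.SparkSession, ASSUMED to be the same
                session the R side is using (temp views are session-scoped)
    in_view  -- name of the temp view registered from the R side
    func     -- any function that takes and returns a PySpark DataFrame
    out_view -- name under which the result is published for the R side
    """
    df = spark.table(in_view)                 # look the shared view up by name
    result = func(df)                         # ordinary PySpark code runs here
    result.createOrReplaceTempView(out_view)  # publish the result by name
    return result


def keep_adults(df):
    """Hypothetical example of a PySpark function that accepts a DataFrame."""
    return df.filter("age >= 18")


# R side (SparkR), shown as comments because it runs in the other interpreter:
#   createOrReplaceTempView(r_df, "shared_input")
#   ... run_on_shared_view(spark, "shared_input", keep_adults, "shared_output") ...
#   result <- tableToDF("shared_output")
```

Only the view names cross the language boundary here; the data stays inside Spark. Where the two languages cannot share a session, writing the DataFrame to storage (for example Parquet) from R and reading it back in Python is a slower alternative.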

From: Fiske, Danny
Sent: Monday, July 15, 2019 6:58:32 AM
Subject: [PySpark] [SparkR] Is it possible to invoke a PySpark function with a SparkR DataFrame?

Hi all,

Forgive this naïveté, I’m looking for reassurance from some experts!

In the past we created a tailored Spark library for our organisation, implementing Spark functions
in Scala with Python and R “wrappers” on top, but the focus on Scala has alienated our
analysts/statisticians/data scientists and collaboration is important for us (yeah… we’re
aware that your SDKs are very similar across languages… :/ ). We’d like to see if we could
forgo the Scala facet in order to present the source code in a language more familiar to
users and internal contributors.

We’d ideally write our functions with PySpark and potentially create a SparkR “wrapper”
over the top, leading to the question:

Given a function written with PySpark that accepts a DataFrame parameter, is there a way to
invoke this function using a SparkR DataFrame?

Is there any reason to pursue this? Is it even possible?

Many thanks,


Legal Disclaimer:  Any views expressed by the sender of this message are not necessarily those
of the Office for National Statistics
