spark-issues mailing list archives

From "Sean Owen (JIRA)" <j...@apache.org>
Subject [jira] [Resolved] (SPARK-19627) pyspark call jvm function defined by ourselves
Date Thu, 16 Feb 2017 11:55:42 GMT

     [ https://issues.apache.org/jira/browse/SPARK-19627?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sean Owen resolved SPARK-19627.
-------------------------------
          Resolution: Invalid
       Fix Version/s:     (was: 1.6.1)
    Target Version/s:   (was: 1.6.1)

Please read http://spark.apache.org/contributing.html first

> pyspark call jvm function defined by ourselves
> ----------------------------------------------
>
>                 Key: SPARK-19627
>                 URL: https://issues.apache.org/jira/browse/SPARK-19627
>             Project: Spark
>          Issue Type: Bug
>          Components: Deploy
>    Affects Versions: 1.6.1
>            Reporter: kehao
>
> Hi, I have a question: PySpark fails when it calls a JVM function that I defined myself. Please see the code below:
> from pyspark import SparkConf,SparkContext
> from py4j.java_gateway import java_import
> if __name__ == "__main__":
> #    conf = SparkConf().setAppName("testing")
> #    sc = SparkContext(conf=conf)
>     sc = SparkContext(appName="Py4jTesting")
>     def foo(x):
>         java_import(sc._jvm, "Calculate")
>         func = sc._jvm.Calculate()
>         func.sqAdd(x)
>     rdd = sc.parallelize([1, 2, 3])
>     result = rdd.map(foo).collect()
>     print("$$$$$$$$$$$$$$$$$$$$$$")
>     print(result)
> The result is shown below. Can anyone help me?
> Traceback (most recent call last):
>   File "/home/manager/data/software/mytest/kehao/driver.py", line 19, in <module>
>     result = rdd.map(foo).collect()
>   File "/home/manager/data/software/spark-1.6.1-bin-hadoop2.6/python/lib/pyspark.zip/pyspark/rdd.py", line 771, in collect
>   File "/home/manager/data/software/spark-1.6.1-bin-hadoop2.6/python/lib/pyspark.zip/pyspark/rdd.py", line 2379, in _jrdd
>   File "/home/manager/data/software/spark-1.6.1-bin-hadoop2.6/python/lib/pyspark.zip/pyspark/rdd.py", line 2299, in _prepare_for_python_RDD
>   File "/home/manager/data/software/spark-1.6.1-bin-hadoop2.6/python/lib/pyspark.zip/pyspark/serializers.py", line 428, in dumps
>   File "/home/manager/data/software/spark-1.6.1-bin-hadoop2.6/python/lib/pyspark.zip/pyspark/cloudpickle.py", line 646, in dumps
>   File "/home/manager/data/software/spark-1.6.1-bin-hadoop2.6/python/lib/pyspark.zip/pyspark/cloudpickle.py", line 107, in dump
>   File "/usr/lib/python3.4/pickle.py", line 412, in dump
>     self.save(obj)
>   File "/usr/lib/python3.4/pickle.py", line 479, in save
>     f(self, obj) # Call unbound method with explicit self
>   File "/usr/lib/python3.4/pickle.py", line 744, in save_tuple
>     save(element)
>   File "/usr/lib/python3.4/pickle.py", line 479, in save
>     f(self, obj) # Call unbound method with explicit self
>   File "/home/manager/data/software/spark-1.6.1-bin-hadoop2.6/python/lib/pyspark.zip/pyspark/cloudpickle.py", line 199, in save_function
>   File "/home/manager/data/software/spark-1.6.1-bin-hadoop2.6/python/lib/pyspark.zip/pyspark/cloudpickle.py", line 236, in save_function_tuple
>   File "/usr/lib/python3.4/pickle.py", line 479, in save
>     f(self, obj) # Call unbound method with explicit self
>   File "/usr/lib/python3.4/pickle.py", line 729, in save_tuple
>     save(element)
>   File "/usr/lib/python3.4/pickle.py", line 479, in save
>     f(self, obj) # Call unbound method with explicit self
>   File "/usr/lib/python3.4/pickle.py", line 774, in save_list
>     self._batch_appends(obj)
>   File "/usr/lib/python3.4/pickle.py", line 801, in _batch_appends
>     save(tmp[0])
>   File "/usr/lib/python3.4/pickle.py", line 479, in save
>     f(self, obj) # Call unbound method with explicit self
>   File "/home/manager/data/software/spark-1.6.1-bin-hadoop2.6/python/lib/pyspark.zip/pyspark/cloudpickle.py", line 193, in save_function
>   File "/home/manager/data/software/spark-1.6.1-bin-hadoop2.6/python/lib/pyspark.zip/pyspark/cloudpickle.py", line 241, in save_function_tuple
>   File "/usr/lib/python3.4/pickle.py", line 479, in save
>     f(self, obj) # Call unbound method with explicit self
>   File "/usr/lib/python3.4/pickle.py", line 814, in save_dict
>     self._batch_setitems(obj.items())
>   File "/usr/lib/python3.4/pickle.py", line 840, in _batch_setitems
>     save(v)
>   File "/usr/lib/python3.4/pickle.py", line 499, in save
>     rv = reduce(self.proto)
>   File "/home/manager/data/software/spark-1.6.1-bin-hadoop2.6/python/lib/pyspark.zip/pyspark/context.py", line 268, in __getnewargs__
> Exception: It appears that you are attempting to reference SparkContext from a broadcast variable, action, or transformation. SparkContext can only be used on the driver, not in code that it run on workers. For more information, see SPARK-5063



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org

