spark-user mailing list archives

From mtheofilos <>
Subject HBase Thrift API Error on map/reduce functions
Date Fri, 30 Jan 2015 11:40:13 GMT
I get a serialization error trying to run:

sc.parallelize(['1', '2']).map(lambda id: client.getRow('table', id, None))

cloudpickle fails with "can't pickle method_descriptor type". I added a
function to pickle a method descriptor, but now it exceeds the recursion
limit instead. Printing the method name before pickling shows it is "reset"
from cStringIO.StringO (output). The failure happens around line ~830 of
cloudpickle, while trying to pickle a file object. The initial object being
pickled was:

(<function func at somewhere>, None, PairDeserializer(UTF8Deserializer(),
UTF8Deserializer()), BatchedSerializer(PickleSerializer(), 0))

And the error is this:
  File "/home/user/", line 80, in <module>
  File "/home/user/spark2/python/pyspark/", line 1081, in take
    totalParts = self._jrdd.partitions().size()
  File "/home/user/spark2/python/pyspark/", line 2107, in _jrdd
    pickled_command = ser.dumps(command)
  File "/home/user/spark2/python/pyspark/", line 402, in dumps
    return cloudpickle.dumps(obj, 2)
  File "/home/user/spark2/python/pyspark/", line 832, in dumps
  File "/home/user/spark2/python/pyspark/", line 147, in dump
    raise pickle.PicklingError(msg)
pickle.PicklingError: Could not pickle object as excessively deep recursion
                Try _fast_serialization=2 or contact PiCloud support

Can any developer who works on this code tell me whether this problem can be
fixed?

