spark-user mailing list archives

From Simone Robutti <simone.robu...@gmail.com>
Subject Collecting matrix's entries raises an error only when run inside a test
Date Wed, 05 Jul 2017 14:52:13 GMT
Hello, I have a problem that Google is not helping with. It looks like
an unreported bug, and I have found no hints of possible workarounds.

The error is the following:

Traceback (most recent call last):
  File "/home/simone/motionlogic/trip-labeler/test/trip_labeler_test/model_test.py", line 43, in test_make_trip_matrix
    entries = trip_matrix.entries.map(lambda entry: (entry.i, entry.j, entry.value)).collect()
  File "/opt/spark-1.6.2-bin-hadoop2.6/python/lib/pyspark.zip/pyspark/rdd.py", line 770, in collect
    with SCCallSiteSync(self.context) as css:
  File "/opt/spark-1.6.2-bin-hadoop2.6/python/lib/pyspark.zip/pyspark/traceback_utils.py", line 72, in __enter__
    self._context._jsc.setCallSite(self._call_site)
AttributeError: 'NoneType' object has no attribute 'setCallSite'

It is raised when I call .collect() on the entries of a
pyspark.mllib.linalg.distributed.CoordinateMatrix, and it happens only
when this runs in a test suite with more than one class, so it's probably
related to the creation and destruction of SparkContexts, but I cannot
understand how.
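
To make the suspicion concrete, here is a minimal sketch with no Spark
dependency. It assumes (unverified against the PySpark source) that some
helper keeps a class-level reference to the first context it sees, so a
later test class ends up using a context that was already stopped.
FakeJavaContext, FakeContext, CachedHelper, collect_with and
run_two_test_classes are all hypothetical names, not PySpark APIs.

```python
class FakeJavaContext(object):
    """Hypothetical stand-in for the JVM-side context handle."""

    def setCallSite(self, site):
        self.site = site


class FakeContext(object):
    """Hypothetical stand-in for SparkContext; not a PySpark class."""

    def __init__(self):
        self._jsc = FakeJavaContext()

    def stop(self):
        # Assumption: stopping a context clears its JVM handle.
        self._jsc = None


class CachedHelper(object):
    """Hypothetical helper that caches the first context at class level."""

    _cached = None

    @classmethod
    def get_or_create(cls, ctx):
        if cls._cached is None:
            cls._cached = ctx
        # Returns the cached context even if the caller passed a newer one.
        return cls._cached


def collect_with(ctx):
    """Mimics a collect() that goes through the cached context."""
    helper_ctx = CachedHelper.get_or_create(ctx)
    helper_ctx._jsc.setCallSite("collect")
    return "ok"


def run_two_test_classes():
    """Simulates two test classes, each with its own context."""
    CachedHelper._cached = None  # start from a clean state

    first = FakeContext()
    first_result = collect_with(first)  # works: the cache holds a live context
    first.stop()                        # teardown of the first test class

    second = FakeContext()              # setup of the second test class
    try:
        collect_with(second)            # the cache still points at `first`
    except AttributeError as exc:
        return first_result, str(exc)
    return first_result, None


if __name__ == "__main__":
    print(run_two_test_classes())
```

If this model is right, it would also explain why a single TestCase run
on its own passes: with only one context there is never a stale cached
one to fall back on.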

The Spark version is 1.6.2.

I saw multiple references to this error for other classes in the pyspark
ml library, but none of them contained hints toward a solution.

The failure occurs when I run the tests through nosetests. Running a
single TestCase in IntelliJ works fine.

Is there a known solution? Is it a known problem?

Thank you,

Simone
