From "" <>
Subject ImportError: No module named iter ... (on CDH5 v1.2.0+cdh5.3.2+369-1.cdh5.3.2.p0.17.el6.noarch) ...
Date Wed, 04 Mar 2015 00:21:34 GMT
Hi Friends:

We noticed that the following error occurs in 'pyspark' when running in 
distributed Standalone Mode (MASTER=spark://vps00:7077),
but not in Local Mode (MASTER=local[n]).

See the following, particularly what is highlighted in *Red* (again the 
problem only happens in Standalone Mode).
Any ideas? Thank you in advance! =:)

 >>> rdd = sc.textFile('file:///etc/hosts')
 >>> rdd.first()

Traceback (most recent call last):
   File "<input>", line 1, in <module>
   File "/usr/lib/spark/python/pyspark/", line 1129, in first
     rs = self.take(1)
   File "/usr/lib/spark/python/pyspark/", line 1111, in take
     res = self.context.runJob(self, takeUpToNumLeft, p, True)
   File "/usr/lib/spark/python/pyspark/", line 818, in runJob
     it = self._jvm.PythonRDD.runJob(, mappedRDD._jrdd, javaPartitions, allowLocal)
line 538, in __call__
"/usr/lib/spark/python/lib/", line 300, in get_return_value
     format(target_id, '.', name), value)
Py4JJavaError: An error occurred while calling 
: org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 1.0 failed 4 times, most recent failure: Lost task 0.3 in stage 1.0 (TID 7, vps03): org.apache.spark.api.python.PythonException: Traceback (most recent call last):
   File "/usr/lib/spark/python/pyspark/", line 107, in main
   File "/usr/lib/spark/python/pyspark/", line 98, in process
     serializer.dump_stream(func(split_index, iterator), outfile)
   File "/usr/lib/spark/python/pyspark/", line 227, in 
     vs = list(itertools.islice(iterator, batch))
   File *"/usr/lib/spark/python/pyspark/", line 1106*, in takeUpToNumLeft   <--- *See around line _1106_ of this file in the CDH5 Spark Distribution*.
     while taken < left:
*ImportError: No module named iter*

 >>> # But *iter()* exists as a built-in (not as a module) ...
 >>> iter(range(10))
<listiterator object at 0x423ff10>
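
For anyone trying to reproduce the message outside Spark, here is a minimal sketch of how text like "No module named iter" can come about even though iter() is a built-in. It is an illustration under assumptions, not a diagnosis of the CDH5 failure: the helper names and the hand-written pickle bytes are hypothetical, and nothing here is taken from the PySpark source. It only shows that the message appears whenever some code (for example an unpickler resolving a global) tries to import "iter" as a module.

```python
import importlib
import pickle


def import_error_message(module_name):
    """Return the ImportError text raised when importing module_name, or None."""
    try:
        importlib.import_module(module_name)
    except ImportError as exc:
        return str(exc)
    return None


def unpickle_error_message(payload):
    """Return the ImportError text raised while unpickling payload, or None."""
    try:
        pickle.loads(payload)
    except ImportError as exc:
        return str(exc)
    return None


# iter is a built-in function, so it is callable without any import.
builtin_ok = callable(iter)

# Importing it as a module fails, with a message like the one in the traceback.
direct_message = import_error_message("iter")

# Hypothetical hand-written protocol-0 pickle: its GLOBAL opcode names a
# module "iter" and an attribute "placeholder", so loading it attempts
# the same import and fails the same way.
pickled_message = unpickle_error_message(b"citer\nplaceholder\n.")

print(builtin_ok, direct_message, pickled_message)
```

If the worker-side failure has a similar origin, comparing the Python interpreter used by the driver with the one used on the workers (vps03 in the traceback) may be a reasonable first check, though that is an assumption rather than something the traceback confirms.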

cluster$ rpm -qa | grep -i spark
[ ... ]

Thank you!
Team Prismalytics

