spark-user mailing list archives

From Mohamed Lrhazi <Mohamed.Lrh...@georgetown.edu>
Subject PySpark and UnsupportedOperationException
Date Tue, 09 Dec 2014 19:32:22 GMT
While trying simple examples of PySpark code, I systematically get the
failures below when I run the following. I don't see any prior exceptions in
the output. How can I debug further to find the root cause?


es_rdd = sc.newAPIHadoopRDD(
    inputFormatClass="org.elasticsearch.hadoop.mr.EsInputFormat",
    keyClass="org.apache.hadoop.io.NullWritable",
    valueClass="org.elasticsearch.hadoop.mr.LinkedMapWritable",
    conf={
        "es.resource": "en_2014/doc",
        "es.nodes": "rap-es2",
        "es.query": """{"query":{"match_all":{}},"fields":["title"],"size": 100}"""
    }
)
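(A low-cost sanity check, unrelated to the exception itself: since the `es.query` value is a raw JSON string handed straight to the connector, malformed JSON can be caught early with `json.loads`. The string below is just a copy of the one in the conf dict above.)

```python
import json

# Copy of the query string passed via "es.query" above.
es_query = """{"query":{"match_all":{}},"fields":["title"],"size": 100}"""

# If this raises ValueError, the connector would receive malformed JSON.
parsed = json.loads(es_query)
print(parsed["fields"])
```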


from operator import add

titles = es_rdd.map(lambda d: d[1]['title'][0])
counts = titles.flatMap(lambda x: x.split(' ')).map(lambda x: (x, 1)).reduceByKey(add)


output = counts.collect()
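(For what it's worth, the word-count chain itself looks sound; here is a plain-Python equivalent of the same flatMap/map/reduceByKey pipeline, run over a hypothetical sample of titles with no Spark involved, which suggests the failure is in reading from Elasticsearch rather than in these transformations.)

```python
from operator import add
from functools import reduce
from itertools import groupby

# Hypothetical sample of what titles = es_rdd.map(...) would yield.
titles = ["big data", "big spark", "data"]

# flatMap(lambda x: x.split(' '))
words = [w for t in titles for w in t.split(' ')]
# map(lambda x: (x, 1))
pairs = [(w, 1) for w in words]
# reduceByKey(add): group pairs by key, reduce each group's values with add
counts = {k: reduce(add, (v for _, v in grp))
          for k, grp in groupby(sorted(pairs), key=lambda kv: kv[0])}
print(counts)  # -> {'big': 2, 'data': 2, 'spark': 1}
```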



...
14/12/09 19:27:20 INFO BlockManager: Removing broadcast 93
14/12/09 19:27:20 INFO BlockManager: Removing block broadcast_93
14/12/09 19:27:20 INFO MemoryStore: Block broadcast_93 of size 2448 dropped from memory (free 274984768)
14/12/09 19:27:20 INFO ContextCleaner: Cleaned broadcast 93
14/12/09 19:27:20 INFO BlockManager: Removing broadcast 92
14/12/09 19:27:20 INFO BlockManager: Removing block broadcast_92
14/12/09 19:27:20 INFO MemoryStore: Block broadcast_92 of size 163391 dropped from memory (free 275148159)
14/12/09 19:27:20 INFO ContextCleaner: Cleaned broadcast 92
14/12/09 19:27:20 INFO BlockManager: Removing broadcast 91
14/12/09 19:27:20 INFO BlockManager: Removing block broadcast_91
14/12/09 19:27:20 INFO MemoryStore: Block broadcast_91 of size 163391 dropped from memory (free 275311550)
14/12/09 19:27:20 INFO ContextCleaner: Cleaned broadcast 91
14/12/09 19:27:30 ERROR Executor: Exception in task 0.0 in stage 67.0 (TID 72)
java.lang.UnsupportedOperationException
        at java.util.AbstractMap.put(AbstractMap.java:203)
        at java.util.AbstractMap.putAll(AbstractMap.java:273)
        at org.elasticsearch.hadoop.mr.EsInputFormat$WritableShardRecordReader.setCurrentValue(EsInputFormat.java:373)
        at org.elasticsearch.hadoop.mr.EsInputFormat$WritableShardRecordReader.setCurrentValue(EsInputFormat.java:322)
        at org.elasticsearch.hadoop.mr.EsInputFormat$ShardRecordReader.next(EsInputFormat.java:299)
        at org.elasticsearch.hadoop.mr.EsInputFormat$ShardRecordReader.nextKeyValue(EsInputFormat.java:227)
        at org.apache.spark.rdd.NewHadoopRDD$$anon$1.hasNext(NewHadoopRDD.scala:138)
        at org.apache.spark.InterruptibleIterator.hasNext(InterruptibleIterator.scala:39)
        at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:327)
        at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:327)
        at scala.collection.Iterator$GroupedIterator.takeDestructively(Iterator.scala:913)
        at scala.collection.Iterator$GroupedIterator.go(Iterator.scala:929)
        at scala.collection.Iterator$GroupedIterator.fill(Iterator.scala:969)
        at scala.collection.Iterator$GroupedIterator.hasNext(Iterator.scala:972)
        at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:327)
        at scala.collection.Iterator$$anon$12.hasNext(Iterator.scala:350)
        at scala.collection.Iterator$class.foreach(Iterator.scala:727)
        at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
        at org.apache.spark.api.python.PythonRDD$.writeIteratorToStream(PythonRDD.scala:339)
        at org.apache.spark.api.python.PythonRDD$WriterThread$$anonfun$run$1.apply$mcV$sp(PythonRDD.scala:209)
        at org.apache.spark.api.python.PythonRDD$WriterThread$$anonfun$run$1.apply(PythonRDD.scala:184)
        at org.apache.spark.api.python.PythonRDD$WriterThread$$anonfun$run$1.apply(PythonRDD.scala:184)
        at org.apache.spark.util.Utils$.logUncaughtExceptions(Utils.scala:1364)
        at org.apache.spark.api.python.PythonRDD$WriterThread.run(PythonRDD.scala:183)
14/12/09 19:27:30 INFO TaskSetManager: Starting task 2.0 in stage 67.0 (TID 74, localhost, ANY, 26266 bytes)
14/12/09 19:27:30 INFO Executor: Running task 2.0 in stage 67.0 (TID 74)
14/12/09 19:27:30 WARN TaskSetManager: Lost task 0.0 in stage 67.0 (TID 72, localhost): java.lang.UnsupportedOperationException: