spark-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From james <yiaz...@gmail.com>
Subject Re: Came across Spark SQL hang/Error issue with Spark 1.5 Tungsten feature
Date Mon, 03 Aug 2015 11:35:04 GMT
Based on the latest spark code(commit
608353c8e8e50461fafff91a2c885dca8af3aaa8) and used the same Spark SQL query
to test two group of combined configuration and seemed that currently it
don't work fine in "tungsten-sort" shuffle manager from below results:

*Test 1# (PASSED)*
spark.shuffle.manager=sort
spark.sql.codegen=true
spark.sql.unsafe.enabled=true 

*Test 2#(FAILED)*
spark.shuffle.manager=tungsten-sort
spark.sql.codegen=true
spark.sql.unsafe.enabled=true 

15/08/03 16:46:02 INFO spark.MapOutputTrackerMasterEndpoint: Asked to send
map output locations for shuffle 3 to bignode4:50313
15/08/03 16:46:02 INFO spark.MapOutputTrackerMaster: Size of output statuses
for shuffle 3 is 586 bytes
15/08/03 16:46:02 INFO spark.MapOutputTrackerMasterEndpoint: Asked to send
map output locations for shuffle 3 to bignode2:60490
15/08/03 16:46:02 INFO spark.MapOutputTrackerMasterEndpoint: Asked to send
map output locations for shuffle 3 to bignode2:56319
15/08/03 16:46:02 INFO spark.MapOutputTrackerMasterEndpoint: Asked to send
map output locations for shuffle 3 to bignode1:58179
15/08/03 16:46:02 INFO spark.MapOutputTrackerMasterEndpoint: Asked to send
map output locations for shuffle 3 to bignode1:32816
15/08/03 16:46:02 INFO spark.MapOutputTrackerMasterEndpoint: Asked to send
map output locations for shuffle 3 to bignode3:55840
15/08/03 16:46:02 INFO spark.MapOutputTrackerMasterEndpoint: Asked to send
map output locations for shuffle 3 to bignode3:46874
15/08/03 16:46:02 WARN scheduler.TaskSetManager: Lost task 42.0 in stage
158.0 (TID 1548, bignode4): java.io.EOFException
        at java.io.DataInputStream.readInt(DataInputStream.java:392)
        at
org.apache.spark.sql.execution.UnsafeRowSerializerInstance$$anon$3$$anon$1.next(UnsafeRowSerializer.scala:118)
        at
org.apache.spark.sql.execution.UnsafeRowSerializerInstance$$anon$3$$anon$1.next(UnsafeRowSerializer.scala:107)
        at scala.collection.Iterator$$anon$13.next(Iterator.scala:372)
        at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
        at
org.apache.spark.util.CompletionIterator.next(CompletionIterator.scala:30)
        at
org.apache.spark.InterruptibleIterator.next(InterruptibleIterator.scala:43)
        at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
        at
org.apache.spark.sql.execution.UnsafeExternalRowSorter.sort(UnsafeExternalRowSorter.java:167)
        at
org.apache.spark.sql.execution.TungstenSort$$anonfun$doExecute$3.apply(sort.scala:140)
        at
org.apache.spark.sql.execution.TungstenSort$$anonfun$doExecute$3.apply(sort.scala:120)
        at
org.apache.spark.rdd.RDD$$anonfun$mapPartitions$1$$anonfun$apply$17.apply(RDD.scala:686)
        at
org.apache.spark.rdd.RDD$$anonfun$mapPartitions$1$$anonfun$apply$17.apply(RDD.scala:686)
        at
org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:35)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:277)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:244)
        at
org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:35)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:277)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:244)
        at
org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:35)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:277)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:244)
        at
org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:71)
        at
org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
        at org.apache.spark.scheduler.Task.run(Task.scala:86)
        at
org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
        at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)




--
View this message in context: http://apache-spark-developers-list.1001551.n3.nabble.com/Came-across-Spark-SQL-hang-Error-issue-with-Spark-1-5-Tungsten-feature-tp13537p13563.html
Sent from the Apache Spark Developers List mailing list archive at Nabble.com.

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@spark.apache.org
For additional commands, e-mail: dev-help@spark.apache.org


Mime
View raw message