spark-user mailing list archives

From Jean-Baptiste Onofré <...@nanthrax.net>
Subject Re: Spark shuffle service does not work in stand alone
Date Tue, 13 Oct 2015 14:31:55 GMT
Hi,

AFAIK, the shuffle service makes sense only to delegate the shuffle to
MapReduce (the MapReduce shuffle is faster than the Spark shuffle most
of the time).
As you run in standalone mode, the shuffle service will use the Spark shuffle.

Not 100% sure, though.
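For what it's worth, here is a sketch of how the external shuffle service is usually enabled in standalone mode (based on the documented Spark 1.5 configuration keys; the conf/spark-env.sh location is the conventional one, adjust for your deployment):

```shell
# On every worker node, in conf/spark-env.sh, so the Worker process
# itself hosts the external shuffle service (standalone mode only):
export SPARK_WORKER_OPTS="-Dspark.shuffle.service.enabled=true"

# In the application, e.g. via spark-submit, so executors register
# their shuffle output with that service. Dynamic allocation requires
# the shuffle service to be enabled:
#   spark-submit \
#     --conf spark.shuffle.service.enabled=true \
#     --conf spark.dynamicAllocation.enabled=true \
#     ...
```

The workers need a restart after changing SPARK_WORKER_OPTS, since the service runs inside the Worker JVM.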

Regards
JB

On 10/13/2015 04:23 PM, Saif.A.Ellafi@wellsfargo.com wrote:
> Has anyone tried the shuffle service in Stand Alone cluster mode? I want to
> enable it for dynamic allocation, but my jobs never start when I submit them.
> This happens with all my jobs.
> 15/10/13 08:29:45 INFO DAGScheduler: Job 0 failed: json at DataLoader.scala:86, took 16.318615 s
> Exception in thread "main" org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 0.0 failed 4 times, most recent failure: Lost task 0.3 in stage 0.0 (TID 7, 162.101.194.47): ExecutorLostFailure (executor 4 lost)
> Driver stacktrace:
>         at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1283)
>         at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1271)
>         at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1270)
>         at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
>         at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
>         at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1270)
>         at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:697)
>         at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:697)
>         at scala.Option.foreach(Option.scala:236)
>         at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:697)
>         at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:1496)
>         at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1458)
>         at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1447)
>         at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48)
>         at org.apache.spark.scheduler.DAGScheduler.runJob(DAGScheduler.scala:567)
>         at org.apache.spark.SparkContext.runJob(SparkContext.scala:1822)
>         at org.apache.spark.SparkContext.runJob(SparkContext.scala:1942)
>         at org.apache.spark.rdd.RDD$$anonfun$reduce$1.apply(RDD.scala:1003)
>         at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:147)
>         at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:108)
>         at org.apache.spark.rdd.RDD.withScope(RDD.scala:306)
>         at org.apache.spark.rdd.RDD.reduce(RDD.scala:985)
>         at org.apache.spark.rdd.RDD$$anonfun$treeAggregate$1.apply(RDD.scala:1114)
>         at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:147)
>         at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:108)
>         at org.apache.spark.rdd.RDD.withScope(RDD.scala:306)
>         at org.apache.spark.rdd.RDD.treeAggregate(RDD.scala:1091)
>         at org.apache.spark.sql.execution.datasources.json.InferSchema$.apply(InferSchema.scala:58)
>         at org.apache.spark.sql.execution.datasources.json.JSONRelation$$anonfun$6.apply(JSONRelation.scala:105)
>         at org.apache.spark.sql.execution.datasources.json.JSONRelation$$anonfun$6.apply(JSONRelation.scala:100)
>         at scala.Option.getOrElse(Option.scala:120)
>         at org.apache.spark.sql.execution.datasources.json.JSONRelation.dataSchema$lzycompute(JSONRelation.scala:100)
>         at org.apache.spark.sql.execution.datasources.json.JSONRelation.dataSchema(JSONRelation.scala:99)
>         at org.apache.spark.sql.sources.HadoopFsRelation.schema$lzycompute(interfaces.scala:561)
>         at org.apache.spark.sql.sources.HadoopFsRelation.schema(interfaces.scala:560)
>         at org.apache.spark.sql.execution.datasources.LogicalRelation.<init>(LogicalRelation.scala:31)
>         at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:120)
>         at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:104)
>         at org.apache.spark.sql.DataFrameReader.json(DataFrameReader.scala:219)
>         at org.apache.saif.loaders.DataLoader$.load_json(DataLoader.scala:86)
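One pattern consistent with executors being lost immediately at job start (a hypothetical diagnosis, not confirmed from this trace): the application sets spark.shuffle.service.enabled=true, but the standalone workers were never restarted with that option, so each executor dies while trying to register with a shuffle service that is not running. A quick check on a worker node (commands are an illustrative sketch):

```shell
# Was the Worker JVM started with the shuffle-service flag?
ps aux | grep "[W]orker" | grep -o "spark.shuffle.service.enabled=[a-z]*"

# Is anything listening on the shuffle service port?
# (7337 is the default, configurable via spark.shuffle.service.port)
netstat -tln 2>/dev/null | grep 7337
```

If the flag or the listener is missing, restarting the workers with SPARK_WORKER_OPTS="-Dspark.shuffle.service.enabled=true" would be the thing to try.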

-- 
Jean-Baptiste Onofré
jbonofre@apache.org
http://blog.nanthrax.net
Talend - http://www.talend.com

---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscribe@spark.apache.org
For additional commands, e-mail: user-help@spark.apache.org

