spark-user mailing list archives

From Vipul Pandey <vipan...@gmail.com>
Subject Re: Spark (trunk/yarn) on CDH4.3.0.2 - YARN
Date Mon, 09 Sep 2013 17:32:08 GMT
Thanks for the tip - I'm building off of master and against CDH4.3.0 now (my
cluster is CDH4.3.0.2) - the corresponding Apache Hadoop version is 2.0.0:
http://www.cloudera.com/content/cloudera-content/cloudera-docs/PkgVer/3.25.2013/CDH-Version-and-Packaging-Information/cdhvd_topic_3_1.html

After following the instructions in the doc below, here's what I found:

- SPARK_HADOOP_VERSION=2.0.0-cdh4.3.0 SPARK_YARN=true ./sbt/sbt assembly
This results in *module not found:
org.apache.hadoop#hadoop-client;2.0.0-mr2-cdh4.3.0.2*
with below as one of the warning messages
[warn] ==== Cloudera Repository: tried
[warn]   http://repository.cloudera.com/artifactory/cloudera-repos/org/apache/hadoop/hadoop-client/2.0.0-mr2-cdh4.3.0.2/hadoop-client-2.0.0-mr2-cdh4.3.0.2.pom

I realized that they have made their repository secure now, so plain http won't
work. Changing it to *https* in SparkBuild.scala fixes the resolution. Someone
may want to make that change and check it in.
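
For reference, the change is just the resolver URL - something along these
lines in project/SparkBuild.scala (the exact shape of the surrounding settings
may differ on master):

    // Cloudera resolver in project/SparkBuild.scala, switched from http to https
    resolvers += "Cloudera Repository" at
      "https://repository.cloudera.com/artifactory/cloudera-repos"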

Also, executing the assembly command above does not generate the examples jar
as mentioned in the directions. I had to run sbt package to get that jar and
then rerun the assembly.
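
So the full sequence that ended up working for me was roughly this (same env
vars on both invocations):

    SPARK_HADOOP_VERSION=2.0.0-cdh4.3.0 SPARK_YARN=true ./sbt/sbt package
    SPARK_HADOOP_VERSION=2.0.0-cdh4.3.0 SPARK_YARN=true ./sbt/sbt assembly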

I was able to run the example just fine.


Now, the next question: how should I initialize my SparkContext for YARN?
This is what I had with the standalone mode:
    val sc = new SparkContext("spark://a.b.c:7077", "indexXformation", "", Seq())
Do I change something here, or will the client pick up the YARN configuration
from the Hadoop config?
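
My guess from running-on-yarn.md is that the master URL becomes
"yarn-standalone" and the RM address comes out of the Hadoop configs via
HADOOP_CONF_DIR rather than a spark:// URL - something like this (untested,
and the jar path is just a placeholder):

    // master "yarn-standalone" instead of spark://host:port; the YARN/RM
    // endpoints are read from the Hadoop configuration on HADOOP_CONF_DIR
    val sc = new SparkContext("yarn-standalone", "indexXformation",
      System.getenv("SPARK_HOME"), Seq("target/my-app.jar"))

Can someone confirm?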

Vipul



On Fri, Sep 6, 2013 at 4:30 PM, Tom Graves <tgraves_cs@yahoo.com> wrote:

> Which spark branch are you building off of?
> If using master branch follow the directions here:
> https://github.com/mesos/spark/blob/master/docs/running-on-yarn.md
>
> Make sure to set your Hadoop version to CDH.
>
> I'm not sure what the CDH versions map to in regular Apache Hadoop, but if
> it's newer than Apache Hadoop 2.0.5-alpha then they changed the
> YARN APIs, so it won't work without changes to the app master.
>
> Tom
>
> On Sep 6, 2013, at 5:37 PM, Vipul Pandey <vipandey@gmail.com> wrote:
>
> I'm unable to successfully run the SparkPi example in my YARN cluster.
>
> I did whatever has been specified here (didn't change anything anywhere): http://spark.incubator.apache.org/docs/0.7.0/running-on-yarn.html
>
> and added HADOOP_CONF_DIR as well. (btw, on sbt/sbt assembly - the jar file it generates is spark-core-assembly-*0.6.0*.jar)
>
>
> I get the following exception in my container :
>
>
> Exception in thread "main" java.lang.reflect.UndeclaredThrowableException
> 	at org.apache.hadoop.yarn.exceptions.impl.pb.YarnRemoteExceptionPBImpl.unwrapAndThrowException(YarnRemoteExceptionPBImpl.java:135)
> 	at org.apache.hadoop.yarn.api.impl.pb.client.AMRMProtocolPBClientImpl.registerApplicationMaster(AMRMProtocolPBClientImpl.java:103)
> 	at spark.deploy.yarn.ApplicationMaster.registerApplicationMaster(ApplicationMaster.scala:123)
> 	at spark.deploy.yarn.ApplicationMaster.spark$deploy$yarn$ApplicationMaster$$runImpl(ApplicationMaster.scala:52)
> 	at spark.deploy.yarn.ApplicationMaster$$anon$4.run(ApplicationMaster.scala:42)
> 	at java.security.AccessController.doPrivileged(Native Method)
> 	at javax.security.auth.Subject.doAs(Subject.java:396)
> 	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1478)
> 	at spark.deploy.yarn.ApplicationMaster.run(ApplicationMaster.scala:40)
> 	at spark.deploy.yarn.ApplicationMaster$.main(ApplicationMaster.scala:340)
> 	at spark.deploy.yarn.ApplicationMaster.main(ApplicationMaster.scala)
> Caused by: com.google.protobuf.ServiceException: java.io.IOException: Failed on local exception: com.google.protobuf.InvalidProtocolBufferException: Message missing required fields: callId, status; Host Details : local host is: "rd17d01ls-vm0109.rd.geo.apple.com/17.134.172.65"; destination host is: "rd17d01ls-vm0110.rd.geo.apple.com":8030;
> 	at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:212)
> 	at $Proxy7.registerApplicationMaster(Unknown Source)
> 	at org.apache.hadoop.yarn.api.impl.pb.client.AMRMProtocolPBClientImpl.registerApplicationMaster(AMRMProtocolPBClientImpl.java:100)
> 	... 9 more
> Caused by: java.io.IOException: Failed on local exception: com.google.protobuf.InvalidProtocolBufferException: Message missing required fields: callId, status; Host Details : local host is: "rd17d01ls-vm0109.rd.geo.apple.com/17.134.172.65"; destination host is: "rd17d01ls-vm0110.rd.geo.apple.com":8030;
> 	at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:761)
> 	at org.apache.hadoop.ipc.Client.call(Client.java:1239)
> 	at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:202)
> 	... 11 more
> Caused by: com.google.protobuf.InvalidProtocolBufferException: Message missing required fields: callId, status
> 	at com.google.protobuf.UninitializedMessageException.asInvalidProtocolBufferException(UninitializedMessageException.java:81)
> 	at org.apache.hadoop.ipc.protobuf.RpcPayloadHeaderProtos$RpcResponseHeaderProto$Builder.buildParsed(RpcPayloadHeaderProtos.java:1094)
> 	at org.apache.hadoop.ipc.protobuf.RpcPayloadHeaderProtos$RpcResponseHeaderProto$Builder.access$1300(RpcPayloadHeaderProtos.java:1028)
> 	at org.apache.hadoop.ipc.protobuf.RpcPayloadHeaderProtos$RpcResponseHeaderProto.parseDelimitedFrom(RpcPayloadHeaderProtos.java:986)
> 	at org.apache.hadoop.ipc.Client$Connection.receiveResponse(Client.java:946)
> 	at org.apache.hadoop.ipc.Client$Connection.run(Client.java:844)
>
> Any solutions, anyone?
>
>
