From: Jianshi Huang <jianshi.huang@gmail.com>
To: dev@spark.apache.org, user@spark.apache.org
Date: Fri, 5 Dec 2014 13:51:51 +0800
Subject: Re: Exception adding resource files in latest Spark

I created a ticket for this:
https://issues.apache.org/jira/browse/SPARK-4757

Jianshi

On Fri, Dec 5, 2014 at 1:31 PM, Jianshi Huang <jianshi.huang@gmail.com> wrote:

> Correction:
>
> According to Liancheng, this hotfix might be the root cause:
>
> https://github.com/apache/spark/commit/38cb2c3a36a5c9ead4494cbc3dde008c2f0698ce
>
> Jianshi
>
> On Fri, Dec 5, 2014 at 12:45 PM, Jianshi Huang <jianshi.huang@gmail.com> wrote:
>
>> Looks like the datanucleus*.jar files shouldn't appear in the HDFS path
>> in yarn-client mode.
>>
>> Maybe this patch broke yarn-client:
>>
>> https://github.com/apache/spark/commit/a975dc32799bb8a14f9e1c76defaaa7cfbaf8b53
>>
>> Jianshi
>>
>> On Fri, Dec 5, 2014 at 12:02 PM, Jianshi Huang <jianshi.huang@gmail.com> wrote:
>>
>>> Actually, my HADOOP_CLASSPATH has already been set to include
>>> /etc/hadoop/conf/*:
>>>
>>> export HADOOP_CLASSPATH=/etc/hbase/conf/hbase-site.xml:/usr/lib/hbase/lib/hbase-protocol.jar:$(hbase classpath)
>>>
>>> Jianshi
>>>
>>> On Fri, Dec 5, 2014 at 11:54 AM, Jianshi Huang <jianshi.huang@gmail.com> wrote:
>>>
>>>> Looks like somehow Spark failed to find core-site.xml in
>>>> /etc/hadoop/conf.
>>>>
>>>> I've already set the following env variables:
>>>>
>>>> export YARN_CONF_DIR=/etc/hadoop/conf
>>>> export HADOOP_CONF_DIR=/etc/hadoop/conf
>>>> export HBASE_CONF_DIR=/etc/hbase/conf
>>>>
>>>> Should I add $HADOOP_CONF_DIR/* to HADOOP_CLASSPATH?
>>>>
>>>> Jianshi
>>>>
>>>> On Fri, Dec 5, 2014 at 11:37 AM, Jianshi Huang <jianshi.huang@gmail.com> wrote:
>>>>
>>>>> I got the following error during Spark startup (yarn-client mode):
>>>>>
>>>>> 14/12/04 19:33:58 INFO Client: Uploading resource
>>>>> file:/x/home/jianshuang/spark/spark-latest/lib/datanucleus-api-jdo-3.2.6.jar
>>>>> -> hdfs://stampy/user/jianshuang/.sparkStaging/application_1404410683830_531767/datanucleus-api-jdo-3.2.6.jar
>>>>> java.lang.IllegalArgumentException: Wrong FS:
>>>>> hdfs://stampy/user/jianshuang/.sparkStaging/application_1404410683830_531767/datanucleus-api-jdo-3.2.6.jar,
>>>>> expected: file:///
>>>>>         at org.apache.hadoop.fs.FileSystem.checkPath(FileSystem.java:643)
>>>>>         at org.apache.hadoop.fs.RawLocalFileSystem.pathToFile(RawLocalFileSystem.java:79)
>>>>>         at org.apache.hadoop.fs.RawLocalFileSystem.deprecatedGetFileStatus(RawLocalFileSystem.java:506)
>>>>>         at org.apache.hadoop.fs.RawLocalFileSystem.getFileLinkStatusInternal(RawLocalFileSystem.java:724)
>>>>>         at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:501)
>>>>>         at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:397)
>>>>>         at org.apache.spark.deploy.yarn.ClientDistributedCacheManager.addResource(ClientDistributedCacheManager.scala:67)
>>>>>         at org.apache.spark.deploy.yarn.ClientBase$$anonfun$prepareLocalResources$5.apply(ClientBase.scala:257)
>>>>>         at org.apache.spark.deploy.yarn.ClientBase$$anonfun$prepareLocalResources$5.apply(ClientBase.scala:242)
>>>>>         at scala.Option.foreach(Option.scala:236)
>>>>>         at org.apache.spark.deploy.yarn.ClientBase$class.prepareLocalResources(ClientBase.scala:242)
>>>>>         at org.apache.spark.deploy.yarn.Client.prepareLocalResources(Client.scala:35)
>>>>>         at org.apache.spark.deploy.yarn.ClientBase$class.createContainerLaunchContext(ClientBase.scala:350)
>>>>>         at org.apache.spark.deploy.yarn.Client.createContainerLaunchContext(Client.scala:35)
>>>>>         at org.apache.spark.deploy.yarn.Client.submitApplication(Client.scala:80)
>>>>>         at org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend.start(YarnClientSchedulerBackend.scala:57)
>>>>>         at org.apache.spark.scheduler.TaskSchedulerImpl.start(TaskSchedulerImpl.scala:140)
>>>>>         at org.apache.spark.SparkContext.<init>(SparkContext.scala:335)
>>>>>         at org.apache.spark.repl.SparkILoop.createSparkContext(SparkILoop.scala:986)
>>>>>         at $iwC$$iwC.<init>(<console>:9)
>>>>>         at $iwC.<init>(<console>:18)
>>>>>         at <init>(<console>:20)
>>>>>         at .<init>(<console>:24)
>>>>>
>>>>> I'm using the latest Spark, built from master HEAD yesterday. Is this a bug?
>>>>>
>>>>> --
>>>>> Jianshi Huang
>>>>>
>>>>> LinkedIn: jianshi
>>>>> Twitter: @jshuang
>>>>> Github & Blog: http://huangjs.github.com/
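A note on the "expected: file:///" part of the trace, for anyone who hits the same thing: when Hadoop's Configuration can't find core-site.xml on the classpath, fs.defaultFS falls back to its built-in default, the local filesystem, so FileSystem.get(conf) hands back a local FS whose checkPath() rejects any hdfs:// path exactly as above. Below is a minimal sketch of that behavior using only the stock Hadoop client API; the staging path is shortened and the object name WrongFsSketch is just for illustration, nothing here is Spark's own code.

    import org.apache.hadoop.conf.Configuration
    import org.apache.hadoop.fs.{FileSystem, Path}

    object WrongFsSketch {
      def main(args: Array[String]): Unit = {
        // With no core-site.xml on the classpath, fs.defaultFS keeps its
        // built-in default, the local filesystem.
        val conf = new Configuration()
        println(conf.get("fs.defaultFS")) // prints "file:///" when unconfigured

        // Shortened stand-in for the staging path from the log above.
        val staged = new Path("hdfs://stampy/user/jianshuang/.sparkStaging/app/foo.jar")

        // Asking the *default* (local) filesystem about an hdfs:// path trips
        // FileSystem.checkPath, which is what the trace shows:
        //   java.lang.IllegalArgumentException: Wrong FS: hdfs://..., expected: file:///
        val defaultFs = FileSystem.get(conf)
        // defaultFs.getFileStatus(staged) // would throw the Wrong FS error

        // Resolving the filesystem from the path itself avoids the mismatch
        // (needs hadoop-hdfs on the classpath and a resolvable namenode):
        // val hdfs = staged.getFileSystem(conf)
        // println(hdfs.getUri) // hdfs://stampy
      }
    }

On the classpath question above: a wildcard entry like $HADOOP_CONF_DIR/* only expands to jar files, so for core-site.xml to be picked up as a resource, the conf directory itself ($HADOOP_CONF_DIR, without the wildcard) has to be on the classpath.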
--
Jianshi Huang

LinkedIn: jianshi
Twitter: @jshuang
Github & Blog: http://huangjs.github.com/