spark-user mailing list archives

From Vipul Pandey <vipan...@gmail.com>
Subject Re: job failing in standalone mode
Date Fri, 13 Sep 2013 23:28:33 GMT
btw, here's what I'm doing : 
- I have my spark jar file in the lib directory of my project
- I build my project with sbt  (sbt/sbt package)
- then I run : sbt/sbt run

here's what my main method looks like : 


  def main(args: Array[String]) {
    println("Started")

    val sc = new SparkContext("spark://<MASTER_IP>:7077", "indexXformation", "", Seq())

    val textFile = sc.textFile("hdfs://abc.xyz.com/user/vipul/index/part-r-00298")
    println("textFile")

    // it's of the form (A	B	C	{x,y,z ....})
    // I'm converting raw text to some object structure "Index".
    // I need to use it later for various things, but for testing out standalone mode
    // I'm just writing the objects back to a file.

    val index = textFile.map(_.split('\t')).map(line =>
      (line(0), new Index(line(0), line(1), line(2), toSeq(line(3)))))
    println("index")      // THIS PRINTS
    println(index.first)  // FIRST ELEMENT IS PRINTED AS WELL
    println(index.count)  // IT FAILS RIGHT HERE ... and fails "fast"

    index.saveAsTextFile("hdfs://abc.xyz.com/user/midas//new_index_sa2")
    println("written")
  }
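
The Index class and the toSeq helper are not shown in this message, so here is only a rough sketch of what the per-line parsing step might look like, runnable without a cluster. The Index case class, its field names, the "{x,y,z}" handling in toSeq, and the parseLine helper are all assumptions made for illustration, not the actual code from my project.

```scala
// Hypothetical stand-in: the real Index class is not shown in the message.
case class Index(a: String, b: String, c: String, items: Seq[String])

object IndexParsing {
  // Assumed behaviour: turn "{x,y,z}" into List("x", "y", "z");
  // an empty "{}" gives an empty list.
  def toSeq(raw: String): Seq[String] = {
    val inner = raw.trim.stripPrefix("{").stripSuffix("}")
    if (inner.trim.isEmpty) Seq.empty
    else inner.split(',').map(_.trim).toList
  }

  // Parse one tab-separated line of the form A<TAB>B<TAB>C<TAB>{x,y,z}.
  // Returns None instead of throwing when fewer than four fields are present.
  def parseLine(line: String): Option[(String, Index)] = {
    val fields = line.split("\t", -1)
    if (fields.length < 4) None
    else Some((fields(0), Index(fields(0), fields(1), fields(2), toSeq(fields(3)))))
  }

  def main(args: Array[String]): Unit = {
    println(parseLine("A\tB\tC\t{x,y,z}"))
    println(parseLine("too\tshort"))
  }
}
```

Checking the field count before indexing keeps a malformed line from throwing an ArrayIndexOutOfBoundsException inside a task, which makes it easier to rule out bad input when a job dies without a useful message.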


btw, the same steps work just fine on Spark Shell. 

~Vipul




On Sep 13, 2013, at 11:59 AM, Vipul Pandey <vipandey@gmail.com> wrote:

> Didn't know about the location of those logs - so thanks for pointing out.  But no, there's nothing useful in them. 
> 
> here's an example
> 
> log4j:WARN No appenders could be found for logger (akka.event.slf4j.Slf4jEventHandler).
> log4j:WARN Please initialize the log4j system properly.
> log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
> Spark Executor Command: "java" "-cp" ":/opt/geo/midas/spark/spark/conf:/opt/geo/midas/spark/spark/assembly/target/scala-2.9.3/spark-assembly-0.8.0-SNAPSHOT-hadoop2.0.0-cdh4.3.0.jar" "-Xms512M" "-Xmx512M" "org.apache.spark.executor.StandaloneExecutorBackend" "akka://spark@rd..abc.xyz:60874/user/StandaloneScheduler" "3" "rd.abc.xyz.com" "4"
> 
> 
> I'm able to write to hdfs from the spark-shell just fine. it's only this separate application mode that's not working
> 
> 
> 
> 
> 
> On Sep 13, 2013, at 10:52 AM, Matei Zaharia <matei.zaharia@gmail.com> wrote:
> 
>> Have you looked at the stdout and stderr files created for the job on the worker nodes? By default they're in the "work" directory under SPARK_HOME.
>> 
>> In my experience this either means no write permissions to the filesystem, or no Java found.
>> 
>> Matei
>> 
>> On Sep 12, 2013, at 10:59 PM, Vipul Pandey <vipandey@gmail.com> wrote:
>> 
>>> - Master Branch
>>> - Standalone Mode
>>> 
>>> I'm able to run some basic commands on the Spark Shell. But when I package the same commands in a scala object in my app and run it separately - it just fails within a second without giving any reason. 
>>> 
>>> This is what I see in the master logs : 
>>> 
>>> 13/09/12 22:20:22 INFO Master: Registering app indexXformation
>>> 13/09/12 22:20:22 INFO Master: Registered app indexXformation with ID app-20130912222022-0010
>>> 13/09/12 22:20:22 INFO Master: Launching executor app-20130912222022-0010/0 on worker worker-20130911184654.xyz.abc..com-57772
>>> 13/09/12 22:20:22 INFO Master: Launching executor app-20130912222022-0010/1 on worker worker-20130912180752-abc-vm0105.xyz.com-43175
>>> 13/09/12 22:20:22 INFO Master: Launching executor app-20130912222022-0010/2 on worker worker-20130911184654-abc-vm0108.xyz.com-39247
>>> 13/09/12 22:20:22 INFO Master: Launching executor app-20130912222022-0010/3 on worker worker-20130912183838-abc-vm0105.xyz.com-43175
>>> 13/09/12 22:20:22 INFO Master: Launching executor app-20130912222022-0010/4 on worker worker-20130911184654-abc-vm0109.xyz.com-57730
>>> 13/09/12 22:20:22 INFO Master: Launching executor app-20130912222022-0010/5 on worker worker-20130911184654-abc-vm0105.xyz.com-43175
>>> 13/09/12 22:20:22 INFO Master: Launching executor app-20130912222022-0010/6 on worker worker-20130911184654-abc-vm0106.xyz.com-43044
>>> 13/09/12 22:20:22 INFO Master: Removing executor app-20130912222022-0010/0 because it is FAILED
>>> 13/09/12 22:20:22 INFO Master: Launching executor app-20130912222022-0010/7 on worker worker-20130911184654-abc-vm0107.xyz.com-57772
>>> 13/09/12 22:20:22 INFO Master: Removing executor app-20130912222022-0010/1 because it is FAILED
>>> 13/09/12 22:20:22 INFO Master: Launching executor app-20130912222022-0010/8 on worker worker-20130912180752-abc-vm0105.xyz.com-43175
>>> 13/09/12 22:20:22 INFO Master: Removing executor app-20130912222022-0010/2 because it is FAILED
>>> 13/09/12 22:20:22 INFO Master: Launching executor app-20130912222022-0010/9 on worker worker-20130911184654-abc-vm0108.xyz.com-39247
>>> 13/09/12 22:20:22 INFO Master: Removing executor app-20130912222022-0010/4 because it is FAILED
>>> 13/09/12 22:20:22 INFO Master: Launching executor app-20130912222022-0010/10 on worker worker-20130911184654-abc-vm0109.xyz.com-57730
>>> 13/09/12 22:20:22 INFO Master: Removing executor app-20130912222022-0010/3 because it is FAILED
>>> 13/09/12 22:20:22 INFO Master: Launching executor app-20130912222022-0010/11 on worker worker-20130912183838-abc-vm0105.xyz.com-43175
>>> 13/09/12 22:20:22 INFO Master: Removing executor app-20130912222022-0010/6 because it is FAILED
>>> 13/09/12 22:20:22 INFO Master: Launching executor app-20130912222022-0010/12 on worker worker-20130911184654-abc-vm0106.xyz.com-43044
>>> 13/09/12 22:20:22 INFO Master: Removing executor app-20130912222022-0010/7 because it is FAILED
>>> 13/09/12 22:20:22 INFO Master: Launching executor app-20130912222022-0010/13 on worker worker-20130911184654-abc-vm0107.xyz.com-57772
>>> 13/09/12 22:20:22 INFO Master: Removing executor app-20130912222022-0010/5 because it is FAILED
>>> 13/09/12 22:20:22 INFO Master: Launching executor app-20130912222022-0010/14 on worker worker-20130911184654-abc-vm0105.xyz.com-43175
>>> 13/09/12 22:20:22 INFO Master: Removing executor app-20130912222022-0010/8 because it is FAILED
>>> 13/09/12 22:20:22 INFO Master: Launching executor app-20130912222022-0010/15 on worker worker-20130912180752-abc-vm0105.xyz.com-43175
>>> 13/09/12 22:20:22 INFO Master: Removing executor app-20130912222022-0010/9 because it is FAILED
>>> 13/09/12 22:20:22 ERROR Master: Application indexXformation with ID app-20130912222022-0010 failed 10 times, removing it
>>> 13/09/12 22:20:22 INFO Master: Removing app app-20130912222022-0010
>>> 
>>> 
>>> and the slave logs say nothing at all. 
>>> 
>>> I read lines from a file and transform them into a different form. The "transformedRDD".first runs just fine and prints out the first value but RDD.count just fails without giving any reason.
>>> I'm unable to find out why. I'm deploying the correct spark jar file in my project as well. 
>>> Any clues anyone?	
>>> again, this is the master branch and in standalone mode. 
>>> 
>>> 
>>> 
>>> 
>>> 
>> 
> 

