spark-user mailing list archives

From Mathieu Longtin <math...@closetwork.org>
Subject Re: Not able to write output to local filesystem from Standalone mode.
Date Wed, 25 May 2016 18:23:02 GMT
From experience. I don't use Mesos, YARN, or Hadoop, so I don't know.
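
To make the mechanism concrete: the Hadoop FileOutputCommitter behind saveAsTextFile has the driver create `<output>/_temporary` during job setup, and each task refuses to open its writer unless that directory exists. The handshake can be sketched in plain Python, with no Spark needed (the function names here are illustrative, not Spark's own):

```python
import os
import shutil
import tempfile

def driver_setup_job(output_dir):
    # Job setup runs on the driver only: it creates the temporary
    # job-output directory on the driver's local disk.
    os.makedirs(os.path.join(output_dir, "_temporary"))

def task_open_writer(output_dir, part):
    # Each task checks for the temporary directory before opening its
    # writer, mirroring FileOutputCommitter.getWorkPath in the trace below.
    tmp = os.path.join(output_dir, "_temporary")
    if not os.path.isdir(tmp):
        raise IOError(
            "The temporary job-output directory %s doesn't exist!" % tmp)
    with open(os.path.join(tmp, "part-%05d" % part), "w") as f:
        f.write("data\n")

# Model the driver's and a worker's local disks as two separate
# directories: without a shared filesystem they see different files.
driver_disk = tempfile.mkdtemp()
worker_disk = tempfile.mkdtemp()

driver_setup_job(os.path.join(driver_disk, "test1"))

# On the driver's own machine the write succeeds...
task_open_writer(os.path.join(driver_disk, "test1"), 0)

# ...but on a worker, test1/_temporary was never created, so the task
# fails just like the executors in the stack trace.
err_msg = ""
try:
    task_open_writer(os.path.join(worker_disk, "test1"), 1)
except IOError as e:
    err_msg = str(e)
print(err_msg)

shutil.rmtree(driver_disk)
shutil.rmtree(worker_disk)
```

Which is why the usual fixes are to write to storage every node can see (an NFS mount, HDFS, S3), or to collect() small results and write them out on the driver.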


On Wed, May 25, 2016 at 2:51 AM Jacek Laskowski <jacek@japila.pl> wrote:

> Hi Mathieu,
>
> Thanks a lot for the answer! I did *not* know it's the driver that
> creates the directory.
>
> You said "standalone mode"; is this also the case for the other modes,
> YARN and Mesos?
>
> p.s. Did you find it in the code or...just experienced before? #curious
>
> Pozdrawiam,
> Jacek Laskowski
> ----
> https://medium.com/@jaceklaskowski/
> Mastering Apache Spark http://bit.ly/mastering-apache-spark
> Follow me at https://twitter.com/jaceklaskowski
>
>
> On Tue, May 24, 2016 at 4:04 PM, Mathieu Longtin <mathieu@closetwork.org>
> wrote:
> > In standalone mode, executors assume they have access to a shared
> > filesystem. The driver creates the output directory and the executors
> > write the files, so the executors end up writing nothing because the
> > directory does not exist on their local filesystems.
> >
> > On Tue, May 24, 2016 at 8:01 AM Stuti Awasthi <stutiawasthi@hcl.com>
> wrote:
> >>
> >> Hi Jacek,
> >>
> >> The parent directory is already present; it's my home directory. I'm
> >> using a 64-bit Linux (Red Hat) machine.
> >> I also noticed that a "test1" folder is created on my master with an
> >> empty "_temporary" subdirectory, but on the slaves no such directory
> >> is created under /home/stuti.
> >>
> >> Thanks
> >> Stuti
> >> ________________________________
> >> From: Jacek Laskowski [jacek@japila.pl]
> >> Sent: Tuesday, May 24, 2016 5:27 PM
> >> To: Stuti Awasthi
> >> Cc: user
> >> Subject: Re: Not able to write output to local filesystem from Standalone
> >> mode.
> >>
> >> Hi,
> >>
> >> What happens when you create the parent directory /home/stuti? I think
> the
> >> failure is due to missing parent directories. What's the OS?
> >>
> >> Jacek
> >>
> >> On 24 May 2016 11:27 a.m., "Stuti Awasthi" <stutiawasthi@hcl.com>
> wrote:
> >>
> >> Hi All,
> >>
> >> I have a 3-node Spark 1.6 standalone-mode cluster with 1 master and 2
> >> slaves. I am not using Hadoop as the filesystem. I am able to launch the
> >> shell, read the input file from the local filesystem, and perform
> >> transformations successfully. When I try to write my output to a local
> >> filesystem path, I receive the error below.
> >>
> >>
> >>
> >> I searched the web and found a similar JIRA:
> >> https://issues.apache.org/jira/browse/SPARK-2984 . Although it is marked
> >> resolved for Spark 1.3+, people have posted that the same issue still
> >> persists in later versions.
> >>
> >>
> >>
> >> ERROR
> >>
> >> scala> data.saveAsTextFile("/home/stuti/test1")
> >>
> >> 16/05/24 05:03:42 WARN TaskSetManager: Lost task 1.0 in stage 1.0 (TID 2,
> >> server1): java.io.IOException: The temporary job-output directory
> >> file:/home/stuti/test1/_temporary doesn't exist!
> >>         at org.apache.hadoop.mapred.FileOutputCommitter.getWorkPath(FileOutputCommitter.java:250)
> >>         at org.apache.hadoop.mapred.FileOutputFormat.getTaskOutputPath(FileOutputFormat.java:244)
> >>         at org.apache.hadoop.mapred.TextOutputFormat.getRecordWriter(TextOutputFormat.java:116)
> >>         at org.apache.spark.SparkHadoopWriter.open(SparkHadoopWriter.scala:91)
> >>         at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsHadoopDataset$1$$anonfun$13.apply(PairRDDFunctions.scala:1193)
> >>         at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsHadoopDataset$1$$anonfun$13.apply(PairRDDFunctions.scala:1185)
> >>         at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66)
> >>         at org.apache.spark.scheduler.Task.run(Task.scala:89)
> >>         at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:213)
> >>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> >>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> >>         at java.lang.Thread.run(Thread.java:745)
> >>
> >>
> >>
> >> What is the best way to resolve this issue if I don't want to install
> >> Hadoop? Or is it mandatory to have Hadoop in order to write output from a
> >> standalone cluster?
> >>
> >>
> >>
> >> Please suggest.
> >>
> >>
> >>
> >> Thanks & Regards
> >>
> >> Stuti Awasthi
> >>
> >>
> >>
> >>
> >>
> >
> > --
> > Mathieu Longtin
> > 1-514-803-8977
>
-- 
Mathieu Longtin
1-514-803-8977
