spark-user mailing list archives

From Jacek Laskowski <ja...@japila.pl>
Subject Re: Not able to write output to local filesystem from Standalone mode.
Date Wed, 25 May 2016 06:51:05 GMT
Hi Mathieu,

Thanks a lot for the answer! I did *not* know that it's the driver that
creates the directory.

You said "standalone mode". Is this also the case for the other modes,
YARN and Mesos?

P.S. Did you find this in the code, or have you just run into it before? #curious

Pozdrawiam,
Jacek Laskowski
----
https://medium.com/@jaceklaskowski/
Mastering Apache Spark http://bit.ly/mastering-apache-spark
Follow me at https://twitter.com/jaceklaskowski


On Tue, May 24, 2016 at 4:04 PM, Mathieu Longtin <mathieu@closetwork.org> wrote:
> In standalone mode, executors assume they have access to a shared file
> system. The driver creates the output directory and the executors write the
> files, so the executors end up writing nothing, since the directory doesn't
> exist on their local disks.
>
> On Tue, May 24, 2016 at 8:01 AM Stuti Awasthi <stutiawasthi@hcl.com> wrote:
>>
>> Hi Jacek,
>>
>> The parent directory is already present; it's my home directory. I'm using
>> a 64-bit Linux (Red Hat) machine.
>> I also noticed that a "test1" folder is created on my master with an empty
>> "_temporary" subdirectory, but on the slaves no such directory is created
>> under /home/stuti.
>>
>> Thanks
>> Stuti
>> ________________________________
>> From: Jacek Laskowski [jacek@japila.pl]
>> Sent: Tuesday, May 24, 2016 5:27 PM
>> To: Stuti Awasthi
>> Cc: user
>> Subject: Re: Not able to write output to local filesystem from Standalone
>> mode.
>>
>> Hi,
>>
>> What happens when you create the parent directory /home/stuti? I think the
>> failure is due to missing parent directories. What's the OS?
>>
>> Jacek
>>
>> On 24 May 2016 11:27 a.m., "Stuti Awasthi" <stutiawasthi@hcl.com> wrote:
>>
>> Hi All,
>>
>> I have a 3-node Spark 1.6 standalone-mode cluster with 1 master and 2
>> slaves, and I am not using Hadoop as the filesystem. I am able to launch
>> the shell, read the input file from the local filesystem, and perform
>> transformations successfully. But when I try to write my output to a local
>> filesystem path, I receive the error below.
>>
>>
>>
>> I searched the web and found a similar JIRA:
>> https://issues.apache.org/jira/browse/SPARK-2984 . Although it is marked
>> resolved for Spark 1.3+, people have posted that the same issue still
>> persists in the latest versions.
>>
>>
>>
>> ERROR
>>
>> scala> data.saveAsTextFile("/home/stuti/test1")
>>
>> 16/05/24 05:03:42 WARN TaskSetManager: Lost task 1.0 in stage 1.0 (TID 2,
>> server1): java.io.IOException: The temporary job-output directory
>> file:/home/stuti/test1/_temporary doesn't exist!
>>         at org.apache.hadoop.mapred.FileOutputCommitter.getWorkPath(FileOutputCommitter.java:250)
>>         at org.apache.hadoop.mapred.FileOutputFormat.getTaskOutputPath(FileOutputFormat.java:244)
>>         at org.apache.hadoop.mapred.TextOutputFormat.getRecordWriter(TextOutputFormat.java:116)
>>         at org.apache.spark.SparkHadoopWriter.open(SparkHadoopWriter.scala:91)
>>         at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsHadoopDataset$1$$anonfun$13.apply(PairRDDFunctions.scala:1193)
>>         at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsHadoopDataset$1$$anonfun$13.apply(PairRDDFunctions.scala:1185)
>>         at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66)
>>         at org.apache.spark.scheduler.Task.run(Task.scala:89)
>>         at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:213)
>>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>>         at java.lang.Thread.run(Thread.java:745)
>>
>>
>>
>> What is the best way to resolve this issue if I don't want to have Hadoop
>> installed? Or is it mandatory to have Hadoop in order to write output from
>> Standalone cluster mode?
>>
>>
>>
>> Please suggest.
>>
>>
>>
>> Thanks & Regards
>>
>> Stuti Awasthi
>>
>
> --
> Mathieu Longtin
> 1-514-803-8977
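
The failure mode described above has a common workaround when no shared
filesystem is available: collect the results to the driver and write them
there with plain JVM I/O, instead of having each executor write to its own
local disk. The sketch below only illustrates the idea and assumes the result
fits in driver memory; the `collected` value and the output path are
stand-ins (in the thread's session the first line would be `data.collect()`):

```scala
import java.io.{File, PrintWriter}
import scala.io.Source

object CollectAndWrite {
  def main(args: Array[String]): Unit = {
    // In a real spark-shell session this would be: val collected = data.collect()
    // A plain Seq stands in here so the sketch runs without a cluster.
    val collected: Seq[String] = Seq("a", "b", "c")

    // Driver-local write: only the driver's filesystem is touched,
    // so the executors never need to see this directory.
    val out = new File("/tmp/collect-and-write-demo.txt")
    val pw = new PrintWriter(out)
    try collected.foreach(pw.println) finally pw.close()

    // Read the file back and print its contents on one line.
    println(Source.fromFile(out).getLines().mkString(","))
  }
}
```

This only scales to small results; for large output the usual answer is the
one implied in the thread: give every node the same path via a shared
filesystem (NFS, HDFS, S3, etc.) so each executor can write its own part
files.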

---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscribe@spark.apache.org
For additional commands, e-mail: user-help@spark.apache.org

