spark-user mailing list archives

From JG Perrin <jper...@lumeris.com>
Subject RE: [Spark-Submit] Where to store data files while running job in cluster mode?
Date Fri, 29 Sep 2017 19:00:45 GMT
On a test system, you can also use something like ownCloud/Nextcloud/Dropbox to ensure that
the files are synchronized across the nodes. I would not do it for terabytes of data ;) ...

-----Original Message-----
From: Jörn Franke [mailto:jornfranke@gmail.com] 
Sent: Friday, September 29, 2017 5:14 AM
To: Gaurav1809 <gauravhpandya@gmail.com>
Cc: user@spark.apache.org
Subject: Re: [Spark-Submit] Where to store data files while running job in cluster mode?

You should use a distributed filesystem such as HDFS. If you want to use the local filesystem
then you have to copy each file to each node.
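
As a rough sketch of both options (not tested against the original poster's cluster), the
read path can carry an explicit URI scheme so it is clear which filesystem Spark will use.
The NameNode host, port, and HDFS directory below are hypothetical placeholders, and the
sketch assumes the data was already uploaded to HDFS, for example with `hdfs dfs -put`.

```scala
import org.apache.spark.sql.SparkSession

object SampleFlightsApp {
  // Hypothetical HDFS location: replace the host, port, and directory with your own.
  val HdfsPath = "hdfs://namenode:8020/data/sampleflightdata"

  // Local-filesystem alternative: this only works if the same path exists
  // on the driver machine and on every worker node.
  val LocalPath = "file:///home/username/sampleflightdata"

  def main(args: Array[String]): Unit = {
    val spark = SparkSession
      .builder()
      .appName("SampleFlightsApp")
      .master("spark://masterIP:7077")
      .getOrCreate()

    // Executors on any node can resolve an hdfs:// path, so no per-node copy is needed.
    val flightDF = spark.read.option("header", true).csv(HdfsPath)
    flightDF.printSchema()

    spark.stop()
  }
}
```

Switching `HdfsPath` for `LocalPath` in the read call corresponds to the second option, with
the caveat that a file present only on the master will still fail with a FileNotFoundException
on the workers.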

> On 29. Sep 2017, at 12:05, Gaurav1809 <gauravhpandya@gmail.com> wrote:
> 
> Hi All,
> 
> I have a multi-node Spark cluster (1 master, 2 workers). The job reads
> CSV file data, and it works fine when run in local mode (local[*]).
> However, when the same job is run in cluster mode (spark://HOST:PORT),
> it is not able to read the file.
> How should I reference the files, or where should I store them?
> Currently the CSV data file is on the master (from where the job is submitted).
> 
> Following code works fine in local mode but not in cluster mode.
> 
> val spark = SparkSession
>   .builder()
>   .appName("SampleFlightsApp")
>   .master("spark://masterIP:7077") // change it to .master("local[*]") for local mode
>   .getOrCreate()
> 
> val flightDF = spark.read.option("header", true).csv("/home/username/sampleflightdata")
> flightDF.printSchema()
> 
> Error: FileNotFoundException: File file:/home/username/sampleflightdata does not exist
> 
> 
> 
> --
> Sent from: http://apache-spark-user-list.1001560.n3.nabble.com/
> 
> ---------------------------------------------------------------------
> To unsubscribe e-mail: user-unsubscribe@spark.apache.org
> 
