spark-user mailing list archives

From "Pagliari, Roberto" <>
Subject RE: Spark SQL configuration
Date Mon, 27 Oct 2014 04:16:12 GMT
What is a YARN cluster?

And does Spark necessarily need Hadoop already installed on the cluster? For example, can
one download Spark and run it on a bunch of nodes with no prior installation of Hadoop?


From: Yi Tian []
Sent: Sunday, October 26, 2014 9:08 PM
To: Pagliari, Roberto
Subject: Re: Spark SQL configuration

You can set `HADOOP_CONF_DIR=your_hadoop_conf_path` in `conf/` to enable:

1. connecting to your YARN cluster
2. setting `hdfs` as the default FileSystem; otherwise you have to prefix every path you
define with `hdfs://`, like: `val input = sc.textFile("hdfs://user/spark/test.dat")`
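Both points come down to making the Hadoop client configuration visible to Spark before it launches. A minimal sketch, assuming the standard `conf/spark-env.sh` file and a hypothetical `/etc/hadoop/conf` layout:

```shell
# conf/spark-env.sh -- sourced by Spark's launch scripts on each node.
# Point Spark at the directory holding core-site.xml and yarn-site.xml.
# (/etc/hadoop/conf is only an example path; substitute your own layout.)
export HADOOP_CONF_DIR=/etc/hadoop/conf
```

With `core-site.xml` on the classpath, `fs.defaultFS` supplies the `hdfs://` scheme, so a plain `sc.textFile("/user/spark/test.dat")` resolves against HDFS without the explicit prefix.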

Best Regards,

Yi Tian<>

On Oct 27, 2014, at 07:59, Pagliari, Roberto <> wrote:

I’m a newbie with Spark. After installing it on all the machines I want to use, do I need
to tell it about the Hadoop configuration, or will it be able to find it by itself?

Thank you,
