spark-user mailing list archives

From Ashish Mukherjee
Subject Spark with data on NFS v HDFS
Date Thu, 05 Mar 2015 13:58:22 GMT

I understand Spark can be used with Hadoop or standalone. I have some
questions about choosing the right file system for Spark data.

What is the efficiency trade-off in feeding data to Spark from NFS versus HDFS?

If one is not using Hadoop, is it still common to keep data in HDFS for
Spark to read from, because of its better reliability compared to NFS?

Should data be stored on a local FS (not NFS) only for Spark jobs that run
on a single machine?

