I wonder whether any of the file systems supported by Spark supports a replication level at which every node holds a full copy of the data.
I realize this is not the main intended scenario for Spark/Hadoop, but it may be a good fit for a compute cluster that needs very fast access to its input data and holds only a few terabytes in total (which fits easily on any commodity disk, and soon on any SSD).

It would be nice to run Spark map-reduce jobs over the data and still get automatic replication.

It would also be nice to be able to assume that Spark can seamlessly manage a job's workflow across such a cluster...