spark-user mailing list archives

From PengWeiPRC
Subject Re: How to handle this situation: Huge File Shared by All maps and Each Computer Has one copy?
Date Thu, 01 May 2014 22:46:06 GMT
Thanks, Rustagi. Yes, the global data is read-only and stays unchanged from
the beginning to the end of the whole Spark task. In fact, it is not only
identical within one map/reduce task, but is also used by many of my other
map/reduce tasks. That is why I intend to put a copy of the data on each node
of my cluster. I would like to know whether a Spark map/reduce program can
have all the nodes read it simultaneously from their local disks, rather than
having one node read it and broadcast it to the other nodes. Any suggestions?
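
For illustration, one possible approach is to have each worker open its own local copy lazily inside the function passed to `mapPartitions`, so nothing is shipped from the driver. The following is only a minimal sketch under assumptions that are not in the original message: the path `LOCAL_DATA_PATH` is hypothetical and must exist at the same location on every node, the file is assumed to hold one entry per line and to fit in memory, and the helper names `load_global_data` and `filter_partition` are invented for this example.

```python
# Hypothetical path: assumed to be present at the same location on every worker node.
LOCAL_DATA_PATH = "/data/shared/global_data.txt"

# Cache local to each Python worker process; filled on first use.
_cache = {}


def load_global_data(path=LOCAL_DATA_PATH):
    """Read the node-local file once per worker process and memoize the result."""
    if path not in _cache:
        with open(path, "r", encoding="utf-8") as f:
            # Assumption: one entry per line, small enough to hold in memory.
            _cache[path] = {line.rstrip("\n") for line in f}
    return _cache[path]


def filter_partition(records, path=LOCAL_DATA_PATH):
    """Keep only the records found in the global data.

    The file is opened on the worker that runs this partition,
    not on the driver, so no broadcast is involved.
    """
    data = load_global_data(path)
    for rec in records:
        if rec in data:
            yield rec


# With PySpark, assuming `rdd` is an existing RDD of strings:
#   result = rdd.mapPartitions(filter_partition)
```

Because the cache lives in the worker process, the file is read again only when a new worker process starts. Whether this is faster than a broadcast variable depends on the cluster and the file size, so it is worth measuring both.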
