spark-user mailing list archives

From Bin Fan <fanbin...@gmail.com>
Subject Re: How shall I configure the Spark executor memory size and the Alluxio worker memory size on a machine?
Date Fri, 05 Apr 2019 04:29:28 GMT
Hi Andy,

It really depends on your workloads. I would suggest allocating 20% of the
size of your input data set to the Alluxio worker as a starting point and
seeing how it works.
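For example (illustrative numbers only, assuming a 128GB node; the exact
property names and flags depend on your Alluxio and Spark versions), a
starting split might look like:

  # alluxio-site.properties
  alluxio.worker.memory.size=24GB

  # spark-submit
  --executor-memory 80g --conf spark.executor.memoryOverhead=8g

and then adjust based on the cache hit ratio and GC behavior you observe,
leaving headroom for the OS and other daemons on the machine.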

Also, it depends on the data source you use as the under store of Alluxio:
if it is remote (e.g., cloud storage like S3 or GCS), you can have Alluxio
manage local disk or SSD storage rather than memory. In that case, the
"local Alluxio storage" is still much faster than reading from remote
storage.
Check out the documentation on tiered storage configuration here:
http://www.alluxio.org/docs/1.8/en/advanced/Alluxio-Storage-Management.html#configuring-alluxio-storage
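For instance, a minimal single-tier SSD setup (the path and quota below are
just placeholders for your environment; see the doc above for the full set
of options) could look like:

  # alluxio-site.properties
  alluxio.worker.tieredstore.levels=1
  alluxio.worker.tieredstore.level0.alias=SSD
  alluxio.worker.tieredstore.level0.dirs.path=/mnt/ssd
  alluxio.worker.tieredstore.level0.dirs.quota=500GB

This keeps the worker's cache off the memory that Spark executors need,
while still avoiding repeated reads from the remote store.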

- Bin

On Thu, Mar 21, 2019 at 8:26 AM u9g <lwx371423@163.com> wrote:

> Hey,
>
> We have a cluster of 10 nodes, each of which has 128GB of memory. We are
> about to run Spark and Alluxio on the cluster. We wonder how we should
> allocate the memory between the Spark executor and the Alluxio worker on a
> machine? Are there any recommendations? Thanks!
>
> Best,
> Andy Li
>
