spark-dev mailing list archives

From Dale Richardson <dale...@hotmail.com>
Subject Spark config option 'expression language' feedback request
Date Fri, 13 Mar 2015 10:07:57 GMT

PR#4937 (https://github.com/apache/spark/pull/4937) adds a feature that allows Spark configuration
options (whether set on the command line, in an environment variable or in a configuration file)
to be specified via a simple expression language.


Such a feature has the following end-user benefits:
- Allows time intervals or byte quantities to be specified in appropriate, easy-to-follow
units, e.g. 1 week rather than 604800 seconds

- Allows a configuration option to be scaled in relation to system attributes, e.g.

SPARK_WORKER_CORES = numCores - 1

SPARK_WORKER_MEMORY = physicalMemoryBytes - 1.5 GB

- Allows multiple configuration options to be scaled together, e.g.

spark.driver.memory = 0.75 * physicalMemoryBytes

spark.driver.maxResultSize = spark.driver.memory * 0.8
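
To make the examples above concrete, here is a minimal Python sketch of how such expressions could be evaluated. It is not the PR's implementation: the unit table, the binary interpretation of GB (1024^3 bytes), the `evaluate` function and the way variables are passed in as a dictionary are all assumptions made for illustration.

```python
import ast
import operator
import re

# Hypothetical unit table; binary byte units are an assumption.
UNITS = {
    "kb": 1024, "mb": 1024 ** 2, "gb": 1024 ** 3,
    "second": 1, "seconds": 1, "minute": 60, "minutes": 60,
    "hour": 3600, "hours": 3600, "day": 86400, "days": 86400,
    "week": 604800, "weeks": 604800,
}
OPS = {ast.Add: operator.add, ast.Sub: operator.sub,
       ast.Mult: operator.mul, ast.Div: operator.truediv}

# A number followed by a word, e.g. "1.5 GB" or "1 week".
_QUANTITY = re.compile(r"(?<![\w.])(\d+(?:\.\d+)?)\s*([A-Za-z]+)\b")


def _expand_units(text):
    """Rewrite '1.5 GB' as '(1.5 * 1073741824)'; leave unknown words alone."""
    def repl(match):
        factor = UNITS.get(match.group(2).lower())
        if factor is None:
            return match.group(0)
        return f"({match.group(1)} * {factor})"
    return _QUANTITY.sub(repl, text)


def _dotted_name(node):
    """Turn the AST for spark.driver.memory back into 'spark.driver.memory'."""
    if isinstance(node, ast.Name):
        return node.id
    if isinstance(node, ast.Attribute):
        return _dotted_name(node.value) + "." + node.attr
    raise ValueError("unsupported name")


def evaluate(expr, variables):
    """Evaluate an arithmetic expression over numbers, units and variables."""
    def walk(node):
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, (ast.Name, ast.Attribute)):
            return variables[_dotted_name(node)]
        if isinstance(node, ast.BinOp) and type(node.op) in OPS:
            return OPS[type(node.op)](walk(node.left), walk(node.right))
        if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.USub):
            return -walk(node.operand)
        raise ValueError("unsupported expression")
    return walk(ast.parse(_expand_units(expr), mode="eval").body)


# Example values only; a real implementation would query the host.
env = {"numCores": 8, "physicalMemoryBytes": 16 * 1024 ** 3}
env["spark.driver.memory"] = evaluate("0.75 * physicalMemoryBytes", env)
env["spark.driver.maxResultSize"] = evaluate("spark.driver.memory * 0.8", env)
print(evaluate("1 week", env))        # 604800
print(evaluate("numCores - 1", env))  # 7
```

Walking the parsed syntax tree rather than calling `eval` keeps the accepted grammar limited to arithmetic, which seems a sensible property for values read from configuration files.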


The following functions are currently supported by this PR:
NumCores:             Number of cores assigned to the JVM (usually == physical machine cores)
PhysicalMemoryBytes:  Memory size of the hosting machine
JVMTotalMemoryBytes:  Current bytes of memory allocated to the JVM
JVMMaxMemoryBytes:    Maximum number of bytes of memory available to the JVM
JVMFreeMemoryBytes:   JVMMaxMemoryBytes - JVMTotalMemoryBytes
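
For readers who want to experiment outside the JVM, the two host-level attributes can be approximated in Python as sketched below. This is only an analogue, not the PR's code: the function names are hypothetical, the `os.sysconf` keys are only available on some Unix-like systems, and the JVM-specific values (presumably obtainable from `java.lang.Runtime` on the JVM side) have no equivalent here.

```python
import os


def num_cores():
    """Rough analogue of NumCores: logical CPUs visible to this process's host."""
    return os.cpu_count() or 1


def physical_memory_bytes():
    """Rough analogue of PhysicalMemoryBytes; returns None where unsupported."""
    try:
        page_size = os.sysconf("SC_PAGE_SIZE")
        page_count = os.sysconf("SC_PHYS_PAGES")
    except (AttributeError, ValueError, OSError):
        # os.sysconf is missing (e.g. Windows) or the key is not supported.
        return None
    if page_size <= 0 or page_count <= 0:
        return None
    return page_size * page_count


print(num_cores(), physical_memory_bytes())
```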


Does anybody on the mailing list have ideas for other functions that would be useful
when specifying Spark configuration options?

Regards,
Dale