Hello everyone,

I wanted to ask what the current state of support for Spark dynamic allocation is, and whether there's an issue where I could track its progress and missing features.

We've just started evaluating possible alternatives for a production architecture for our use case, and dynamic allocation could be useful since our processing batches have a moderate variance in the number of objects processed during the lifetime of the application. Hence, we'd like to see whether K8s may fit as a cluster manager.

Our test environment is a Hadoop cluster (HDP 3.0, used because we already had it around), but since Hadoop/HDFS is not a hard requirement, I'd like to ask what's considered the best cluster manager: why should we use a standalone cluster rather than a YARN or Mesos cluster? Obviously, if we already had one of those clusters production-ready, the choice would be easier, but starting from scratch, what are the pros and cons of the various Spark-compatible alternatives?

I'd also like to ask whether anyone has experience running Spark on a public cloud (AWS, Azure etc.), and whether that experience involved a Hadoop PaaS (such as EMR or HDInsight), full IaaS, or K8s as a service (AKS, EKS etc.).

Thank you very much for your time,
Federico