I just stumbled upon this issue as well in Spark 1.6.2 when trying to write my own custom Sink.  For anyone else who runs into this issue, there are two relevant JIRAs that I found, but no solution as of yet:
- https://issues.apache.org/jira/browse/SPARK-14151 - Propose to refactor and expose Metrics Sink and Source interface
- https://issues.apache.org/jira/browse/SPARK-18115 - Custom metrics Sink/Source prevent Executor from starting

On Thu, Sep 8, 2016 at 3:23 PM, map reduced <k3t.git.1@gmail.com> wrote:
Can this be listed as an issue on JIRA?

On Wed, Sep 7, 2016 at 10:19 AM, map reduced <k3t.git.1@gmail.com> wrote:
Thanks for the reply, I wish it did. We have an internal metrics system that we need to submit to. I am sure that the approaches I've tried work with a YARN deployment, but not with standalone.

Thanks,
KP

On Tue, Sep 6, 2016 at 11:36 PM, Benjamin Kim <bbuild11@gmail.com> wrote:
We use Graphite/Grafana for custom metrics. We found Spark’s metrics not to be customizable, so we write directly to Graphite’s API, which was very easy to do using Java’s socket library in Scala. It works great for us, and we are going one step further by using Sensu to alert us when a metric deviates from the norm.
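For anyone following along, here is a minimal sketch of what writing to Graphite over a plain socket can look like. It is written in Java rather than Scala, and it is not the code from our setup: the class name GraphiteSender, the metric path, and the command-line handling are all made up for illustration. The only thing it relies on is Graphite's plaintext protocol, which takes one "path value timestamp" line per metric, usually on port 2003.

```java
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of Spark or Graphite.
final class GraphiteSender {

    // Builds one line of the Graphite plaintext protocol:
    // "<metric.path> <value> <epoch-seconds>\n"
    static String formatLine(String path, double value, long epochSeconds) {
        return path + " " + value + " " + epochSeconds + "\n";
    }

    // Opens a TCP connection, writes the line, and closes the connection.
    static void send(String host, int port, String line) throws IOException {
        try (Socket socket = new Socket(host, port);
             Writer out = new OutputStreamWriter(socket.getOutputStream(), StandardCharsets.UTF_8)) {
            out.write(line);
            out.flush();
        }
    }

    public static void main(String[] args) throws IOException {
        long now = System.currentTimeMillis() / 1000L;
        // "myapp.batch.records" is an assumed metric name.
        String line = formatLine("myapp.batch.records", 42.0, now);
        if (args.length >= 2) {
            // Usage: java GraphiteSender <host> <port>
            send(args[0], Integer.parseInt(args[1]), line);
        } else {
            // No host given: just show what would be sent.
            System.out.print(line);
        }
    }
}
```

Opening a new connection per metric keeps the sketch short; a long-running job would more likely keep one connection open and batch several lines per write.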

Hope this helps.

Cheers,
Ben


On Sep 6, 2016, at 9:52 PM, map reduced <k3t.git.1@gmail.com> wrote:

Hi, does anyone have any ideas, please?

On Mon, Sep 5, 2016 at 8:30 PM, map reduced <k3t.git.1@gmail.com> wrote:

Hi,

I've written a custom metrics source/sink for my Spark Streaming app and I am trying to initialize it from metrics.properties, but that doesn't work on executors. I don't have control over the machines in the Spark cluster, so I can't copy the properties file into $SPARK_HOME/conf/ on the cluster. I have it in the fat jar where my app lives, but by the time my fat jar is downloaded to the worker nodes, the executors have already started and their metrics system is already initialized, so they never pick up my file with the custom source configuration in it.

Following this post, I've specified 'spark.files=metrics.properties' and 'spark.metrics.conf=metrics.properties', but by the time 'metrics.properties' is shipped to the executors, their metrics system is already initialized.

If I initialize my own metrics system, it picks up my file, but then I'm missing the master/executor-level metrics and properties (e.g. executor.sink.mySink.propName=myProp: I can't read 'propName' from 'mySink'), since those are initialized by Spark's metrics system.
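To make the per-instance property lookup concrete, here is a rough sketch of pulling the settings for one sink out of a metrics.properties-style file by hand. It is not Spark's own metrics configuration code; the class SinkProps and the method forSink are hypothetical names, and the only thing taken from my setup is the key layout "<instance>.sink.<sinkName>.<option>".

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

// Hypothetical helper, not part of Spark.
final class SinkProps {

    // Returns the options of one sink for one instance, with the
    // "<instance>.sink.<sink>." prefix stripped from each key.
    static Properties forSink(String propertiesText, String instance, String sink) {
        Properties all = new Properties();
        try {
            all.load(new StringReader(propertiesText));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        String prefix = instance + ".sink." + sink + ".";
        Properties result = new Properties();
        for (String key : all.stringPropertyNames()) {
            if (key.startsWith(prefix)) {
                result.setProperty(key.substring(prefix.length()), all.getProperty(key));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        String text = "executor.sink.mySink.propName=myProp\n";
        Properties p = forSink(text, "executor", "mySink");
        System.out.println(p.getProperty("propName")); // prints "myProp"
    }
}
```

In a fat jar the text would presumably come from a classpath resource (for example via Class.getResourceAsStream) rather than a string literal. This only recovers the properties; it does not register anything with Spark's metrics system.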

Is there a (programmatic) way to have 'metrics.properties' shipped before the executors initialize?

Here's my SO question.

Thanks,

KP