spark-user mailing list archives

From Luca Borin
Subject Apache Spark Log4j logging applicationId
Date Wed, 24 Jul 2019 05:05:48 GMT

I would like to add the applicationId to all logs produced by Spark through
Log4j. I have a cluster with several jobs running on it, so having the
applicationId in every log line would make it possible to separate the logs
of each job.

I have found a partial solution. If I change the pattern of the
PatternLayout, I can print the ThreadContext, which can be populated
through MDC with the applicationId. This works for the driver, but I would
like to add this information at Spark application startup, for both the
driver and the workers. Note that I'm working in a managed environment
(Databricks), so I have limited control over cluster management. One
workaround to put the parameter into MDC on all workers is to use a
broadcast variable and perform an action with it, but I don't think this is
stable, since it should also work if a worker machine restarts or is
replaced.
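
For illustration, the layout change described above might look like the
following log4j.properties fragment. This is only a sketch: it assumes
Log4j 1.x property syntax and a console appender, and the MDC key name
"applicationId" is hypothetical; it has to match whatever key the
application actually puts into the MDC.

```properties
# Sketch only: appender name and MDC key are assumptions, not taken from a real cluster config.
log4j.rootCategory=INFO, console
log4j.appender.console=org.apache.log4j.ConsoleAppender
log4j.appender.console.target=System.err
log4j.appender.console.layout=org.apache.log4j.PatternLayout
# %X{applicationId} prints the MDC value stored under the key "applicationId";
# it renders as an empty string on threads where the key was never set.
log4j.appender.console.layout.ConversionPattern=%d{yy/MM/dd HH:mm:ss} [%X{applicationId}] %p %c{1}: %m%n
```

Since the MDC is kept per thread, the pattern alone does not help on the
workers: the value still has to be put into the MDC on every thread that
logs, which is the part that remains unsolved.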

Thank you
