Spark logging can be configured per logger for different purposes. For example, to control the log level of the spark-shell REPL, you can set log4j.logger.org.apache.spark.repl.Main=<LEVEL> (for example WARN, INFO, or ERROR).
Similarly, to control third party logs:
log4j.logger.org.spark_project.jetty=<LEVEL>, log4j.logger.org.apache.parquet=<LEVEL>, etc.
These properties can be set in the conf/log4j.properties file.
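For what it's worth, here is a rough sketch of what a job-specific file could look like, assuming a Spark 2.x build that still uses log4j 1.x. The file name (streaming-log4j.properties) and the chosen levels are only illustrative assumptions, and the appender settings follow the pattern in Spark's bundled log4j.properties.template:

```properties
# streaming-log4j.properties (hypothetical file name, used only for this example)

# Root logger at WARN so that INFO messages are dropped for this job
log4j.rootCategory=WARN, console
log4j.appender.console=org.apache.log4j.ConsoleAppender
log4j.appender.console.target=System.err
log4j.appender.console.layout=org.apache.log4j.PatternLayout
log4j.appender.console.layout.ConversionPattern=%d{yy/MM/dd HH:mm:ss} %p %c{1}: %m%n

# Third-party loggers (levels here are example values)
log4j.logger.org.spark_project.jetty=WARN
log4j.logger.org.apache.parquet=ERROR
```

To apply it to a single job instead of the whole cluster, the usual approach (as far as I know) is to ship the file with --files on spark-submit and point -Dlog4j.configuration at it through spark.driver.extraJavaOptions and spark.executor.extraJavaOptions. Please check this against the documentation for your Spark version and cluster manager.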
Hope this helps! 😊
From: Deepak Sharma <email@example.com>
Sent: Thursday, February 14, 2019 12:10 PM
To: spark users <firstname.lastname@example.org>
Subject: Spark streaming filling the disk with logs
I am running a Spark Streaming job with the configuration below:
But it’s still filling the disk with INFO logs.
If the logging level is set to WARN at the cluster level, then only WARN logs get written, but that affects all jobs.
Is there any way to get rid of INFO-level logging at the level of a single Spark Streaming job?