spark-user mailing list archives

From "Haryani, Akshay" <akshay.hary...@hpe.com>
Subject Re: Get application metric from Spark job
Date Tue, 07 Sep 2021 18:36:28 GMT
For custom metrics, you can take a look at Groupon’s spark-metrics library: https://github.com/groupon/spark-metrics

It is supported on Spark 2.x. Alternatively, you can create a custom source (extending the Source
trait), enable a sink, and register the custom source to get the metrics; a rough sketch of that
approach follows the links below. Some useful links for this approach:
https://gist.github.com/ibuenros/9b94736c2bad2f4b8e23
https://kb.databricks.com/metrics/spark-metrics.html
http://mail-archives.us.apache.org/mod_mbox/spark-user/201501.mbox/%3CCAE50=dq+6tdx9VNVM3ctBMWPLDPbUAacO3aN3L8x38zg=xb6VQ@mail.gmail.com%3E
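Here is a minimal sketch of such a custom source (the class and metric names are only illustrative,
not from any existing library). Note that the Source trait and SparkEnv.metricsSystem are
package-private in Spark 2.x, so code like this typically has to live under an org.apache.spark
package, as the gist above does:

package org.apache.spark.metrics.source

import com.codahale.metrics.{Counter, MetricRegistry}
import org.apache.spark.SparkEnv

// Illustrative custom source exposing a single counter.
class MyJobSource extends Source {
  override val sourceName: String = "myJob"
  override val metricRegistry: MetricRegistry = new MetricRegistry
  val recordsProcessed: Counter = metricRegistry.counter("recordsProcessed")
}

object MyJobSource {
  // Register once on the driver, then increment from your job code.
  lazy val instance: MyJobSource = {
    val src = new MyJobSource
    SparkEnv.get.metricsSystem.registerSource(src)
    src
  }
}

Once registered, calling MyJobSource.instance.recordsProcessed.inc() in your driver code makes the
counter visible through whatever sink you have enabled in metrics.properties (console, JMX,
Graphite, etc.).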

I hope these help.
--
Thanks & Regards,
Akshay Haryani

From: Aurélien Mazoyer <aurelien@aepsilon.com>
Date: Monday, September 6, 2021 at 5:47 AM
To: Haryani, Akshay <akshay.haryani@hpe.com>
Cc: user@spark.apache.org <user@spark.apache.org>
Subject: Re: Get application metric from Spark job
Hi Akshay,

Thank you for your reply. Sounds like a good idea, but I unfortunately have a 2.6 cluster.
Do you know if there is another solution that would run on 2.6, or do I have no choice other than
migrating to 3?

Regards,

Aurélien

On Thu, Sep 2, 2021 at 8:12 PM, Haryani, Akshay <akshay.haryani@hpe.com>
wrote:
Hi Aurélien,

Spark has endpoints that expose Spark application metrics as a REST API. You can read more about
them here: https://spark.apache.org/docs/3.1.1/monitoring.html#rest-api
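For instance, you can poll these endpoints while the job is running. This hypothetical snippet
assumes the application's UI is reachable on the default port 4040; adjust host, port, and
application id for your cluster:

import scala.io.Source

// List the applications known to this driver's UI, then query one of them.
val apps = Source.fromURL("http://localhost:4040/api/v1/applications").mkString
println(apps)
// e.g. http://localhost:4040/api/v1/applications/<app-id>/stages for per-stage metrics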

Additionally, if you want to build your own custom metrics, you can explore Spark custom plugins.
Using a custom plugin, you can track your own custom metrics and plug them into the Spark metrics
system; a sketch is below. Please note that plugins are supported on Spark versions 3.0 and above.
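A rough sketch of what such a plugin could look like (the class name is hypothetical; you would
enable it with --conf spark.plugins=com.example.MyMetricsPlugin):

package com.example

import com.codahale.metrics.Counter
import org.apache.spark.api.plugin.{DriverPlugin, ExecutorPlugin, PluginContext, SparkPlugin}

// Registers a custom counter with the driver's metrics system (Spark 3.0+ plugin API).
class MyMetricsPlugin extends SparkPlugin {
  override def driverPlugin(): DriverPlugin = new DriverPlugin {
    override def registerMetrics(appId: String, ctx: PluginContext): Unit = {
      val counter: Counter = ctx.metricRegistry().counter("myCustomCounter")
      counter.inc() // increment wherever it makes sense in your plugin
    }
  }
  override def executorPlugin(): ExecutorPlugin = null // driver-only example
}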


--
Thanks & Regards,
Akshay Haryani

From: Aurélien Mazoyer <aurelien@aepsilon.com>
Date: Thursday, September 2, 2021 at 8:36 AM
To: user@spark.apache.org <user@spark.apache.org>
Subject: Get application metric from Spark job
Hi community,

I would like to collect information about the execution of a Spark job while it is running.
Could I define some kind of application metrics (such as a counter that would be incremented
in my code) that I could retrieve regularly while the job is running?

Thank you for your help,

Aurelien
