spark-user mailing list archives

From Yinan Li <liyinan...@gmail.com>
Subject Re: spark-submit exit status on k8s
Date Sun, 05 Apr 2020 19:48:35 GMT
Not sure if you are aware of this new feature in Airflow:
https://issues.apache.org/jira/browse/AIRFLOW-6542. It's a way to use
Airflow to orchestrate Spark applications run using the Spark K8S operator
(https://github.com/GoogleCloudPlatform/spark-on-k8s-operator).


On Sun, Apr 5, 2020 at 8:25 AM Masood Krohy <masood.krohy@analytical.works>
wrote:

> Another, simpler solution that I just thought of: just add an operation at
> the end of your Spark program to write an empty file somewhere, with
> filename SUCCESS for example. Add a stage to your Airflow graph to check
> the existence of this file after running spark-submit. If the file is
> absent, then the Spark app must have failed.
>
> The above should work if you want to avoid dealing with the REST API for
> monitoring.
>
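A minimal sketch of the marker-file check described above, assuming the marker is written to a filesystem path that the Airflow worker can also read (for object storage you would swap in that store's client). The function names and the marker path are hypothetical, not part of Spark or Airflow. Clearing a stale marker before each submit matters, because otherwise a file left by an earlier successful run would hide a later failure.

```python
from pathlib import Path


def clear_marker(marker: Path) -> None:
    """Delete a stale marker before spark-submit so an old run cannot mask a failure."""
    try:
        marker.unlink()
    except FileNotFoundError:
        pass  # nothing to clear


def spark_job_succeeded(marker: Path) -> bool:
    """Return True only if the Spark program reached its last step and wrote the marker."""
    return marker.is_file()


def require_success(marker: Path) -> None:
    """Raise so the calling task (for example an Airflow task) is marked as failed."""
    if not spark_job_succeeded(marker):
        raise RuntimeError(f"marker {marker} is absent; treating the Spark app as failed")
```

In a graph, `clear_marker` would run before the spark-submit step and `require_success` after it, both pointed at the same path.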
> Masood
>
> __________________
>
> Masood Krohy, Ph.D.
> Data Science Advisor|Platform Architect
> https://www.analytical.works
>
> On 4/4/20 10:54 AM, Masood Krohy wrote:
>
> I'm not on the Spark dev team, so I cannot tell you why that priority was
> chosen for the JIRA issue or whether anyone is about to finish the work on
> it; I'll let others jump in if they know.
>
> Just wanted to offer a potential solution so that you can move ahead in
> the meantime.
>
> Masood
>
> __________________
>
> Masood Krohy, Ph.D.
> Data Science Advisor|Platform Architect
> https://www.analytical.works
>
> On 4/4/20 7:49 AM, Marshall Markham wrote:
>
> Thank you very much, Masood, for your fast response. Last question: is the
> current status in Jira representative of the status of the ticket within
> the project team? This seems like a big deal for the K8s implementation and
> we were surprised to find it marked as priority low. Is there any
> discussion of picking up this work in the near future?
>
>
>
> Thanks,
>
> Marshall
>
>
>
> *From:* Masood Krohy <masood.krohy@analytical.works>
> *Sent:* Friday, April 3, 2020 9:34 PM
> *To:* Marshall Markham <mmarkham@precisionlender.com>; user
> <user@spark.apache.org>
> *Subject:* Re: spark-submit exit status on k8s
>
>
>
> While you wait for a fix on that JIRA ticket, you may be able to add an
> intermediary step in your Airflow graph that calls Spark's REST API after
> submitting the job, checks the actual status of the application, and
> makes a success/fail decision accordingly. You can call the REST API in a
> loop, with a few seconds' delay between calls, while execution is in
> progress, until the application fails or succeeds.
>
> https://spark.apache.org/docs/latest/monitoring.html#rest-api
>
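A rough sketch of the polling loop described above, using only the Python standard library. It relies on two endpoints from the monitoring REST API, `/api/v1/applications/[app-id]` and `/api/v1/applications/[app-id]/jobs`. Everything else is an assumption for illustration: the function names, the base URL, the decision rule (any FAILED job means failure; all attempts completed with no failed job means success), and the premise that the API stays reachable after the driver exits, which in practice usually means pointing at a history server rather than the driver UI.

```python
import json
import time
import urllib.request
from typing import Callable, Tuple


def fetch_state(base_url: str, app_id: str) -> Tuple[dict, list]:
    """Fetch the application record and its job list from the REST API."""
    root = f"{base_url}/api/v1/applications/{app_id}"
    with urllib.request.urlopen(root, timeout=10) as resp:
        app = json.load(resp)
    with urllib.request.urlopen(f"{root}/jobs", timeout=10) as resp:
        jobs = json.load(resp)
    return app, jobs


def decide(app: dict, jobs: list) -> str:
    """Map REST API state to 'failed', 'succeeded', or 'running' (assumed rule)."""
    if any(job.get("status") == "FAILED" for job in jobs):
        return "failed"
    attempts = app.get("attempts", [])
    if attempts and all(a.get("completed") for a in attempts):
        return "succeeded"
    return "running"


def wait_for_outcome(
    fetch: Callable[[], Tuple[dict, list]],
    poll_seconds: float = 5.0,
    max_polls: int = 720,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll until the app fails or succeeds; return 'timeout' after max_polls tries."""
    for _ in range(max_polls):
        outcome = decide(*fetch())
        if outcome != "running":
            return outcome
        sleep(poll_seconds)
    return "timeout"
```

A task could call `wait_for_outcome(lambda: fetch_state(base_url, app_id))` and raise on anything other than `"succeeded"`. Note that a failed job does not always mean a failed application, since the program may catch the error and continue, so the rule in `decide` may need adjusting for your workload.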
> Hope this helps.
>
> Masood
>
> __________________
>
>
>
> Masood Krohy, Ph.D.
>
> Data Science Advisor|Platform Architect
>
> https://www.analytical.works
>
> On 4/3/20 8:23 AM, Marshall Markham wrote:
>
> Hi Team,
>
>
>
> My team recently conducted a POC of Kubernetes/Airflow/Spark with great
> success. The major concern we have about this system, after the completion
> of our POC, is a behavior of spark-submit. When called with a Kubernetes API
> endpoint as master, spark-submit seems to always return exit status 0. This
> is obviously a major issue preventing us from conditioning job graphs on
> the success or failure of our Spark jobs. I found Jira ticket SPARK-27697
> under the Apache issues covering this bug. The ticket is listed as minor
> and does not seem to have had any activity recently. I would like to upvote
> it and ask if there is anything I can do to move this forward. This could be
> the one thing standing between my team and our preferred batch workload
> implementation. Thank you.
>
>
>
> *Marshall Markham*
>
> Data Engineer
>
> PrecisionLender, a Q2 Company
>
>
>
> NOTE: This communication and any attachments are for the sole use of the
> intended recipient(s) and may contain confidential and/or privileged
> information. Any unauthorized review, use, disclosure or distribution is
> prohibited. If you are not the intended recipient, please contact the
> sender by replying to this email, and destroy all copies of the original
> message.
>
>
