spark-user mailing list archives

From Masood Krohy <masood.kr...@analytical.works>
Subject Re: spark-submit exit status on k8s
Date Sun, 05 Apr 2020 15:24:54 GMT
Another, simpler solution that I just thought of: add an operation at 
the end of your Spark program that writes an empty file somewhere, 
named SUCCESS for example. Then add a stage to your Airflow graph that 
checks for the existence of this file after spark-submit returns. If 
the file is absent, the Spark app must have failed.

The above should work if you want to avoid dealing with the REST API for 
monitoring.
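
For example, here is a minimal sketch of that approach (the paths, the 
task id and the empty-DataFrame trick below are just one way to do it, 
not something prescribed in this thread):

    # ---- tail of the PySpark application ----
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.getOrCreate()   # the job's existing session

    # If every step above finished without raising, write a success marker.
    # MARKER_DIR is a placeholder; any storage both Spark and Airflow can
    # reach works (HDFS, S3, a shared volume, ...). Saving an empty
    # DataFrame makes the Hadoop output committer drop a _SUCCESS file
    # inside the directory by default.
    MARKER_DIR = "hdfs:///jobs/my_spark_job/latest_run"   # placeholder path
    spark.range(0).write.mode("overwrite").format("json").save(MARKER_DIR)

    # ---- Airflow DAG side (import path may differ across Airflow versions) ----
    from airflow.operators.bash_operator import BashOperator

    check_marker = BashOperator(
        task_id="check_spark_success_marker",
        # 'hdfs dfs -test -e' exits non-zero when the file is absent,
        # which fails this task and lets the DAG branch on it.
        bash_command="hdfs dfs -test -e hdfs:///jobs/my_spark_job/latest_run/_SUCCESS",
        dag=dag,   # 'dag' assumed to be defined earlier in the DAG file
    )

    # submit_spark_job >> check_marker   # wire it in after the spark-submit task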

Masood

__________________

Masood Krohy, Ph.D.
Data Science Advisor|Platform Architect
https://www.analytical.works

On 4/4/20 10:54 AM, Masood Krohy wrote:
>
> I'm not on the Spark dev team, so I cannot tell you why that priority 
> was chosen for the JIRA issue or whether anyone is about to finish the 
> work on that; I'll let others jump in if they know.
>
> Just wanted to offer a potential solution so that you can move ahead 
> in the meantime.
>
> Masood
>
> __________________
>
> Masood Krohy, Ph.D.
> Data Science Advisor|Platform Architect
> https://www.analytical.works
> On 4/4/20 7:49 AM, Marshall Markham wrote:
>>
>> Thank you very much, Masood, for your fast response. Last question: is 
>> the current status in Jira representative of the ticket's status 
>> within the project team? This seems like a big deal for the K8s 
>> implementation, and we were surprised to find it marked as low 
>> priority. Is there any discussion of picking up this work in the near future?
>>
>> Thanks,
>>
>> Marshall
>>
>> *From:* Masood Krohy <masood.krohy@analytical.works>
>> *Sent:* Friday, April 3, 2020 9:34 PM
>> *To:* Marshall Markham <mmarkham@precisionlender.com>; user 
>> <user@spark.apache.org>
>> *Subject:* Re: spark-submit exit status on k8s
>>
>> While you wait for a fix on that JIRA ticket, you may be able to add 
>> an intermediary step in your Airflow graph that calls Spark's REST 
>> API after submitting the job, digs into the actual status of the 
>> application, and makes a success/fail decision accordingly. You can 
>> call the REST API repeatedly in a loop, with a few seconds' delay 
>> between calls, while the execution is in progress, until the 
>> application fails or succeeds.
>>
>> https://spark.apache.org/docs/latest/monitoring.html#rest-api 
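>>
>> For example, a rough polling sketch (the driver address, port and 
>> timing below are assumptions, not something from this thread; in a 
>> k8s setup the REST API is usually reachable through the driver pod's 
>> UI port, 4040 by default, or through the history server):
>>
>>     import time
>>     import requests   # third-party HTTP client (pip install requests)
>>
>>     API = "http://spark-driver-svc:4040/api/v1"   # placeholder address
>>
>>     def wait_for_spark_app(poll_seconds=10):
>>         """Return True/False on success/failure, None if the driver went away."""
>>         while True:
>>             try:
>>                 apps = requests.get(API + "/applications", timeout=5).json()
>>             except requests.ConnectionError:
>>                 # Driver UI is gone: the app has finished (or the driver
>>                 # crashed); fall back to the history server or logs.
>>                 return None
>>             if not apps:
>>                 time.sleep(poll_seconds)
>>                 continue
>>             app_id = apps[0]["id"]
>>             jobs = requests.get(
>>                 API + "/applications/" + app_id + "/jobs", timeout=5).json()
>>             statuses = {j["status"] for j in jobs}
>>             if "FAILED" in statuses:
>>                 return False
>>             # Simplification: treat "all submitted jobs succeeded" as app
>>             # success; a still-running app may submit more jobs later.
>>             if jobs and statuses == {"SUCCEEDED"}:
>>                 return True
>>             time.sleep(poll_seconds)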
>>
>> Hope this helps.
>>
>> Masood
>>
>> __________________
>> Masood Krohy, Ph.D.
>> Data Science Advisor|Platform Architect
>> https://www.analytical.works
>>
>> On 4/3/20 8:23 AM, Marshall Markham wrote:
>>
>>     Hi Team,
>>
>>     My team recently conducted a POC of Kubernetes/Airflow/Spark with
>>     great success. The major concern we have about this system after
>>     completing our POC is a behavior of spark-submit: when called
>>     with a Kubernetes API endpoint as master, spark-submit seems to
>>     always return exit status 0. This is obviously a major issue,
>>     preventing us from conditioning job graphs on the success or
>>     failure of our Spark jobs. I found Jira ticket SPARK-27697 under
>>     the Apache issues covering this bug. The ticket is listed as
>>     minor and does not seem to have had any recent activity. I would
>>     like to upvote it and ask if there is anything I can do to move
>>     this forward. This could be the one thing standing between my
>>     team and our preferred batch workload implementation. Thank you.
>>
>>     *Marshall Markham*
>>
>>     Data Engineer
>>
>>     PrecisionLender, a Q2 Company
>>
