If it was not killed by some other user, then it's the kernel triggering the kill (most likely the OOM killer); the process may be using far too much memory or swap. Check resource usage on that host while the job is running, and look at the memory overhead as well.
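
To confirm whether the kernel did it, you can look for OOM killer entries in the kernel log on the machine where the process died. Below is a minimal sketch, assuming a Linux host and that the kernel logs a "Killed process <pid> (<name>)" line (the exact wording varies between kernel versions, so adjust the pattern if needed). The function name find_oom_kills is just an illustration, not part of Spark or Mesos.

```python
import re

# Matches the "Killed process <pid> (<name>)" fragment that the Linux
# OOM killer writes to the kernel log. Assumption: wording may differ
# on your kernel version.
OOM_PATTERN = re.compile(r"Killed process (\d+) \(([^)]+)\)")


def find_oom_kills(kernel_log_text):
    """Return a list of (pid, process_name) tuples for OOM kills in the text."""
    kills = []
    for line in kernel_log_text.splitlines():
        match = OOM_PATTERN.search(line)
        if match:
            kills.append((int(match.group(1)), match.group(2)))
    return kills


if __name__ == "__main__":
    import subprocess

    # Reading the kernel log via dmesg may require root on some systems.
    output = subprocess.run(["dmesg"], capture_output=True, text=True).stdout
    for pid, name in find_oom_kills(output):
        print(f"OOM killer terminated pid={pid} ({name})")
```

If the pid of your driver (or an executor) shows up there, the job ran out of memory on that host and you will need to give it more memory or reduce what it holds.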

Thanks
Best Regards

On Tue, Sep 1, 2015 at 5:56 PM, Silvio Bernardinello <sbernardinello@beintoo.com> wrote:
Hi,

We are running Spark 1.4.0 on a Mesosphere cluster (~250GB memory with 16 activated hosts).
Spark jobs are submitted in coarse mode.

Suddenly, our jobs get killed without any error. 

ip-10-0-2-193.us-west-2.compute.internal, PROCESS_LOCAL, 1514 bytes)
15/09/01 10:48:24 INFO TaskSetManager: Finished task 38047.0 in stage 0.0 (TID 38160) in 2856 ms on ip-10-0-0-203.us-west-2.compute.internal (38048/44617)
15/09/01 10:48:24 INFO TaskSetManager: Starting task 38056.0 in stage 0.0 (TID 38169, ip-10-0-0-204.us-west-2.compute.internal, PROCESS_LOCAL, 1514 bytes)
15/09/01 10:48:24 INFO TaskSetManager: Starting task 38057.0 in stage 0.0 (TID 38170, ip-10-0-0-204.us-west-2.compute.internal, PROCESS_LOCAL, 1514 bytes)
15/09/01 10:48:25 INFO TaskSetManager: Finished task 38048.0 in stage 0.0 (TID 38161) in 2290 ms on ip-10-0-2-194.us-west-2.compute.internal (38049/44617)
Killed

Where can we find additional information about this issue?

Thanks in advance

Silvio


