We found the underlying cause of the following problem.
A few of our jobs are stuck with logs, these jobs are only able to allocate a JM and cannot get any TM, even though there are ample resources on our cluster.
We are running an ETL merge job here. In this job, we first look for new deltas, and if no delta is detected we exit without actually executing the job. I think this is why no TM allocation happens.
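To make the control flow concrete, here is a minimal sketch of the early-exit pattern described above. It is not our actual code: the class name, `findNewDeltas()` and `shouldExecute()` are hypothetical placeholders, and the Flink pipeline itself is only indicated in a comment.

```java
import java.util.Collections;
import java.util.List;

public class MergeJobSketch {

    // Hypothetical stand-in for our delta detection; returns the new deltas to merge.
    static List<String> findNewDeltas() {
        return Collections.emptyList(); // simulate a run where nothing new was found
    }

    // The job is only built and executed when there is at least one delta.
    static boolean shouldExecute(List<String> deltas) {
        return !deltas.isEmpty();
    }

    public static void main(String[] args) {
        List<String> deltas = findNewDeltas();
        if (!shouldExecute(deltas)) {
            // Early exit: main() returns before any job is executed,
            // which is the case where we see no TM being allocated.
            System.out.println("No new deltas, exiting without executing the job");
            return;
        }
        // In the real job, the pipeline would be built here and
        // execute() called on the Flink execution environment.
        System.out.println("Would execute merge for " + deltas.size() + " delta(s)");
    }
}
```

In the stuck runs we take the first branch, so the YARN application has a JM but no job is ever executed inside it.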
I believe that in the above case (non-detached mode) we should mark the submitted application as completed rather than running. Please share your thoughts on this.
Should I log this improvement in JIRA?
Could you also recommend the best practice for FLIP-6: should we use a YARN session, or submit jobs in non-detached mode?