flink-issues mailing list archives

From "vinoyang (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (FLINK-8886) Job isolation via scheduling in shared cluster
Date Tue, 20 Mar 2018 02:45:00 GMT

    [ https://issues.apache.org/jira/browse/FLINK-8886?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16405723#comment-16405723 ]

vinoyang commented on FLINK-8886:

[~elevy] YARN has had a feature named 'Node Labels' since Apache Hadoop 2.6 that can meet your requirements.
In an earlier issue, *FLINK-7836*, I added support in the Flink client for specifying a YARN
node label expression, so you can run your job on YARN.

Alternatively, in standalone cluster mode, you can simply split one big cluster into several smaller clusters
based on dimensions such as job, resource, or business.

I think introducing this feature into the standalone cluster would make the scheduler heavier
and more complex, because the standalone cluster's scheduler is not good at resource assignment
and management. YARN and Mesos would be better choices.

[~till.rohrmann] What's your opinion?

> Job isolation via scheduling in shared cluster
> ----------------------------------------------
>                 Key: FLINK-8886
>                 URL: https://issues.apache.org/jira/browse/FLINK-8886
>             Project: Flink
>          Issue Type: Improvement
>          Components: Distributed Coordination, Local Runtime, Scheduler
>    Affects Versions: 1.5.0
>            Reporter: Elias Levy
>            Priority: Major
> Flink's TaskManager executes tasks from different jobs within the same JVM as threads.
 We prefer to isolate different jobs in their own JVM.  Thus, we must use different TMs for
different jobs.  As currently the scheduler will allocate task slots within a TM to tasks
from different jobs, that means we must stand up one cluster per job.  This is wasteful,
as it requires at least two JobManagers per cluster for high-availability, and the JMs have
low utilization.
> Additionally, different jobs may require different resources.  Some jobs are compute
heavy.  Some are IO heavy (lots of state in RocksDB).  At the moment the scheduler treats
all TMs as equivalent, except possibly in their number of available task slots.  Thus, one
is required to stand up multiple clusters if there is a need for different types of TMs.
> It would be useful if one could specify requirements on jobs, such that they are only
scheduled on a subset of TMs.  Properly configured, that would permit isolation of jobs in
a shared cluster and scheduling of jobs with specific resource needs.
> One possible implementation is to specify a set of tags in the TM config file which the
TMs use when registering with the JM, and another set of tags configured within the job or
supplied when submitting the job.  The scheduler could then match the tags in the job with
the tags in the TMs.  In a restrictive mode the scheduler would assign a job task to a TM
only if all tags match.  In a relaxed mode the scheduler could assign a job task to a TM
if there is a partial match, while giving preference to a more accurate match.
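
The tag matching proposed in the issue description could be sketched roughly as follows. This is
only an illustration, not Flink code: the class TagMatcher, the Mode enum, and the methods overlap,
eligible, and rank are hypothetical names. It assumes that "all tags match" means the TM carries
every tag the job asks for, and that "partial match" means at least one shared tag, with TMs
sharing more tags ranked first.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch of tag-based TM selection; none of these names exist in Flink.
final class TagMatcher {

    enum Mode { RESTRICTIVE, RELAXED }

    // Number of job tags that the TM also carries.
    static int overlap(Set<String> jobTags, Set<String> tmTags) {
        Set<String> common = new HashSet<>(jobTags);
        common.retainAll(tmTags);
        return common.size();
    }

    // Assumption: restrictive mode requires the TM to carry every job tag;
    // relaxed mode accepts any TM sharing at least one tag (or any TM if the job has no tags).
    static boolean eligible(Set<String> jobTags, Set<String> tmTags, Mode mode) {
        if (mode == Mode.RESTRICTIVE) {
            return tmTags.containsAll(jobTags);
        }
        return jobTags.isEmpty() || overlap(jobTags, tmTags) > 0;
    }

    // Returns the ids of eligible TMs, best match first; ties are broken by TM id.
    static List<String> rank(Set<String> jobTags, Map<String, Set<String>> tmTagsById, Mode mode) {
        List<String> candidates = new ArrayList<>();
        for (Map.Entry<String, Set<String>> entry : tmTagsById.entrySet()) {
            if (eligible(jobTags, entry.getValue(), mode)) {
                candidates.add(entry.getKey());
            }
        }
        candidates.sort(
            Comparator.comparingInt((String id) -> overlap(jobTags, tmTagsById.get(id)))
                .reversed()
                .thenComparing(Comparator.<String>naturalOrder()));
        return candidates;
    }
}
```

With a job tagged {io, team-a} and TMs tagged {io, team-a}, {io}, and {cpu}, restrictive mode
would keep only the first TM, while relaxed mode would keep the first two and prefer the first.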

This message was sent by Atlassian JIRA
