samza-dev mailing list archives

From Maxim Logvinenko <mlogvine...@gmail.com>
Subject Re: Review Request 51633: SAMZA-1013: Add YARN Node label support
Date Fri, 07 Oct 2016 23:43:23 GMT


> On Oct. 5, 2016, 6:18 p.m., Jagadish Venkatraman wrote:
> > Overall, the patch looks great! This is exciting given that Samza can support scheduling
based on tags. For example, jobs with rocksdb can be assigned to nodes with SSDs.
> > 
> > 
> > Can you please add some detail on testing this feature? 
> > What was the label setup of the cluster? (for example: Did we use an exclusive node
label?), How many node labels? How many containers were requested for the job?
> 
> Maxim Logvinenko wrote:
>     We haven't tested it in production, but the main idea is as follows: we have 3 different
types of nodes in our Hadoop cluster. The first type is used for ApplicationMasters (actually,
we put up to 4 AM containers on one node). The second type is used for stateless jobs, and
these nodes have a small amount of memory. The last type is used for stateful jobs
and has more memory than the others. So there are 3 labels in our cluster: taskam, tasklowmem,
taskhighmem. Currently we force YARN to put containers on a particular type of node with a small
trick with resources (we chose the resources for each node type in such a way that YARN has no
option other than one type of node). But Hadoop labels are a more natural way to
request that containers be placed on a specific type of node.
> 
> Jagadish Venkatraman wrote:
>     So, in this case, do you not care about *host affinity* at all when the job re-starts?
Are you okay with your container coming back up on a different host (as long as it is a host
with label `taskHighMem`)? We should make it explicit that when host-affinity.enabled=true,
then node labelling will be ignored. Is my understanding reasonable?

It looks like Hadoop has a bug (or a feature, I'm not sure what to call it): the node label expression
is ignored if the preferred host != "ANY". So when we run a Samza job for the first time, it has no
preferred host, and hence the resource request uses the label. Each subsequent resource
request for this container uses the preferred host and puts the container on the same node as before.
But if that node has failed (or is unreachable for any other reason), Samza will still send the preferred
host in the resource request, and Hadoop can then allocate the resource on any node that fits the
<vcores, memory> conditions. I need to investigate this more to answer the question.
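
To make the behavior described above concrete, here is a minimal Java sketch. It is only a toy
model of what I observed, not the Hadoop or Samza API: the class name LabelRequestSketch, the
method effectiveLabel, and the host name "node-17" are hypothetical and exist purely for illustration.
It assumes the label expression is honored only when the requested host is "ANY".

```java
import java.util.Optional;

/** Toy model of the observed behavior; not the Hadoop or Samza API. */
public class LabelRequestSketch {
    // Wildcard host name meaning "no preferred host".
    static final String ANY_HOST = "ANY";

    /**
     * Returns the label expression that would actually be honored for a request,
     * under the assumption that a label is dropped whenever a specific host is requested.
     */
    static Optional<String> effectiveLabel(String preferredHost, String configuredLabel) {
        if (configuredLabel == null || configuredLabel.isEmpty()) {
            return Optional.empty(); // no label configured for the job
        }
        if (ANY_HOST.equals(preferredHost)) {
            return Optional.of(configuredLabel); // first run: the label is used
        }
        return Optional.empty(); // specific host requested: the label is ignored
    }

    public static void main(String[] args) {
        // First run, no preferred host yet.
        System.out.println(effectiveLabel("ANY", "taskhighmem"));
        // Restart with a preferred host recorded from the previous run.
        System.out.println(effectiveLabel("node-17", "taskhighmem"));
    }
}
```

In this model the second call returns an empty result, which corresponds to the case above where
the container may land on any node that fits the <vcores, memory> conditions once the preferred
host is unavailable.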


- Maxim


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51633/#review151519
-----------------------------------------------------------


On Oct. 7, 2016, 12:08 a.m., Maxim Logvinenko wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/51633/
> -----------------------------------------------------------
> 
> (Updated Oct. 7, 2016, 12:08 a.m.)
> 
> 
> Review request for samza.
> 
> 
> Bugs: SAMZA-1013
>     https://issues.apache.org/jira/browse/SAMZA-1013
> 
> 
> Repository: samza
> 
> 
> Description
> -------
> 
> YARN node labels were introduced in Hadoop version 2.6. They allow grouping nodes with
similar characteristics and let applications specify where to run. This patch adds support
for YARN node labels in Samza.
> 
> In this implementation, node labels are defined directly in yarnConfig in YarnClusterResourceManager.
It might be better to have node labels as a part of SamzaResourceRequest and SamzaResource
classes, but org.apache.hadoop.yarn.api.records.Container class doesn't contain node label
and hence we have nothing to pass to the SamzaResource constructor in onContainersAllocated
method of YarnClusterResourceManager class.
> 
> 
> Diffs
> -----
> 
>   samza-yarn/src/main/java/org/apache/samza/config/YarnConfig.java 8f2dc48 
>   samza-yarn/src/main/java/org/apache/samza/job/yarn/YarnClusterResourceManager.java
96d3d7c 
>   samza-yarn/src/main/scala/org/apache/samza/job/yarn/ClientHelper.scala 0998c43 
> 
> Diff: https://reviews.apache.org/r/51633/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Maxim Logvinenko
> 
>

