flink-user mailing list archives

From Aneesha Kaushal <aneesha.kaus...@reflektion.com>
Subject Re: Flink Batch job: All slots for groupReduce task scheduled on same machine
Date Tue, 20 Feb 2018 10:09:45 GMT
Hello Aljoscha

I looked at the Subtasks section of the Flink Dashboard for the above two tasks.

Thanks
Aneesha

> On 20-Feb-2018, at 3:32 PM, Aljoscha Krettek <aljoscha@apache.org> wrote:
> 
> Hi,
> 
> Could you please also post where/how you see which tasks are mapped to which slots/TaskManagers?
> 
> Best,
> Aljoscha
> 
>> On 20. Feb 2018, at 10:50, Aneesha Kaushal <aneesha.kaushal@reflektion.com> wrote:
>> 
>> Hello, 
>> 
>> I have a Flink batch job in which I group a DataSet on some keys and then apply a group reduce. Parallelism is set to 16.
>> The slots for the Map task are distributed across all the machines, but for GroupReduce all the slots are assigned to the same machine. Can you help me understand why/when this can happen?
>> Code looks something like: 
>> dataset.map(MapFunction())
>>   .groupBy(<keys to group on>)
>>   .sortGroup(<key to sort on>, Order.DESCENDING)
>>   .reduceGroup(GroupReduceFunction()).name("Group reduce")
>> From the Flink dashboard:
>> 
>> <Screen Shot 2018-02-20 at 2.39.35 PM.png>
>> 
>> 
>> Thanks in advance
>> Aneesha
>> 
>> 
>> 
>> 
> 
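
The groupBy / sortGroup / reduceGroup chain in the quoted snippet can be mimicked in memory to make its per-group semantics concrete. The following is a minimal sketch using plain Java collections, not Flink code. The Event type, its key and score fields, and the choice to keep only the top record of each group are assumptions made for illustration. The sketch says nothing about how Flink assigns subtasks to slots or machines.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class GroupReduceSketch {
    // Hypothetical record type: a grouping key and a value to sort on.
    static final class Event {
        final String key;
        final int score;

        Event(String key, int score) {
            this.key = key;
            this.score = score;
        }
    }

    // In-memory analogue of groupBy(key).sortGroup(score, DESCENDING).reduceGroup(...).
    // The "reduce" step here is an assumption: it keeps the highest score of each group.
    static Map<String, Integer> topScorePerKey(List<Event> events) {
        // groupBy: collect all records that share a key into one group.
        Map<String, List<Event>> groups = new TreeMap<>();
        for (Event e : events) {
            groups.computeIfAbsent(e.key, k -> new ArrayList<>()).add(e);
        }

        Map<String, Integer> result = new TreeMap<>();
        for (Map.Entry<String, List<Event>> group : groups.entrySet()) {
            // sortGroup: order the records of this group by score, descending.
            List<Event> sorted = new ArrayList<>(group.getValue());
            sorted.sort(Comparator.comparingInt((Event e) -> e.score).reversed());
            // reduceGroup: consume the sorted group and emit one result for it.
            result.put(group.getKey(), sorted.get(0).score);
        }
        return result;
    }

    public static void main(String[] args) {
        List<Event> events = Arrays.asList(
                new Event("a", 3), new Event("b", 7), new Event("a", 9));
        System.out.println(topScorePerKey(events));
    }
}
```

Each group is handled as a unit, so all records with the same key have to reach the same place before the sort and reduce steps can run.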

