kafka-users mailing list archives

From Mike Kaplinskiy <m...@ladderlife.com.INVALID>
Subject Re: Parallel computation of windows in Flink
Date Sun, 09 Jun 2019 20:30:12 GMT
Sorry about that - this is definitely the wrong list. I meant to send that
to the Apache Beam list.

Ladder <http://bit.ly/1VRtWfS>. The smart, modern way to insure your life.


On Sat, Jun 8, 2019 at 12:44 PM Mike Kaplinskiy <mike@ladderlife.com> wrote:

> Hi everyone,
>
> I’m using a Kafka source with a lot of watermark skew (new partitions
> were added to the topic over time). The sink is a
> FileIO.write().withNumShards(1) to get roughly one file per day, plus an
> early trigger that writes at most 40,000 records per file. Unfortunately, it
> looks like a single thread is writing the files for all of the days, instead
> of multiple days' files being written in parallel. Is there anything I could
> do here to parallelize the process? All of this is with the Flink runner.
>
> Mike.
>
> Ladder <http://bit.ly/1VRtWfS>. The smart, modern way to insure your life.
>
