nifi-users mailing list archives

From Koji Kawamura <ijokaruma...@gmail.com>
Subject Re: Data inconsistency happens when using CDC to replicate my database
Date Wed, 16 Oct 2019 00:32:24 GMT
Hi Lei,

To address the FlowFile ordering issue with CaptureChangeMySQL, I'd
recommend placing an EnforceOrder processor, together with the FIFO
prioritizer, before any processor that requires precise ordering.
EnforceOrder can use the "cdc.sequence.id" attribute as its ordering key.
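
The idea behind ordering by a sequence attribute can be illustrated with a small sketch. This is not NiFi code and not EnforceOrder's actual implementation; it is a minimal Python illustration that assumes each event carries an integer sequence id increasing by one per event (as "cdc.sequence.id" is assumed to do here). The class name `SequenceReorderer` and the payload strings are hypothetical.

```python
import heapq


class SequenceReorderer:
    """Release events strictly in sequence-id order, buffering early arrivals.

    Hypothetical helper for illustration only; assumes ids are consecutive
    integers starting at `initial_id`.
    """

    def __init__(self, initial_id=0):
        self.next_id = initial_id  # next sequence id allowed to be released
        self._pending = []         # min-heap of (sequence_id, payload)

    def offer(self, sequence_id, payload):
        """Accept one event and return the events that can now be released."""
        if sequence_id < self.next_id:
            raise ValueError(f"sequence id {sequence_id} was already released")
        heapq.heappush(self._pending, (sequence_id, payload))
        released = []
        # Drain the buffer for as long as the smallest buffered id is the expected one.
        while self._pending and self._pending[0][0] == self.next_id:
            released.append(heapq.heappop(self._pending))
            self.next_id += 1
        return released


# Events arrive out of order, as they might after a queue reorders them.
reorderer = SequenceReorderer(initial_id=0)
arrivals = [(0, "insert"), (2, "update-b"), (1, "update-a"),
            (4, "update-d"), (3, "update-c")]
output = []
for seq, payload in arrivals:
    output.extend(reorderer.offer(seq, payload))

print(output)
```

An event that arrives early is held until every lower sequence id has been released, so the downstream consumer sees the events in their original order regardless of arrival order.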

Thanks,
Koji

On Tue, Oct 15, 2019 at 1:14 PM wanglei2@geekplus.com.cn
<wanglei2@geekplus.com.cn> wrote:
>
>
> It seems to be related to which prioritizer is used.
> The inconsistency occurs when the OldestFlowFileFirst prioritizer is used, but does
> not occur when the FirstInFirstOut prioritizer is used.
> But I have no idea why.
> Any insight on this?
>
> Thanks,
> Lei
>
>
> ________________________________
> wanglei2@geekplus.com.cn
>
>
> From: wanglei2@geekplus.com.cn
> Sent: 2019-10-15 08:08
> To: users
> Cc: dev
> Subject: Data inconsistency happens when using CDC to replicate my database
> I am using CaptureChangeMySQL to extract the binlog, do some translation, and then
> write to another database with the PutDatabaseRecord processor.
> But there is always data inconsistency between the destination database and the
> source database. To debug this, I have applied the following settings.
>
> CaptureChangeMySQL outputs only one table. There is a field called order_no that is
> unique in the table.
> All the processors are scheduled with only one concurrent task.
> No load balancing between nodes. Everything runs on the primary node.
> After CaptureChangeMySQL, I added a LogAttribute processor called log1. Before
> PutDatabaseRecord, I also added a LogAttribute processor, called log2.
>
> For the inconsistent data, I can grep the order_no in log1 and log2.
> For one specific order_no, there are 5 binlog messages in total. But in log1 there is
> only one message. In log2 there are 5, but the order is changed.
>
> position   type
> 201721167  insert (appeared in log1 and log2)
> 201926490  update (appeared only in log2)
> 202728760  update (appeared only in log2)
> 203162806  update (appeared only in log2)
> 203135127  update (appeared only in log2; the position number is smaller than that
>            of the previous message)
>
> This really confuses me.
> Any insight on this?  Thanks very much.
>
> Lei
>
> ________________________________
> wanglei2@geekplus.com.cn
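
The position table in the quoted message can be checked for ordering violations with a short script. The sketch below is a minimal illustration, assuming the positions have already been extracted from the log2 output in the order they appeared; the function name `find_order_violations` is hypothetical.

```python
def find_order_violations(positions):
    """Return (index, previous, current) for each entry smaller than its predecessor."""
    violations = []
    for i in range(1, len(positions)):
        if positions[i] < positions[i - 1]:
            violations.append((i, positions[i - 1], positions[i]))
    return violations


# Binlog positions in the order they appeared in log2, copied from the quoted table.
log2_positions = [201721167, 201926490, 202728760, 203162806, 203135127]
violations = find_order_violations(log2_positions)

for index, previous, current in violations:
    print(f"entry {index}: position {current} follows larger position {previous}")
```

For the five positions listed, only the last entry is flagged, which matches the observation that its position is smaller than that of the message before it.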
