nifi-users mailing list archives

From Boris Tyukin <bo...@boristyukin.com>
Subject Re: Ingesting golden gate messages to Hbase using Nifi
Date Mon, 05 Nov 2018 14:16:33 GMT
Hi Faisal, I am not Timothy, but you raise an interesting problem that we
might face soon as well. I did not expect the situation you described; I
had assumed that different transactions would have different op_ts values.

Our intent was to use op_ts to enforce order, but another option is to use
the GG rbc value or the Oracle rowscn value. Did you consider them? The GG
RBC should identify a unique transaction, and within every transaction you
can also get the operation number. You can also get the trail file number
and the trail file position. GG is really powerful and gives you a bunch of
data elements that you can enable on your message.

https://docs.oracle.com/goldengate/1212/gg-winux/GWUAD/wu_fileformats.htm#GWUAD735
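
As a rough sketch of the idea above (not a tested flow), the Python below
breaks ties between messages with the same op_ts by using the trail
position. It assumes each JSON message carries an op_ts string in the form
"YYYY-MM-DD HH:MM:SS.ffffff" and a zero-padded numeric "pos" field holding
the trail file number and position; both the field names and the formats
are assumptions and should be checked against your own messages.

```python
from datetime import datetime

# Assumed timestamp layout of op_ts; adjust to match your messages.
OP_TS_FORMAT = "%Y-%m-%d %H:%M:%S.%f"


def ordering_key(msg):
    """Build a sort key: op_ts first, then trail position as a tie-breaker.

    `op_ts` and `pos` are assumed field names, not confirmed by this thread.
    """
    op_ts = datetime.strptime(msg["op_ts"], OP_TS_FORMAT)
    trail_pos = int(msg["pos"])  # zero-padded digits compare correctly as int
    return (op_ts, trail_pos)


def latest(messages):
    """Return the message that should win for a given row key."""
    return max(messages, key=ordering_key)


if __name__ == "__main__":
    # Two operations with the same op_ts, arriving out of order.
    delete_op = {"op_type": "D", "op_ts": "2018-11-05 03:07:00.000000",
                 "pos": "00000000010000002000"}
    update_op = {"op_type": "U", "op_ts": "2018-11-05 03:07:00.000000",
                 "pos": "00000000010000001000"}
    ordered = sorted([delete_op, update_op], key=ordering_key)
    print([m["op_type"] for m in ordered])
```

Because the key depends only on values written by GG into the message, it
does not depend on which node or consumer group handled the message.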

Logdump is an awesome tool for looking into your trail files and seeing
what's in there.

Boris



On Mon, Nov 5, 2018 at 3:07 AM Faisal Durrani <te04.0172@gmail.com> wrote:

> Hi Timothy ,
>
> Hope you are doing well. We have been using your data flow (
> https://community.hortonworks.com/content/kbentry/155527/ingesting-golden-gate-records-from-apache-kafka-an.html#
> )
> with slight modifications to store the data in HBase. To version the rows
> we have been using the op_ts of the GoldenGate JSON. But now we have found
> that multiple transactions can have the same op_ts, e.g. an update and a
> delete can have the same op_ts, and if they arrive out of order at the
> PutHBaseJSON processor, the target table can go out of sync. I am using
> a cluster of NiFi nodes, so I cannot use the EnforceOrder processor to order
> the Kafka messages; as I understand it, it only orders the flow files on a
> single node and not across the cluster. Additionally, we have a separate
> topic for each table and we have several consumer groups. I tried using the
> current_ts column of the GoldenGate message, but if GG abends and
> restarts the replication, it will resend the past data with a newer
> current_ts, which will also put the table out of sync. I was wondering if
> you can give any ideas so that we can order our transactions correctly.
>
> Regards,
> Faisal
>
