nifi-users mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From DEHAY Aurelien <aurelien.de...@faurecia.com>
Subject RE: QueryDatabaseTable date/timestamp and Oracle.
Date Wed, 02 May 2018 13:36:04 GMT
Hello.

It’s too bad it’s so complicated when the oracle adapter should work.I don’t have a
relational database to store the records, so I’m trying to use kafka to store the last seen
max, queryrecord (which does not work I don’t know why) to determine it, but… it does
not work.

We will fallback to a python script in crontab, it’s ugly but at least it’s working.

From: Boris Tyukin [mailto:boris@boristyukin.com]
Sent: mardi 24 avril 2018 17:04
To: users@nifi.apache.org
Subject: Re: QueryDatabaseTable date/timestamp and Oracle.

IMHO A lot of stuff with Oracle requires custom tuning and unfortunately, a lot of generic
processors build to serve all JDBC databases would not have knowledge of Oracle intricacies.

I found out the same thing with Kafka JDBC connector that did crazy wrong stuff with data
coming from Oracle.

My solution was to build my own pipeline, using ExecuteSQL and persisting timestamps in MySql
table on a side. You will have more control that way and you can add your own audit and logging.
It was quite easy with NiFi and I was in full control of Oracle queries, executed by NiFi.

Boris


On Tue, Apr 24, 2018 at 6:04 AM, DEHAY Aurelien <aurelien.dehay@faurecia.com<mailto:aurelien.dehay@faurecia.com>>
wrote:
Hello.

I'm using Oracle 11g with nifi 1.6.0.

I'm trying to use QueryDatabaseTable to fetch newly modified/created rows from a table with
a few million records.

The field I use as maximum value column is "sysmodtime", a DATE field. When I put something
like 2018-04-23 00:00:00.0 in "initial.maxvalue.sysmodtime", the query times out. In a sql
command line, it takes several minutes to complete.

The query launched with nifi use as where clause: sysmodtime > timestamp '2018-04-23 00:00:00.0'.
With to_date, like in "select count(*) from hpsm.probsummarym1 where sysmodtime > to_date('2016-01-01
00:00:00', 'YYYY-MM-DD HH24;MI:SS')", it takes seconds.

The explanation seems about conversion between DATE type to TIMESTAMP type for every rows,
which obviously takes time.

Is there a workaround? I saw https://issues.apache.org/jira/browse/NIFI-2323 but it does not
really helps.

Thanks for any answer/pointer.

Aurélien DEHAY
Big Data Architect
+33 616 815 441
aurelien.dehay@faurecia.com<mailto:aurelien.dehay@faurecia.com>

2 rue Hennape - 92735 Nanterre Cedex - France



This electronic transmission (and any attachments thereto) is intended solely for the use
of the addressee(s). It may contain confidential or legally privileged information. If you
are not the intended recipient of this message, you must delete it immediately and notify
the sender. Any unauthorized use or disclosure of this message is strictly prohibited.  Faurecia
does not guarantee the integrity of this transmission and shall therefore never be liable
if the message is altered or falsified nor for any virus, interception or damage to your system.


This electronic transmission (and any attachments thereto) is intended solely for the use
of the addressee(s). It may contain confidential or legally privileged information. If you
are not the intended recipient of this message, you must delete it immediately and notify
the sender. Any unauthorized use or disclosure of this message is strictly prohibited.  Faurecia
does not guarantee the integrity of this transmission and shall therefore never be liable
if the message is altered or falsified nor for any virus, interception or damage to your system.
Mime
View raw message