nifi-users mailing list archives

From Koji Kawamura <ijokaruma...@gmail.com>
Subject Re: [EXT] New to Nifi - Failed to update database due to a failed batch update
Date Wed, 04 Oct 2017 02:28:21 GMT
Hi Aruna,

I think the FlowFile is currently being processed by PutDatabaseRecord.
When a processor starts processing a FlowFile, it takes the FlowFile from
the queue, but the FlowFile is not completely removed from the queue until
the processor finishes processing it and commits the process session.
During this phase, you see '1' FlowFile queued in the incoming
relationship, but you don't see anything when you list the queue
(because it's being processed).
PutDatabaseRecord executes a single batch operation against the database,
so I think you won't see the inserted rows until the whole batch operation
is committed to the table. That would be why you don't see any rows in
Postgres.
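
To illustrate the point about uncommitted batches, here is a minimal sketch in Python. It uses SQLite from the standard library as a stand-in for PostgreSQL, so it only shows the general transaction behaviour, not what PutDatabaseRecord does internally. The table name adr_sub, the helper names, and the sample rows are all made up for the example.

```python
import os
import sqlite3
import tempfile


def count_rows(conn):
    """Return the number of rows currently visible to this connection."""
    cur = conn.execute("SELECT COUNT(*) FROM adr_sub")
    try:
        return cur.fetchone()[0]
    finally:
        cur.close()


def visible_before_and_after_commit(rows):
    """Insert rows as one batch and report what a second connection sees.

    Returns (count seen before the commit, count seen after the commit).
    """
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "demo.db")  # throwaway database file
        writer = sqlite3.connect(path)  # plays the role of the loader
        reader = sqlite3.connect(path)  # plays the role of your SQL client
        try:
            writer.execute("CREATE TABLE adr_sub (enrlmt_id TEXT, city_name TEXT)")
            writer.commit()

            # The batch insert runs inside a transaction that is still open.
            writer.executemany("INSERT INTO adr_sub VALUES (?, ?)", rows)
            before = count_rows(reader)

            # Only after the commit do other connections see the rows.
            writer.commit()
            after = count_rows(reader)
            return before, after
        finally:
            reader.close()
            writer.close()
```

With three sample rows, the second connection should see 0 rows while the batch is still open and 3 rows once it is committed, which matches what you observe when you query the table while the processor is still running.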

I guess the current situation is simply that PutDatabaseRecord is slower
than you might expect.

Although there might be some room for PutDatabaseRecord performance
improvements, if the PostgreSQL COPY command is faster than
PutDatabaseRecord, I'd suggest executing the COPY command from NiFi using
PutSQL, since you already have the CSV file on the local file system.
http://www.postgresqltutorial.com/import-csv-file-into-posgresql-table/
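
As a rough sketch of what such a COPY statement could look like, the Python helper below builds the SQL text. The table and column names are taken from the INSERT statement quoted later in this thread; the helper name and the file path /tmp/adr_sub.csv are only placeholders. Note that COPY ... FROM 'file' reads the file on the database server's file system, so the path has to be readable by the PostgreSQL server process. The sketch also assumes the table and column names come from a trusted source, because it does not quote identifiers.

```python
def build_copy_statement(table, columns, csv_path, header=True):
    """Build a PostgreSQL COPY ... FROM statement for a CSV file.

    Identifiers are inserted as-is (assumed trusted); only the file path
    is escaped as a SQL string literal.
    """
    column_list = ", ".join(columns)
    options = "FORMAT csv"
    if header:
        # Tell PostgreSQL to skip the first line of the file.
        options += ", HEADER true"
    escaped_path = csv_path.replace("'", "''")
    return f"COPY {table} ({column_list}) FROM '{escaped_path}' WITH ({options})"


# Placeholder path; replace it with the real location of the CSV file.
statement = build_copy_statement(
    "ADR_SUB_NIFI",
    ["enrlmt_id", "city_name", "zip_cd", "state_cd"],
    "/tmp/adr_sub.csv",
)
```

The resulting text would then be the SQL that PutSQL is asked to execute.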

Usually a database-specific bulk loading command is faster than inserting
rows with generic SQL when synchronizing data for the first time.
After such an initial copy, a NiFi data flow will be very useful for
ingesting new/updated records.

Thanks,
Koji

On Wed, Oct 4, 2017 at 12:26 AM, Aruna Sankaralingam <
Aruna.Sankaralingam@cormac-corp.com> wrote:

> Hi Koji,
>
>
>
> I updated the run schedule and set to 120 secs. As soon as the file is
> fetched, I stop the processor. The file is successfully ingested into S3.
> But it is not getting loaded into postgres. When I right click on the
> flowfile which shows Queued 1 in screenshot below and clicked on “List
> Queue “, it says the queue has no flow files.
>
> I am not able to understand this issue.
>
>
>
>
>
>
>
> *From:* Koji Kawamura [mailto:ijokarumawak@gmail.com]
> *Sent:* Friday, September 29, 2017 12:26 AM
>
> *To:* Aruna Sankaralingam
> *Cc:* users@nifi.apache.org; karthi keyan
> *Subject:* Re: [EXT] New to Nifi - Failed to update database due to a
> failed batch update
>
>
>
> Hi Aruna,
>
>
>
> How is your GetFile scheduled and configured?
>
> You can check GetFile's "Run Schedule" on its "SCHEDULING" tab.
>
> By default, "Run Schedule" is set to 0 sec, meaning it is scheduled to run
> as often as possible if there's an available scheduler thread.
>
>
>
> I think GetFile is getting the same file over and over.
>
> Setting GetFile "Keep Source File" to "false" will remove a file after
> being read, so no duplication would occur.
>
>
>
> Or, you might want to use ListFile and FetchFile combination instead of
> GetFile.
>
> ListFile only picks newly added (or updated) files by checking file last
> modified timestamp.
>
>
>
> Thanks,
>
> Koji
>
>
>
> On Fri, Sep 29, 2017 at 12:49 AM, Aruna Sankaralingam <
> Aruna.Sankaralingam@cormac-corp.com> wrote:
>
> I don’t have access to nifi.sh – it says permission denied.
>
>
>
> This is from nifi-app.log
>
> 2017-09-28 11:21:25,765 INFO [Write-Ahead Local State Provider
> Maintenance] org.wali.MinimalLockingWriteAheadLog org.wali.
> MinimalLockingWriteAheadLog@1db9c2cf checkpointed with 9 Records and 0
> Swap Files in 4 milliseconds (Stop-the-world time = 1 milliseconds, Clear
> Edit Logs time = 0 millis), max Transaction ID 26
>
> 2017-09-28 11:21:27,614 INFO [pool-10-thread-1] o.a.n.c.r.WriteAheadFlowFileRepository
> Initiating checkpoint of FlowFile Repository
>
> 2017-09-28 11:21:27,668 INFO [pool-10-thread-1] org.wali.MinimalLockingWriteAheadLog
> org.wali.MinimalLockingWriteAheadLog@2a0881f1 checkpointed with 30
> Records and 0 Swap Files in 53 milliseconds (Stop-the-world time = 29
> milliseconds, Clear Edit Logs time = 12 millis), max Transaction ID 1452
>
> 2017-09-28 11:21:27,668 INFO [pool-10-thread-1] o.a.n.c.r.WriteAheadFlowFileRepository
> Successfully checkpointed FlowFile Repository with 30 records in 53
> milliseconds
>
> 2017-09-28 11:22:00,860 INFO [StandardProcessScheduler Thread-1] o.a.n.c.s.TimerDrivenSchedulingAgent
> Scheduled PutDatabaseRecord[id=bebef98b-015e-1000-7e58-2740cea32e78] to
> run with 1 threads
>
> 2017-09-28 11:22:01,232 INFO [Flow Service Tasks Thread-1]
> o.a.nifi.controller.StandardFlowService Saved flow controller
> org.apache.nifi.controller.FlowController@79e662c0 // Another save
> pending = false
>
> 2017-09-28 11:23:25,771 INFO [Write-Ahead Local State Provider
> Maintenance] org.wali.MinimalLockingWriteAheadLog org.wali.
> MinimalLockingWriteAheadLog@1db9c2cf checkpointed with 9 Records and 0
> Swap Files in 5 milliseconds (Stop-the-world time = 1 milliseconds, Clear
> Edit Logs time = 0 millis), max Transaction ID 26
>
> 2017-09-28 11:23:27,668 INFO [pool-10-thread-1] o.a.n.c.r.WriteAheadFlowFileRepository
> Initiating checkpoint of FlowFile Repository
>
> 2017-09-28 11:23:27,743 INFO [pool-10-thread-1] org.wali.MinimalLockingWriteAheadLog
> org.wali.MinimalLockingWriteAheadLog@2a0881f1 checkpointed with 30
> Records and 0 Swap Files in 74 milliseconds (Stop-the-world time = 35
> milliseconds, Clear Edit Logs time = 20 millis), max Transaction ID 1452
>
> 2017-09-28 11:23:27,743 INFO [pool-10-thread-1] o.a.n.c.r.WriteAheadFlowFileRepository
> Successfully checkpointed FlowFile Repository with 30 records in 74
> milliseconds
>
> 2017-09-28 11:25:25,777 INFO [Write-Ahead Local State Provider
> Maintenance] org.wali.MinimalLockingWriteAheadLog org.wali.
> MinimalLockingWriteAheadLog@1db9c2cf checkpointed with 9 Records and 0
> Swap Files in 5 milliseconds (Stop-the-world time = 1 milliseconds, Clear
> Edit Logs time = 0 millis), max Transaction ID 26
>
> 2017-09-28 11:25:27,743 INFO [pool-10-thread-1] o.a.n.c.r.WriteAheadFlowFileRepository
> Initiating checkpoint of FlowFile Repository
>
> 2017-09-28 11:25:27,883 INFO [pool-10-thread-1] org.wali.MinimalLockingWriteAheadLog
> org.wali.MinimalLockingWriteAheadLog@2a0881f1 checkpointed with 30
> Records and 0 Swap Files in 140 milliseconds (Stop-the-world time = 105
> milliseconds, Clear Edit Logs time = 17 millis), max Transaction ID 1452
>
> 2017-09-28 11:25:27,883 INFO [pool-10-thread-1] o.a.n.c.r.WriteAheadFlowFileRepository
> Successfully checkpointed FlowFile Repository with 30 records in 140
> milliseconds
>
>
>
> This is from nifi-bootstrap.log
>
>
>
> 2017-09-28 10:58:39,824 INFO [main] o.a.n.b.NotificationServiceManager
> Successfully loaded the following 0 services: []
>
> 2017-09-28 10:58:39,830 INFO [main] org.apache.nifi.bootstrap.RunNiFi
> Registered no Notification Services for Notification Type NIFI_STARTED
>
> 2017-09-28 10:58:39,830 INFO [main] org.apache.nifi.bootstrap.RunNiFi
> Registered no Notification Services for Notification Type NIFI_STOPPED
>
> 2017-09-28 10:58:39,830 INFO [main] org.apache.nifi.bootstrap.RunNiFi
> Registered no Notification Services for Notification Type NIFI_DIED
>
> 2017-09-28 10:58:39,874 INFO [main] org.apache.nifi.bootstrap.Command
> Starting Apache NiFi...
>
> 2017-09-28 10:58:39,875 INFO [main] org.apache.nifi.bootstrap.Command
> Working Directory: /var/nifi/home
>
> 2017-09-28 10:58:39,875 INFO [main] org.apache.nifi.bootstrap.Command
> Command: java -classpath /var/nifi/home/./conf:/var/
> nifi/home/./lib/jul-to-slf4j-1.7.25.jar:/var/nifi/home/./
> lib/slf4j-api-1.7.25.jar:/var/nifi/home/./lib/nifi-
> properties-1.2.0.jar:/var/nifi/home/./lib/javax.servlet-
> api-3.1.0.jar:/var/nifi/home/./lib/nifi-nar-utils-1.2.0.jar:
> /var/nifi/home/./lib/nifi-runtime-1.2.0.jar:/var/nifi/
> home/./lib/jetty-schemas-3.1.jar:/var/nifi/home/./lib/
> logback-classic-1.2.3.jar:/var/nifi/home/./lib/logback-
> core-1.2.3.jar:/var/nifi/home/./lib/nifi-api-1.2.0.jar:/var/
> nifi/home/./lib/jcl-over-slf4j-1.7.25.jar:/var/nifi/
> home/./lib/nifi-framework-api-1.2.0.jar:/var/nifi/home/./
> lib/log4j-over-slf4j-1.7.25.jar -Dorg.apache.jasper.compiler.disablejsr199=true
> -Xmx512m -Xms512m -Djava.security.egd=file:/dev/urandom -Dsun.net.http.allowRestrictedHeaders=true
> -Djava.net.preferIPv4Stack=true -Djava.awt.headless=true -XX:+UseG1GC
> -Djava.protocol.handler.pkgs=sun.net.www.protocol
> -Dnifi.properties.file.path=/var/nifi/home/./conf/nifi.properties
> -Dnifi.bootstrap.listen.port=35023 -Dapp=NiFi -Dorg.apache.nifi.bootstrap.
> config.log.dir=/var/nifi/home/logs org.apache.nifi.NiFi
>
> 2017-09-28 10:58:39,958 INFO [main] org.apache.nifi.bootstrap.Command
> Launched Apache NiFi with Process ID 2434
>
> 2017-09-28 10:58:40,755 INFO [NiFi Bootstrap Command Listener]
> org.apache.nifi.bootstrap.RunNiFi Apache NiFi now running and listening
> for Bootstrap requests on port 46741
>
>
>
>
>
> Just when I was getting these info, I noticed that data got inserted in
> the database. I didn’t do any changes since last night other than shutting
> down the AWS instance last night and started again today morning.
>
> But now again everything went back to 0 in the processor and FlowFile
> shows Queued 29 from Queued 30. All records got inserted in the table.
> Shouldn’t the flow file display 0?
>
> There is only one csv file with 959381 records totally.
>
>
>
>
>
> Hang on, it is still doing something. I am not able to understand what is
> happening now as I could see all the records got inserted already.
>
>
>
>
>
> I see duplicate records get inserted now.
>
>
>
>
>
>
>
>
>
>
>
> *From:* Koji Kawamura [mailto:ijokarumawak@gmail.com]
> *Sent:* Wednesday, September 27, 2017 10:15 PM
> *To:* Aruna Sankaralingam
> *Cc:* users@nifi.apache.org; karthi keyan
>
>
> *Subject:* Re: [EXT] New to Nifi - Failed to update database due to a
> failed batch update
>
>
>
> Hi Aruna,
>
>
>
> The XML files in the Gist page are NiFi Templates.
>
> You can import those XML from NiFi UI. Please look at this documentation
> for detail:
>
> https://nifi.apache.org/docs/nifi-docs/html/user-guide.html#Import_Template
>
>
>
> As to PutDatabase doing nothing, the '1' on the right top corner of
> PutDatabaseRecord indicates that one thread is running for this processor
> currently.
>
> That's strange if you don't see anything happening with it for 30 min, the
> thread may be blocked unexpectedly.
>
>
>
> If possible please take a thread dump with following command and share it
> with us:
>
> $NIFI_HOME/bin/nifi.sh dump
>
> Then thread dump is logged at
>
> $NIFI_HOME/logs/nifi-bootstrap
>
>
>
> Also, please share PutDatabaseRecord and its record reader configurations
> for further investigation.
>
>
>
> Thanks,
>
> Koji
>
>
>
>
>
> On Thu, Sep 28, 2017 at 1:48 AM, Aruna Sankaralingam <
> Aruna.Sankaralingam@cormac-corp.com> wrote:
>
> Thank you Koji. Could you please let me know how I can import the xml so
> that I can see them as nifi processors?
>
> I updated my flow as shown below. When I started PutDatabaseRecord, it is
> not doing anything. It’s been more than 30 mins. I don’t see any errors as
> well. How do I find out what is wrong?
>
>
>
>
>
> *From:* Koji Kawamura [mailto:ijokarumawak@gmail.com]
> *Sent:* Tuesday, September 26, 2017 10:22 PM
>
>
> *To:* users@nifi.apache.org
> *Cc:* karthi keyan
> *Subject:* Re: [EXT] New to Nifi - Failed to update database due to a
> failed batch update
>
>
>
> Hi Aruna,
>
>
>
> To explain details, I've summarized two different approaches to load a CSV
> file into a Table in this Gist page:
>
> https://gist.github.com/ijokarumawak/b37db141b4d04c2da124c1a6d922f81f
>
>
>
> One is using ConvertCSVToAvro and a few additional processors.
>
> I didn't use ReplaceText as I thought altering raw SQL string would be
> error prone.
>
> This approach should work with older version of NiFi (I see you're using
> NiFi 1.2.0 in your screenshot).
>
>
>
> The other way is to use PutDatabaseRecord.
>
> This is recommended if you're able to upgrade your NiFi installation.
>
>
>
> I hope you find these examples useful.
>
>
>
> Thanks,
>
> Koji
>
>
>
> On Tue, Sep 26, 2017 at 11:23 PM, Aruna Sankaralingam <
> Aruna.Sankaralingam@cormac-corp.com> wrote:
>
> I am not sure I understand. This is how my CSV looks.
>
>
>
>
>
> -----Original Message-----
> From: Koji Kawamura [mailto:ijokarumawak@gmail.com]
> Sent: Monday, September 25, 2017 8:19 PM
> To: users@nifi.apache.org
> Cc: karthi keyan
> Subject: Re: [EXT] New to Nifi - Failed to update database due to a failed
> batch update
>
>
>
> Hi Aruna,
>
>
>
> The placeholders in your ReplaceText configuration, such as '${city_name}'
> are NiFi Expression Language. If the incoming FlowFile has such FlowFile
> Attributes, those can be replaced with FlowFile Attribute values. But I
> suspect FlowFile doesn't have those attributes since ReplaceText is
> connected right after FetchS3Object.
>
>
>
> You need to extract values from FlowFile content into FlowFile attribute
> somehow, for example, if the data fetched from S3 is a JSON, use
> EvaluateJsonPath before ReplaceText.
>
>
>
> BTW, I think you don't need to use FetchS3Object because PutS3Object
> passes the data object to its 'success' relationship. You can connect
> 'success' relationship to downstream flow like:
>
> PutS3Object -> EvaluateJsonPath -> ReplaceText -> PutSQL
>
>
>
> Also if you can upgrade NiFi to 1.3.0, PutDatabaseRecord can make the flow
> simpler and more efficient:
>
> PutS3Object -> PutDatabaseRecord (with arbitrary RecordReader)
>
>
>
> Thanks,
>
> Koji
>
>
>
>
>
> On Tue, Sep 26, 2017 at 12:47 AM, Aruna Sankaralingam <
> Aruna.Sankaralingam@cormac-corp.com> wrote:
>
> > I updated the insert statement to be in a single line. Again it
>
> > failed. I checked the flow file.
>
> >
>
> >
>
> >
>
> > INSERT INTO ADR_SUB_NIFI (enrlmt_id, city_name, zip_cd, state_cd)
>
> > VALUES ('', '', '', '')
>
> >
>
> >
>
> >
>
> > What could be the reason for the values to be blank instead of actual
>
> > values from the CSV file?
>
> >
>
> >
>
> >
>
> > From: karthi keyan [mailto:karthi93.sankar@gmail.com
> <karthi93.sankar@gmail.com>]
>
> > Sent: Monday, September 25, 2017 7:15 AM
>
> > To: users@nifi.apache.org; Aruna Sankaralingam
>
> >
>
> >
>
> > Subject: Re: [EXT] New to Nifi - Failed to update database due to a
>
> > failed batch update
>
> >
>
> >
>
> >
>
> > Aruna,
>
> >
>
> >
>
> >
>
> > It seems the failure is in your insert statement. Don't split the
>
> > Replacement Value (query) in the ReplaceText processor into multiple
>
> > lines; try keeping it on a single line.
>
> >
>
> >
>
> >
>
> > -Karthik
>
> >
>
> >
>
> >
>
> > On Mon, Sep 25, 2017 at 4:20 PM, karthi keyan
>
> > <karthi93.sankar@gmail.com>
>
> > wrote:
>
> >
>
> > Aruna,
>
> >
>
> >
>
> >
>
> > You can download the flow file to see whether your query passed
>
> > correctly and try executing the same with your datasource.
>
> >
>
> >
>
> >
>
> > -Karthik
>
> >
>
> >
>
> >
>
> > On Mon, Sep 25, 2017 at 4:04 PM, Aruna Sankaralingam
>
> > <Aruna.Sankaralingam@cormac-corp.com> wrote:
>
> >
>
> > I clicked on that as well but nothing seemed to happen.
>
> >
>
> > Thanks
>
> >
>
> > Aruna
>
> >
>
> >
>
> > On Sep 25, 2017, at 4:33 AM, Peter Wicks (pwicks) <pwicks@micron.com>
> wrote:
>
> >
>
> > Use the Download button right next to View, then open it in a text
> editor.
>
> >
>
> >
>
> >
>
> > From: Aruna Sankaralingam [mailto:Aruna.Sankaralingam@Cormac-Corp.com
> <Aruna.Sankaralingam@Cormac-Corp.com>]
>
> > Sent: Monday, September 25, 2017 9:54 AM
>
> > To: users@nifi.apache.org
>
> > Subject: Re: [EXT] New to Nifi - Failed to update database due to a
>
> > failed batch update
>
> >
>
> >
>
> >
>
> > Hi, thank you for getting back. Could you please let me know how I can
>
> > see the contents of the flow file ? The view option doesn't seem to work
> for me.
>
> > Please see my last screenshot in my first email.
>
> >
>
> > Thanks
>
> >
>
> > Aruna
>
> >
>
> >
>
> > On Sep 24, 2017, at 8:52 PM, Peter Wicks (pwicks) <pwicks@micron.com>
> wrote:
>
> >
>
> > Hi Aruna,
>
> >
>
> >
>
> >
>
> > Since you are using ReplaceText, you can view the contents of the
>
> > FlowFile and check that you can copy/paste the SQL and execute it by
>
> > hand in Postgres.
>
> >
>
> >
>
> >
>
> > If all that works try setting the batch size on PutSQL to 1 record.
>
> > This will help check if it’s all records that are having trouble, or
>
> > just a few bad records.
>
> >
>
> >
>
> >
>
> > --Peter
>
> >
>
> >
>
> >
>
> > From: Aruna Sankaralingam [mailto:Aruna.Sankaralingam@Cormac-Corp.com
> <Aruna.Sankaralingam@Cormac-Corp.com>]
>
> > Sent: Saturday, September 23, 2017 2:57 AM
>
> > To: users@nifi.apache.org
>
> > Subject: [EXT] New to Nifi - Failed to update database due to a failed
>
> > batch update
>
> >
>
> >
>
> >
>
> > Hi,
>
> >
>
> >
>
> >
>
> > I am new to Nifi. I am trying to load a CSV file into S3 bucket and
>
> > then load into postgres database. Please see screenshots below. This
>
> > is what I have done. I am successful till “Replace Text”. But I am not
>
> > sure if the replace text is creating the insert query properly. When I
>
> > start the PutSQL, it fails with this error “Failed to update database
>
> > due to a failed batch update. There were a total of 30 FlowFiles that
>
> > failed, 0 that succeeded, and 0 that were not execute and will be routed
> to retry”
>
> >
>
> >
>
> >
>
> > I tried to see if I can find something in the failure flow file but
>
> > when I click on View or Download, nothing is happening. I would really
>
> > appreciate any kind of guidance to make this work.
>
> >
>
> >
>
> >
>
> > <image001.jpg>
>
> >
>
> >
>
> >
>
> >
>
> >
>
> > <image002.jpg>
>
> >
>
> >
>
> >
>
> > <image003.jpg>
