nifi-users mailing list archives

From James McMahon
Subject How to get a complete listing of flowfiles in a queue?
Date Tue, 14 Apr 2020 11:56:45 GMT
I have an issue with a ListFile processor. It does not appear to be
consuming all the raw data files that show up throughout the day in a
landing directory. My count at the end of the day is less than the count of
all the files in the directory. I suspect the cause is either the way
ListFile has been configured (right now we only accept files that are at
least 30 minutes old) or the fact that large numbers of files can arrive in
the same hh:mm, differentiated only by seconds or milliseconds. Perhaps
ListFile records its state only to the minute or the second (I notice that
all millisecond values in the epoch times shown in View State are 000), so
that on its next cycle it overlooks other files that share the same hh:mm
but are later by some seconds or milliseconds in file time. I'm
grasping for a logical cause at this point.
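
One way to probe the timestamp-granularity suspicion is to check how many files in the landing directory actually share a last-modified time once milliseconds are dropped. The following is only a minimal sketch under assumptions: the function name files_sharing_a_second is hypothetical, it reads timestamps straight from the filesystem rather than from NiFi, and it says nothing about how ListFile itself stores state.

```python
import os
from collections import defaultdict


def files_sharing_a_second(directory):
    """Group regular files by last-modified time truncated to whole seconds.

    Returns {epoch_second: [(filename, mtime_in_ms), ...]} for every second
    that contains more than one file. Hypothetical helper, not part of NiFi.
    """
    buckets = defaultdict(list)
    for entry in os.scandir(directory):
        if entry.is_file():
            # st_mtime_ns is an integer, so no float rounding is involved
            mtime_ms = entry.stat().st_mtime_ns // 1_000_000
            buckets[mtime_ms // 1000].append((entry.name, mtime_ms))
    # keep only the seconds in which at least two files landed
    return {sec: sorted(files) for sec, files in buckets.items() if len(files) > 1}
```

If the files that never reach the queue turn out to sit in these shared-second groups, that would support the granularity theory; if they do not, the 30-minute minimum age setting becomes the more likely suspect.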

I want to do a comparison of what I have read in so far today against an
exhaustive list of today's directory. My intention is that such a
comparison should flag gaps, which then may lead me to a cause.
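
A comparison like this could be sketched as follows, assuming the filenames already seen by ListFile have been exported to a plain text file, one name per line. That export file, its path, and the function name find_unlisted_files are all hypothetical; how the list gets out of NiFi is a separate problem.

```python
import os
from datetime import datetime, timezone


def find_unlisted_files(directory, listed_names_path):
    """Return (filename, mtime as ISO-8601 UTC) for files absent from the list.

    listed_names_path is assumed to be a text file with one filename per
    line, kept outside the directory being checked.
    """
    with open(listed_names_path, encoding="utf-8") as fh:
        listed = {line.strip() for line in fh if line.strip()}
    missing = []
    for entry in os.scandir(directory):
        if entry.is_file() and entry.name not in listed:
            mtime = datetime.fromtimestamp(entry.stat().st_mtime, tz=timezone.utc)
            missing.append((entry.name, mtime.isoformat(timespec="milliseconds")))
    # sort by timestamp so clusters of missed files are easy to spot
    return sorted(missing, key=lambda item: item[1])
```

Sorting the gaps by modification time makes it easier to see whether the missed files cluster around particular minutes or are spread evenly through the day.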

I am saving the output of the ListFile Success path to a queue that
persists it for 24 hours. I started this after all of yesterday's files had
stopped arriving, so the queue holds only flowfiles from today's
directory. Right now it totals 16,231 flowfiles. The "read only" directory
on the Linux system has nearly 20,000 files in it. Looking at the queue
from the UI isn't quite what I require: it only lets me view 100 flowfiles,
and I can't output the list.

Can I use the API or some other option to generate the complete list of
flowfiles in that queue? I hope to output a list that includes filename,
file.lastModifiedTime, and file.creationTime.
Thank you in advance for your help.
