nifi-users mailing list archives

From Charlie Frasure <charliefras...@gmail.com>
Subject Re: queued files
Date Fri, 20 Nov 2015 14:17:57 GMT
Thanks Joe,

The use case is that I'm receiving data without knowing which character set
it uses.  --mime-encoding gives its best guess at the character set rather
than the content type.

The ListFile sounds interesting, but I wonder if I really even need that.
I don't want to leave the files in place, I just want to run an external
command on them as part of the data flow.  Is there a way I can run an
external command against the physical file such as
/opt/nifi/somedir/12345.uuid?
Would that info be in an attribute somewhere?  It just seems wasteful to
make an extra copy of the file, in order to run a read-only command on it,
then delete it.  If ListFile is still the right way to go, please let me
know.
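
One possible way to avoid the extra copy, sketched under the assumption that
ExecuteStreamCommand pipes the FlowFile content to the command's standard
input: `file` accepts `-` as a filename meaning "read standard input", so no
path on disk is needed. The function name detect_encoding is made up for
illustration and is not part of NiFi or of `file`.

```shell
#!/bin/sh
# Hypothetical helper: guess the character encoding of whatever arrives on
# stdin. "-" tells file(1) to read standard input instead of a named file.
detect_encoding() {
  file -b --mime-encoding -
}

# Stand-in for the FlowFile content being streamed to the command.
printf 'hello\n' | detect_encoding
```

In the processor this would correspond to running `file` with the arguments
-b, --mime-encoding and - in place of ${filename}. How the arguments are
delimited depends on the processor configuration, so treat that part as an
assumption to verify.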


On Fri, Nov 20, 2015 at 6:45 AM, Joe Witt <joe.witt@gmail.com> wrote:

> For identifying the mime type you may have sufficient results with the
> existing processor 'IdentifyMimeType' which you can put into the flow.
>
> For better logic around identifying files to pull but first calling an
> external command to learn more about them the upcoming
> ListFile/FetchFile combo that comes from this JIRA [1] might give you
> better flexibility.
>
> [1] https://issues.apache.org/jira/browse/NIFI-631
>
> Thanks
> Joe
>
> On Fri, Nov 20, 2015 at 12:08 AM, Charlie Frasure
> <charliefrasure@gmail.com> wrote:
> > Thanks everyone for the help.  The trouble started a few processors earlier
> > in an ExecuteStreamCommand on ${filename} with the result of "file not
> > found".  I had originally set my GetFile processor to not remove files, but
> > recently changed that.  Now it seems that my ExecuteStreamCommand may not be
> > the best way to accomplish this.
> >
> > The command that gets executed is: file -b --mime-encoding ${filename}
> > in the working directory: ${absolute.path}
> >
> > Now that the file is no longer in the source directory when the processor
> > fires, the command is broken.  I could PutFile somewhere temporarily; is
> > there a better way?
> >
> > On Thu, Nov 19, 2015 at 10:33 PM, Joe Witt <joe.witt@gmail.com> wrote:
> >>
> >> Charlie,
> >>
> >> The fact that this is confusing is something we agree should be more
> >> clear and we will improve.  We're tackling it based on what is
> >> mentioned here [1].
> >>
> >> [1] https://cwiki.apache.org/confluence/display/NIFI/Interactive+Queue+Management
> >>
> >> Thanks
> >> Joe
> >>
> >> On Thu, Nov 19, 2015 at 10:30 PM, Corey Flowers <cflowers@onyxpoint.com>
> >> wrote:
> >> > These guys are right. The file to look in for the uuid is the nifi-app.log.
> >> > Also if you wanted to see what the processor itself was doing, you could
> >> > right click on the processor, get its uuid and while it is running, run
> >> > (assuming it is on Linux):
> >> >
> >> > tail -F nifi-app.log | grep uuid
> >> >
> >> > This will just scroll the logs for that specific processor and will show you
> >> > what it is doing. It should also tell you specific file names and uuids of
> >> > the failing files.
> >> >
> >> > Hope that helps! Have a great night and good luck!
> >> >
> >> > Sent from my iPhone
> >> >
> >> > On Nov 19, 2015, at 9:27 PM, Juan Sequeiros <hellojuan@gmail.com> wrote:
> >> >
> >> > You can also check the NiFi logs for a searchable id or for what the
> >> > previous processor ID produced to help search provenance.
> >> >
> >> > On Nov 19, 2015 21:22, "Bryan Bende" <bbende@gmail.com> wrote:
> >> >>
> >> >> Charlie,
> >> >>
> >> >> The behavior you described usually means that the processor encountered an
> >> >> unexpected error which was thrown back to the framework which rolls back the
> >> >> processing of that flow file and leaves it in the queue, as opposed to an
> >> >> error it expected where it would usually route to a failure relationship.
> >> >>
> >> >> Is the id that you see in the bulletin a uuid?
> >> >>
> >> >> There should still be some provenance events for this FlowFile from the
> >> >> previous points in the flow. If it looks like the uuid of the FlowFile, that
> >> >> should be searchable from provenance using the search button on the right.
> >> >> Let us know if we can help more.
> >> >>
> >> >> -Bryan
> >> >>
> >> >> On Thu, Nov 19, 2015 at 9:10 PM, Charlie Frasure
> >> >> <charliefrasure@gmail.com> wrote:
> >> >>>
> >> >>> I have a question on troubleshooting a flow.  I've built a flow with no
> >> >>> exception routing, just trying to process the expected values first.  When a
> >> >>> file exposes a problem with the logic in my flow, it queues up prior to the
> >> >>> flow that is raising the bulletin.
> >> >>>
> >> >>> In the bulletin, I can see an id, but can't tell which file it is.  Data
> >> >>> provenance doesn't seem to help as it passed the flow on the last processor,
> >> >>> but hasn't been logged (to my knowledge) on the next one.
> >> >>>
> >> >>> Is there a way to match the bulletin back to a file without creating a
> >> >>> route for failed files?
> >> >>
> >> >>
> >> >
> >
> >
>
