nifi-dev mailing list archives

From Tony Kurc <trk...@gmail.com>
Subject heisenbug causing "lost" content claims
Date Fri, 04 Mar 2016 19:57:03 GMT
All,
I wanted to describe an issue on a NiFi instance running 0.4.1, and explain
why diagnosing and reproducing it may be difficult. This is on a Linux
server under reasonably high load. The error happens infrequently, but when
it does, it really gums up operations.

At some point we get an IOException for too many open files, even though
the open-file limit set via ulimit is awfully high, so I'm not sure why that
is happening.
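
One way to narrow down the "too many open files" error is to compare the
descriptor count and limit that the JVM itself sees, since the limit that
applies is the one inherited by the running process, which may differ from
what ulimit reports in an interactive shell. The following is a minimal
sketch, not anything taken from NiFi: the class name FdUsage, the helper
isNearLimit, and the 0.8 threshold are hypothetical, and it assumes a
HotSpot/OpenJDK JVM on a Unix-like system where
com.sun.management.UnixOperatingSystemMXBean is available. It reports on
the JVM it runs in, so to be useful the same calls would need to run inside
the NiFi process (or be read from it over JMX).

```java
import java.lang.management.ManagementFactory;
import java.lang.management.OperatingSystemMXBean;

public class FdUsage {

    // Hypothetical helper: true when the open descriptor count has reached
    // the given fraction (0.0 to 1.0) of the process limit.
    static boolean isNearLimit(long open, long max, double threshold) {
        return max > 0 && open >= (long) (max * threshold);
    }

    public static void main(String[] args) {
        OperatingSystemMXBean os = ManagementFactory.getOperatingSystemMXBean();

        // The descriptor counters are only exposed on Unix-like platforms.
        if (os instanceof com.sun.management.UnixOperatingSystemMXBean) {
            com.sun.management.UnixOperatingSystemMXBean unix =
                    (com.sun.management.UnixOperatingSystemMXBean) os;
            long open = unix.getOpenFileDescriptorCount();
            long max = unix.getMaxFileDescriptorCount();

            System.out.println("open descriptors: " + open);
            System.out.println("descriptor limit: " + max);
            // 0.8 is an arbitrary warning threshold chosen for illustration.
            System.out.println("near limit: " + isNearLimit(open, max, 0.8));
        } else {
            System.out.println("descriptor counts not available on this JVM");
        }
    }
}
```

If the limit printed here is much lower than the ulimit value seen in a
shell, the process was probably started with different limits than
expected. If the limit matches and the open count climbs steadily, a
descriptor leak becomes the more likely explanation.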

Some time later, when a processor tries to read a flowfile, we get a
ContentNotFoundException, presumably because the flowfile points to content
that was never written. When this happens, we basically have to remove the
flowfile manually. That can cause operational challenges if no one is
watching at the moment, if the processor that reads isn't configured to
handle this, or if you're not using 0.5.x, where you can selectively remove
flowfiles from a queue.

Because this happens so infrequently, I'm not sure if others have seen
this. I'm not sure if something in the framework may need adjustment for
the case where a content claim goes wrong, but I really didn't expect that
a flowfile with no actual content could be created, which seems to be what
happened (rather than the content being deleted or corrupted).

Has anyone else experienced this, or does anyone know whether something in
0.5.x may have addressed it? Looking through the release notes, nothing
jumped out.

Tony
