nifi-users mailing list archives

From Tim Dean <tim.d...@gmail.com>
Subject Fine-grained control over when a NiFi processor can run
Date Fri, 13 Apr 2018 04:20:41 GMT
Hello,

I have a custom NiFi processor that invokes an external HTTP endpoint. That endpoint will
be hosted by services running at customer sites, and those customer sites require the ability
to define when the service can be called by my processor. Their goal is to prevent calls from
coming in during their peak hours so that they only have to process my requests during a configurable
set of off-peak hours.

Additionally, we have a goal of keeping the code that makes the HTTP request separate from
the code that checks whether we are currently within a time window in which requests are allowed.
This is not a strict requirement, but we have many scenarios in which we’d like to use the
HTTP request processor without any schedule restrictions, and still other scenarios in which
we’d like to check schedule restrictions before running other processors.

My first idea for this was to implement two separate custom processors: one to make the HTTP
request, and another to check the current time against the configured schedule restrictions.
Flow files would first enter the schedule restriction processor and transfer to a “success”
relationship only if the request is currently permitted by the schedule. That success
relationship would then be connected to the HTTP request processor.
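
To make the idea concrete, here is a rough sketch of the schedule restriction processor. The
class, property, and relationship names are placeholders I invented for illustration, and the
properties assume a simple daily HH:mm window:

import java.time.LocalTime;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

import org.apache.nifi.components.PropertyDescriptor;
import org.apache.nifi.flowfile.FlowFile;
import org.apache.nifi.processor.AbstractProcessor;
import org.apache.nifi.processor.ProcessContext;
import org.apache.nifi.processor.ProcessSession;
import org.apache.nifi.processor.Relationship;
import org.apache.nifi.processor.exception.ProcessException;
import org.apache.nifi.processor.util.StandardValidators;

public class CheckRequestWindow extends AbstractProcessor {

    // Placeholder properties for a simple daily window, e.g. 22:00 to 06:00
    static final PropertyDescriptor WINDOW_START = new PropertyDescriptor.Builder()
            .name("Window Start")
            .description("Start of the permitted window (HH:mm)")
            .required(true)
            .addValidator(StandardValidators.NON_EMPTY_VALIDATOR)
            .build();
    static final PropertyDescriptor WINDOW_END = new PropertyDescriptor.Builder()
            .name("Window End")
            .description("End of the permitted window (HH:mm)")
            .required(true)
            .addValidator(StandardValidators.NON_EMPTY_VALIDATOR)
            .build();

    static final Relationship REL_SUCCESS = new Relationship.Builder()
            .name("success")
            .description("The current time is inside the permitted window")
            .build();
    static final Relationship REL_DENIED = new Relationship.Builder()
            .name("denied")
            .description("The current time is outside the permitted window")
            .build();

    @Override
    protected List<PropertyDescriptor> getSupportedPropertyDescriptors() {
        return Arrays.asList(WINDOW_START, WINDOW_END);
    }

    @Override
    public Set<Relationship> getRelationships() {
        return new HashSet<>(Arrays.asList(REL_SUCCESS, REL_DENIED));
    }

    @Override
    public void onTrigger(ProcessContext context, ProcessSession session) throws ProcessException {
        FlowFile flowFile = session.get();
        if (flowFile == null) {
            return;
        }

        final LocalTime start = LocalTime.parse(context.getProperty(WINDOW_START).getValue());
        final LocalTime end = LocalTime.parse(context.getProperty(WINDOW_END).getValue());
        final LocalTime now = LocalTime.now();

        // Handle both same-day windows and windows that cross midnight
        final boolean allowed = start.isBefore(end)
                ? !now.isBefore(start) && now.isBefore(end)
                : !now.isBefore(start) || now.isBefore(end);

        session.transfer(flowFile, allowed ? REL_SUCCESS : REL_DENIED);
    }
}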

The potential problem I see with this is that flow files could back up for some reason between
the schedule restriction processor and the HTTP request processor. A flow file could pass
the schedule restriction check, wait for a while until the HTTP request processor picks up
the work, and then end up sending an HTTP request outside of the permitted schedule window.

I could avoid this problem completely by combining the logic into a single processor, but
that makes it harder to reuse the two pieces of logic in different combinations for the other
scenarios mentioned above.

I’m looking for other options to consider for addressing this workflow. I have a few thoughts:

1. Implement the HTTP processor independently, and then a second processor that subclasses
the first to add the schedule restrictions. This keeps the two bits of code separate but
doesn’t give as much flexibility as I’d like.

2. Just implement this as two separate processors and try to figure out some way in the flow
to prevent flow files from backing up between them (not sure if this is possible).

3. Implement the schedule restriction as a particular implementation of a controller service
interface, and have the HTTP request processor depend on an instance of that controller service.
Alternate implementations of that interface could be created that skip the schedule restriction
check. A rough sketch of what I have in mind for this option follows the list.
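
For the third option, here is a rough sketch of what I have in mind. The interface name
RequestWindowService, the always-allow implementation, and the property name are all
placeholders I made up for illustration:

import org.apache.nifi.controller.AbstractControllerService;
import org.apache.nifi.controller.ControllerService;

// Placeholder controller service interface: the HTTP processor only asks this
// one question, immediately before it sends a request.
public interface RequestWindowService extends ControllerService {
    boolean isRequestAllowed();
}

// (Separate file.) One implementation would evaluate the customer-configured
// schedule; another, like this one, simply allows every request for flows
// that have no restrictions.
public class AlwaysAllowWindowService extends AbstractControllerService implements RequestWindowService {
    @Override
    public boolean isRequestAllowed() {
        return true;
    }
}

The HTTP request processor would then declare a property that identifies the service and
consult it at the top of onTrigger, so the check happens right before the request goes out
rather than when the flow file was queued:

// Property on the HTTP request processor (placeholder name)
static final PropertyDescriptor WINDOW_SERVICE = new PropertyDescriptor.Builder()
        .name("Request Window Service")
        .description("Decides whether a request may be sent at this moment")
        .required(true)
        .identifiesControllerService(RequestWindowService.class)
        .build();

// At the start of the HTTP processor's onTrigger
final RequestWindowService window = context.getProperty(WINDOW_SERVICE)
        .asControllerService(RequestWindowService.class);
if (!window.isRequestAllowed()) {
    context.yield();  // outside the window: leave flow files queued and try again later
    return;
}
// ... pull a flow file with session.get() and perform the HTTP request ...

The appeal of this approach, as I see it, is that the time check and the HTTP call happen in
the same onTrigger invocation, so the back-up problem above goes away, and the always-allow
implementation covers the scenarios that have no schedule restrictions.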

Any thoughts on these approaches? Do any alternatives come to mind that I am missing?

Thanks in advance

-Tim