nifi-users mailing list archives

From Aldrin Piri <aldrinp...@gmail.com>
Subject Re: Best way to process the process the requests in batch
Date Mon, 06 Jun 2016 16:15:46 GMT
Kumiko,

To increase throughput at the level of the NiFi framework, you can raise
the number of concurrent tasks for the processor on the Scheduling tab of
its configuration dialog.  This allows more tasks to execute simultaneously,
providing greater throughput.  Along these lines, you could then optionally
run SplitText on your context file and treat each resulting line as a
separate event, which allows parallelization within the processor.
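
To make the idea concrete, here is a rough sketch in plain Python (not the
NiFi API) of what splitting plus concurrent tasks amounts to: each line of
the context data becomes its own unit of work, and several units run at
once.  The URL template and ZIP codes come from your message; the names
split_context, build_url, fetch and process_all are hypothetical, and fetch
is only a stand-in for whatever HTTP call your processor makes.

```python
from concurrent.futures import ThreadPoolExecutor

URL_TEMPLATE = "http://example{0}/weather"  # template from the original question


def split_context(context_text):
    """Split the context data into one entry per non-empty line (the SplitText step)."""
    return [line.strip() for line in context_text.splitlines() if line.strip()]


def build_url(entry):
    """Substitute one context entry for the {0} placeholder."""
    return URL_TEMPLATE.format(entry)


def fetch(url):
    """Hypothetical stand-in for the real HTTP call; returns a placeholder string."""
    return "response for " + url


def process_all(context_text, concurrent_tasks=4):
    """Handle each entry as an independent unit of work, several at a time."""
    urls = [build_url(entry) for entry in split_context(context_text)]
    with ThreadPoolExecutor(max_workers=concurrent_tasks) as pool:
        # map() returns results in input order even though calls run concurrently
        return list(pool.map(fetch, urls))
```

In NiFi itself the thread pool is replaced by the concurrent tasks setting,
and each split FlowFile plays the role of one entry in the list.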

Further improvements would likely center on the specific API(s) you are
interacting with and on your custom processor, both of which we could
explore if the above approaches do not work.

NiFi reports some metrics on a component's activity in its displayed stats,
but these are from the vantage point of FlowFiles; NiFi has no visibility
into what happens inside your processor.  In your case, sending a context
file as described would lead to several requests.  The SplitText approach
mentioned above could help with keeping track of a quota: it gives a
one-to-one mapping of endpoint request to FlowFile, which can then be used
in conjunction with a ControlRate processor [1].

[1]
http://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi.processors.standard.ControlRate/index.html
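
As an illustration only, the quota idea can be sketched in plain Python as a
counter over a fixed time window, where one call corresponds to one FlowFile
(and therefore one request).  This is not how ControlRate is implemented;
the class name RequestQuota and its parameters are hypothetical, and the
limit of 100 requests a day is the figure from your message.

```python
import time


class RequestQuota:
    """Allow at most max_requests per window_seconds (simple fixed window)."""

    def __init__(self, max_requests, window_seconds, clock=time.monotonic):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self.clock = clock  # injectable so the behavior can be tested without waiting
        self.window_start = clock()
        self.used = 0

    def try_acquire(self):
        """Return True and count the request if quota remains, else False."""
        now = self.clock()
        if now - self.window_start >= self.window_seconds:
            # A new window has begun: reset the counter
            self.window_start = now
            self.used = 0
        if self.used < self.max_requests:
            self.used += 1
            return True
        return False


# Example: at most 100 requests per day, as in the original question
daily_quota = RequestQuota(max_requests=100, window_seconds=24 * 60 * 60)
```

With one FlowFile per request, a rate-limiting step in front of the
processor can do this counting so that the processor itself does not have to.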


On Thu, May 26, 2016 at 8:34 PM, Kumiko Yada <Kumiko.Yada@ds-iq.com> wrote:

> Hello,
>
>
>
> We implemented a custom processor, similar to InvokeHTTP, in which part of
> the URL can be replaced with entries from the Context Data List; it then
> writes the weather to the FlowFile.  For example, the URL for the weather
> feed has to include the ZIP code, so the URL contains {0} as a placeholder,
> which is replaced with each ZIP code from the Context Data List property.
>
>
>
> URL
>
> http://example{0}/weather
>
>
>
> Context Data List:
>
> 00000
>
> 11111
>
> 22222
>
>
>
> The processor will make the following requests:
>
> http://example{0}/weather
>
>
>
> http://example00000/weather
>
> http://example11111/weather
>
> http://example22222/weather
>
>
>
> This processor handles one request at a time and has a performance
> issue.  I’d like to modify it to process in batches.  What is the best way
> to process in batches?  Also, does NiFi keep track of how many requests
> the processor has processed?  If so, how does NiFi keep track of this, and
> for how long does NiFi keep this data?  I’d like to add the quota priorities
> in this processor to keep track of the quota.  For example, if the weather
> feed allows only 100 requests a day, I don’t want the processor to
> execute once the quota is reached.
>
>
>
> Thanks
>
> Kumiko
>
