nifi-users mailing list archives

From Mike Thomsen <mikerthom...@gmail.com>
Subject Re: Nifi capabilities
Date Mon, 12 Aug 2019 15:44:25 GMT
Our recommendation to our clients is usually the same. We've saved them a
lot of money by focusing on a more "slow and steady wins the race" pace for
data ingestion, since there is no critical business need to get the data
ingested in anything close to real time.

On Mon, Aug 12, 2019 at 8:58 AM Pierre Villard <pierre.villard.fr@gmail.com>
wrote:

> It mainly depends on your workloads. NiFi is not memory-intensive unless
> you're doing specific operations on the data or using memory-intensive
> processors. For high performance you'd likely go for CPU-optimized VMs with
> attached SSDs for the repositories. But my recommendation is to start small
> and adapt your setup based on your needs and observations.
>
> On Mon, Aug 12, 2019 at 14:46, Dweep Sharma <dweep.sharma@redbus.com>
> wrote:
>
>> Thanks,
>>
>> We are almost entirely on the AWS cloud, and hardware/OS failures are very
>> unlikely.
>>
>> Can you please suggest a machine type on AWS? I am considering
>> m5.xlarge.
>>
>> I need to choose a machine type based on the following priorities:
>> 1) High Disk I/O
>> 2) Memory
>> 3) CPU
>>
>> -Dweep
>>
>> On Mon, Aug 5, 2019 at 5:32 PM Purushotham Pushpavanthar <
>> pushpavanthar@gmail.com> wrote:
>>
>>> Hi Dweep,
>>>
>>> I would like to add to Pierre Villard's insightful answer.
>>> 2) Because NiFi has at least 3 filesystem repositories, the same record is
>>> written and read multiple times at different stages of a single pipeline.
>>> This demands high IOPS. Scaling IOPS vertically is very costly and can
>>> sometimes become a roadblock; a cluster handles this better by load
>>> balancing flowfiles across nodes.
>>>
>>> Regards,
>>> Purushotham Pushpavanth
>>>
>>>
>>>
>>> On Mon, 5 Aug 2019 at 15:37, Pierre Villard <pierre.villard.fr@gmail.com>
>>> wrote:
>>>
>>>> Hi Dweep,
>>>>
>>>> I'll let others chime in, but here are some answers to your questions:
>>>>
>>>> 1) Yes - NiFi supports a very fine-grained authorization model and
>>>> several authentication mechanisms.
>>>> Authentication:
>>>> https://nifi.apache.org/docs/nifi-docs/html/administration-guide.html#user_authentication
>>>> Authorization:
>>>> https://nifi.apache.org/docs/nifi-docs/html/administration-guide.html#multi-tenant-authorization
>>>>
>>>> You can also find resources on the Internet on how to set up
>>>> authentication & authorization.
>>>>
>>>> 2) I'd say that it depends on your requirements and whether you need high
>>>> availability. From a pure performance standpoint, vertical scaling is
>>>> probably enough for your use case unless you have very large amounts of
>>>> data. Clustering will help you achieve even better performance (millions
>>>> of events per second), and will improve reliability in case of failure.
>>>>
>>>> 3) Yes, the data is persisted. There are some parameters that you can
>>>> tune based on your tolerance for data loss.
>>>> Example: nifi.flowfile.repository.always.sync - If set to true, any change
>>>> to the repository will be synchronized to the disk, meaning that NiFi will
>>>> ask the operating system not to cache the information. This is very
>>>> expensive and can significantly reduce NiFi performance. However, if it is
>>>> false, there could be the potential for data loss if either there is a
>>>> sudden power loss or the operating system crashes. The default value is
>>>> false.
>>>>
>>>> In other words, unless you have serious hardware/OS failures, you
>>>> should not lose any data, and everything will be persisted and restored
>>>> when NiFi restarts. If avoiding data loss is critical for your system,
>>>> using a broker like Kafka with the ability to replay events could be a
>>>> possible solution.
>>>>
>>>> 4) I recommend this awesome post by Bryan:
>>>> https://bryanbende.com/development/2016/09/15/apache-nifi-and-apache-kafka
>>>>
>>>> 5) There are some options available for the metrics. You can have a
>>>> look at reporting tasks for this purpose. A set of articles you can read
>>>> is available here:
>>>> https://pierrevillard.com/2017/05/11/monitoring-nifi-introduction/
>>>>
>>>> Hope this helps!
>>>> Pierre
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> On Mon, Aug 5, 2019 at 07:11, Dweep Sharma <dweep.sharma@redbus.com>
>>>> wrote:
>>>>
>>>>> Hi All,
>>>>>
>>>>> I have been using NiFi to set up some pipelines. Before I can
>>>>> absorb more use cases into this, I need to understand a few capabilities:
>>>>>
>>>>> 1) Can we set up user authentication in front of the web application? If
>>>>> yes, is there a way to have role-based access for process groups? I
>>>>> would like certain teams to work on only specific groups and not control
>>>>> all of them.
>>>>>
>>>>> 2) If the major use case only involves reading from RMQ and Kafka,
>>>>> converting to Parquet, and storing in S3, does it make sense to set up a
>>>>> cluster, or is vertical scaling good enough?
>>>>>
>>>>> 3) Are the flow files in the queues (connections between processors)
>>>>> persisted? Would any machine failure or restart cause a loss of data? For
>>>>> instance, messages are dequeued from RMQ and lost due to a failure. What
>>>>> would be the best way to handle this? I think maintaining a low back
>>>>> pressure threshold can help mitigate the loss.
>>>>>
>>>>> 4) Does the Kafka consumer consume all partitions by default, or is
>>>>> there a way to control that?
>>>>>
>>>>> 5) Can we have some of the processor metrics pushed out as
>>>>> notifications or alerts (flow file count in/out, errors, etc.)?
>>>>>
>>>>> It would be great if someone could share resources that address
>>>>> these.
>>>>>
>>>>> Thanks in advance.
>>>>>
>>>>> -Dweep
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> *::DISCLAIMER::
>>>>> ----------------------------------------------------------------
>>>>> The contents of this e-mail and any attachments are confidential and
>>>>> intended for the named recipient(s) only. E-mail transmission is not
>>>>> guaranteed to be secure or error-free as information could be
>>>>> intercepted, corrupted, lost, destroyed, arrive late or incomplete, or
>>>>> may contain viruses in transmission. The e mail and its contents (with
>>>>> or without referred errors) shall therefore not attach any liability on
>>>>> the originator or redBus.com. Views or opinions, if any, presented in
>>>>> this email are solely those of the author and may not necessarily
>>>>> reflect the views or opinions of redBus.com. Any form of reproduction,
>>>>> dissemination, copying, disclosure, modification, distribution and / or
>>>>> publication of this message without the prior written consent of
>>>>> authorized representative of redbus.com <http://redbus.in/> is strictly
>>>>> prohibited. If you have received this email in error please delete it
>>>>> and notify the sender immediately. Before opening any email and/or
>>>>> attachments, please check them for viruses and other defects.*
>>>>
>>>>
>
>
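
On the nifi.flowfile.repository.always.sync setting quoted in this thread: one
way to confirm what a given node is actually configured with is to read its
nifi.properties file. The following is only a minimal sketch, not an official
NiFi tool. The helper names (parse_properties, always_sync_enabled) and the
sample file contents are hypothetical, and the parser handles only plain
key=value lines. Per the description quoted in the thread, a missing key is
treated as the default, false.

```python
from typing import Dict

# Property name as quoted in the thread.
SYNC_KEY = "nifi.flowfile.repository.always.sync"


def parse_properties(text: str) -> Dict[str, str]:
    """Parse simple key=value lines, skipping blank lines and # comments."""
    props: Dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep:  # ignore lines that have no '=' at all
            props[key.strip()] = value.strip()
    return props


def always_sync_enabled(props: Dict[str, str]) -> bool:
    """Return True only if the property is explicitly set to true.

    Assumption: a missing key means the default (false), as the thread states.
    """
    return props.get(SYNC_KEY, "false").lower() == "true"


# Hypothetical excerpt of a nifi.properties file, for illustration only.
sample = """
# FlowFile repository settings
nifi.flowfile.repository.always.sync=false
"""

print(always_sync_enabled(parse_properties(sample)))
```

To check a real node, pass the contents of its nifi.properties file to
parse_properties instead of the sample string. Whether to enable the setting
remains the trade-off described in the thread: safer against sudden power loss
or OS crashes, but significantly slower.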
