nifi-dev mailing list archives

From Joseph Niemiec <josephx...@gmail.com>
Subject Re: [EXT] Re: NiFi Throughput and Slowness
Date Mon, 03 Jul 2017 19:10:27 GMT
Is there a reason you're using a RAID-5 array?

Write speed on RAID-5 is greatly influenced by the controller itself. We
typically do setups with RAID-10 today for better read/write performance.
RAID-5's parity scheme forces a read-modify-write cycle (read old data, read
old parity, write new data, write new parity), i.e. 4 physical I/Os for every
1 logical write, so take your raw IOPS divided by 4 at the very least. Don't
get me wrong, the IOPS provided by the SSDs are ample, but RAID-5 is not
something I see in the wild on new deployments anymore.
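To make the write-penalty arithmetic concrete, here is a small sketch using the standard textbook penalty factors (4 I/Os per write for RAID-5, 2 for mirrored levels); the raw IOPS figure below is purely illustrative:

```python
# Standard RAID write-penalty factors: each logical write costs this
# many physical I/Os on the array.
WRITE_PENALTY = {"raid0": 1, "raid1": 2, "raid10": 2, "raid5": 4, "raid6": 6}

def effective_write_iops(raw_iops, level):
    """Approximate sustained write IOPS the array can deliver."""
    return raw_iops // WRITE_PENALTY[level]

# e.g. an SSD array rated at a hypothetical 80,000 raw IOPS:
# RAID-5 leaves 20,000 effective write IOPS; RAID-10 leaves 40,000.
```

So even with fast SSDs, the RAID level alone can halve (or quarter) the write throughput the repositories actually see.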

Besides the RAID implementation I would definitely look at what Joe has
recommended.

On Mon, Jul 3, 2017 at 2:30 PM, Joe Witt <joe.witt@gmail.com> wrote:

> Karthik
>
> It is really important to follow best-practice configuration for placement
> of the repos on your underlying storage.  Configured well, you can get
> hundreds of MB/s sustained throughput per node.
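A sketch of what that placement can look like in nifi.properties (the mount points are hypothetical; the point is that each repository gets its own physical device):

```properties
# Illustrative repository placement: one physical device per repository.
# /data1..3 are assumed mount points, not NiFi defaults.
nifi.flowfile.repository.directory=/data1/flowfile_repository
nifi.content.repository.directory.default=/data2/content_repository
nifi.provenance.repository.directory.default=/data3/provenance_repository
```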
>
> Also be sure to take advantage of the record reader / writer capability if
> appropriate for your flow.  Configured well, you can achieve hundreds of
> thousands of records per second through a series of enrichments, SQL-based
> queries, and transformations, all with schema and format awareness, while
> many other flows happen at once.
>
> Also, while the RAM you have is awesome, the JVM might not be able to take
> advantage of it in a garbage-collection-friendly manner.  Consider dialing
> the heap way down to, say, 8GB.
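A minimal bootstrap.conf sketch of that suggestion (the 8GB figure is Joe's; the `java.arg` indices follow the numbering already used in the file quoted later in this thread):

```properties
# JVM memory settings: heap pinned at 8GB rather than tens of GB,
# to keep garbage-collection pauses manageable
java.arg.2=-Xms8g
java.arg.3=-Xmx8g
```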
>
> If you have SplitText in there, be sure it isn't splitting tens of thousands
> or more records at once.  You can do two-phase splits (first into coarse
> chunks, then into individual records) and see much better behavior.
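The two-phase idea can be sketched in plain Python (a simplified model, not NiFi code: the point is that no single step ever has to hold one output per record all at once):

```python
def two_phase_split(lines, chunk_size=1000):
    """Model of a two-phase split: coarse chunks first, records second.

    Phase 1 produces len(lines)/chunk_size intermediate chunks; phase 2
    splits each chunk into individual records. At no point does a single
    split step fan out into tens of thousands of outputs at once.
    """
    # Phase 1: coarse split into chunks of chunk_size lines
    chunks = [lines[i:i + chunk_size] for i in range(0, len(lines), chunk_size)]
    # Phase 2: each chunk is split into individual records downstream
    for chunk in chunks:
        for record in chunk:
            yield record
```

In a NiFi flow this corresponds to chaining two split processors, the first configured with a large line count and the second splitting down to single records.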
>
> With that hardware, sustained performance can be extremely high.  I'd say do
> a few RAID-1 pairs instead of RAID-5.  You can then partition the various
> repositories of NiFi to minimize contention on the same physical device and
> maximize throughput and response time.   You'll also probably need 10Gb
> NICs/network.
>
> And clustering is a powerful feature.  I'd avoid doing that until you have
> the right fundamentals at play in a single node and see both the sustained
> throughput and transaction rate you'd expect.
>
> Thanks
> Joe
>
>
>
>
>
> On Jul 3, 2017 1:18 PM, "Karthik Kothareddy (karthikk) [CONT - Type 2]" <
> karthikk@micron.com> wrote:
>
> Rick,
>
> Thanks a lot for the suggestion; clustering is something that I had been
> thinking of for a long time as well. I just wanted to see if anyone in the
> community has had similar problems and what solutions they found.
>
> -Karthik
>
> -----Original Message-----
> From: Richard St. John [mailto:rstjohn67@gmail.com]
> Sent: Monday, July 03, 2017 10:54 AM
> To: dev@nifi.apache.org; dev@nifi.apache.org
> Subject: [EXT] Re: NiFi Throughput and Slowness
>
> Hi there,
>
> In the beginning of our NiFi adoption, we faced similar issues. We
> clustered NiFi, limited the number of concurrent tasks for each processor,
> and added more logical partitions for the content and provenance
> repositories. Now we easily process millions of flowfiles per minute on a
> 5-node cluster with hundreds of processors in the data flow pipeline. When
> we need to ingest more data or process it faster, we simply add more nodes.
>
> First and foremost, clustering NiFi allows horizontal scaling: a must. It
> seems counterintuitive, but limiting the number of concurrent tasks was a
> major performance improvement. Doing so keeps the flow "balanced",
> preventing hotspots within the flow pipeline.
>
> I hope this helps
>
> Rick.
>
> --
> Richard St. John, PhD
> Asymmetrik
> 141 National Business Pkwy, Suite 110
> Annapolis Junction, MD 20701
>
> On Jul 3, 2017, 12:53 PM -0400, Karthik Kothareddy (karthikk) [CONT - Type
> 2] <karthikk@micron.com>, wrote:
> > All,
> >
> > I am currently using NiFi 1.2.0 on a Linux (RHEL) machine. I am using a
> single instance without any clustering. My machine has ~800GB of RAM and
> 2.5 TB of disk space (SSDs with RAID-5). I have set my Java heap space
> values as below in the “bootstrap.conf” file:
> >
> > # JVM memory settings
> > java.arg.2=-Xms40960m
> > java.arg.3=-Xmx81920m
> >
> > # Some custom Configurations
> > java.arg.7=-XX:ReservedCodeCacheSize=1024m
> > java.arg.8=-XX:CodeCacheMinimumFreeSpace=10m
> > java.arg.9=-XX:+UseCodeCacheFlushing
> >
> > Now, the problem I am facing when stress testing this instance is that
> whenever the Read/Write of data feeds reaches the limit of 5GB (at least
> that’s what I observed), the whole instance runs super slow, meaning the
> flowfiles move very slowly through the queues. It is heavily affecting the
> other process groups as well, which are very simple flows. I tried to read
> the system diagnostics at that point and saw that all the usage is below
> 20%, including heap usage, flowfile and content repository usage. I tried
> to capture the status history of the process group at that particular
> point, and below are some results.
> >
> >
> > [status history images not included in the archive]
> >
> > From the above images it is obvious that the process group is doing a
> lot of IO at that point. Is there a way to increase the throughput of the
> instance, given that my requirement has tons of reads/writes every hour?
> Also, to add: all my repositories (flowfile, content and provenance) are
> on the same disk. I tried to increase all the memory settings I possibly
> could in both bootstrap.conf and nifi.properties, but to no avail; the
> whole instance runs very slow and processes a minimal number of flowfiles.
> Just to check, I created a GenerateFlowFile processor while the system was
> slow, and to my surprise the rate of flowfiles generated was less than one
> per minute (it should fill the queue in less than 5 secs under normal
> circumstances). Any help on this would be much appreciated.
> >
> >
> > Thanks
> > Karthik
>



-- 
Joseph
