nifi-users mailing list archives

From "Peter Wicks (pwicks)" <pwi...@micron.com>
Subject RE: [EXT] Re: Maximum Memory for NiFi?
Date Mon, 08 Oct 2018 17:02:15 GMT
Bryan,

Our Min is set to 32 GB. Under normal conditions the heap does not exceed roughly 50% usage
(out of 70 GB), and is often lower. We collect and track these metrics, and over the last
30 days it's been closer to 35% usage.
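(For reference, these heap bounds live in NiFi's conf/bootstrap.conf. A minimal sketch,
assuming the stock java.arg numbering; check your own file for collisions:)

    # conf/bootstrap.conf -- JVM heap bounds as described above
    java.arg.2=-Xms32g
    java.arg.3=-Xmx70g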

But during database maintenance we have to shut down a lot of processors. FlowFiles start
to back up in the system across lots of different feeds. Then, when the database comes back
online, the combined processing of all these separate feeds catching up on backlog (lots of
different processors, not a single one) causes heap usage to spike. What we saw in the GC
logs was that we would reach 70 GB, GC would do a stop-the-world pause and bring us down to
about 65 GB; then we'd reach 70 GB again, and GC would only get us down to 68 GB. This
repeated until GC was trimming off just a few MB and running full collections every few
seconds, leaving the system inoperable.
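(The GC logs referenced here come from standard HotSpot logging flags in bootstrap.conf. A
sketch for Java 8; the argument numbers and log path are illustrative, not our exact file:)

    # conf/bootstrap.conf -- Java 8 GC logging; arg numbers and path are examples
    java.arg.13=-XX:+PrintGCDetails
    java.arg.14=-XX:+PrintGCDateStamps
    java.arg.15=-Xloggc:/var/log/nifi/gc.log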

We brought our cluster back online by:
 1. Shutting everything down
 2. Going into a single node and setting NiFi not to auto-resume state (see the sketch after
this list); we also set the maximum thread count to 10.
 3. We turned on a single node and verified we could process a single feed without crashing.
We then synchronized the flow to the rest of the nodes and brought them back online.
 4. We then manually turned feeds on to flush out backlogged data; of course, more data was
backing up on our edge servers while we did this.
 5. We decided to set threads to 140 per node (significantly lower than the 1,500 threads we
used to have) and the heap to 200 GB. We sized this at two threads per virtual core, plus
enough threads to cover all of the site-to-site input ports. It's weird, because NiFi used
to happily run 1,000+ threads per node all the time, yet it keeps up just as well now with
140 threads...
 6. With these settings in place we caught up on our backlog without running out of heap.
We maxed out around 100 GB of heap usage per node.
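(A sketch of the settings from steps 2 and 5. The auto-resume flag is a standard
nifi.properties property; the core and port counts below are illustrative, not our actual
hardware:)

    # conf/nifi.properties -- do not restart processors when a node comes up
    nifi.flowcontroller.autoResumeState=false

    # The maximum timer-driven thread count is set in the UI under
    # Controller Settings, not in a config file. Our sizing rule, roughly:
    #   threads = 2 * vcores + site-to-site input ports
    #   e.g.    = 2 * 64     + 12  = 140   (counts are examples only)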

--Peter

-----Original Message-----
From: Bryan Bende [mailto:bbende@gmail.com] 
Sent: Friday, October 5, 2018 7:26 AM
To: users@nifi.apache.org
Subject: [EXT] Re: Maximum Memory for NiFi?

Generally the larger the heap, the more likely to have long GC pauses.

I'm surprised that you would need a 70 GB heap given NiFi's design, where flow file content
is generally not held in memory, unless many of the processors you are using are not written
to process the content in a streaming fashion.
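To illustrate the difference (a hedged sketch against the standard ProcessSession API, not
any particular processor's code):

    import java.io.InputStream;
    import org.apache.nifi.flowfile.FlowFile;
    import org.apache.nifi.processor.ProcessSession;
    import org.apache.nifi.stream.io.StreamUtils;

    // Illustrative fragment of a processor's onTrigger() body
    void readContent(final ProcessSession session, final FlowFile flowFile) {
        // Heap-hungry: materializes the entire content in memory at once
        final byte[] whole = new byte[(int) flowFile.getSize()];
        session.read(flowFile, in -> StreamUtils.fillBuffer(in, whole));

        // Streaming: only a small fixed buffer is ever resident on the heap
        session.read(flowFile, (InputStream in) -> {
            final byte[] chunk = new byte[8192];
            int len;
            while ((len = in.read(chunk)) != -1) {
                // handle chunk[0..len) incrementally
            }
        });
    }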

Did you initially start out lower than 70 GB and need to increase it to that point? Just
wondering what happens at lower levels, like maybe 32 GB.

On Thu, Oct 4, 2018 at 4:20 PM Peter Wicks (pwicks) <pwicks@micron.com> wrote:
>
> We've had some more clustering issues, and found that some nodes are running out of
> memory when we have unexpected spikes in data; then we run into a GC stop-the-world
> event... We lowered our thread count, and that has allowed the cluster to stabilize for
> the time being.
>
> Our hardware is pretty robust; we usually have 1,000+ threads running on each node in
> the cluster (cumulatively ~4,000 threads). Each node has about 500 GB of RAM, but we've
> only been running NiFi with 70 GB, and it usually uses only 50 GB.
>
> I enabled GC logging, and after analyzing the data we decided to increase the heap size.
> We are experimenting with upping the max to 200 GB of heap to better absorb spikes in
> data. We are using the default G1GC.
>
> Also, how much impact is there from doing GC logging all the time? The metrics we are
> getting are really helpful for debugging/analyzing, but we don't want to slow down the
> cluster too much.
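(GC logging itself is generally inexpensive; the dominant cost is disk I/O, which Java 8
HotSpot can bound with rotation flags. A sketch, with illustrative argument numbers and
sizes:)

    # conf/bootstrap.conf -- cap GC log disk usage (Java 8 flags)
    java.arg.16=-XX:+UseGCLogFileRotation
    java.arg.17=-XX:NumberOfGCLogFiles=5
    java.arg.18=-XX:GCLogFileSize=10M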
>
> Thoughts on issues we might encounter? Things we should consider?
>
> --Peter