nifi-users mailing list archives

From Alan Jackoway <al...@cloudera.com>
Subject Re: Content Repository Growing Too Large
Date Wed, 13 Sep 2017 12:12:16 GMT
Hello again,

Sorry for disappearing on this thread. My email was filtering the nifi list
for some reason :(

I am looking into both of these possibilities. There was a stretch of time
when we were getting a lot of OOMEs, but they seem to be mostly associated
with one of our processors, not a nifi cleanup task. They also predate this
morning's out-of-disk-space incident by about a week.

I will see what I can do about the small flowfiles - I didn't realize that
a flowfile being in progress could keep another one from being cleaned up.
I also didn't realize that a flowfile being in progress could affect the
size of the content repository - I expected it to only affect the size of
the flowfile repository.
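
To make that interaction concrete, here is a minimal sketch of the idea, not
NiFi's actual code. It assumes that several flowfiles can share one file in
the content repository (as nifi.content.claim.max.flow.files=100 suggests)
and that the file can only be removed once no flowfile references it. The
ResourceClaim class and disk_held function are hypothetical names made up
for the illustration.

```python
from dataclasses import dataclass, field


@dataclass
class ResourceClaim:
    """Hypothetical stand-in for one file in the content repository."""
    size_bytes: int = 0
    live_flowfiles: set = field(default_factory=set)

    def append(self, flowfile_id: str, nbytes: int) -> None:
        # Content for a new flowfile is appended to the shared file.
        self.size_bytes += nbytes
        self.live_flowfiles.add(flowfile_id)

    def release(self, flowfile_id: str) -> None:
        # The flowfile finished; it no longer references this file.
        self.live_flowfiles.discard(flowfile_id)

    def reclaimable(self) -> bool:
        # Assumption: the file can be deleted only when nothing references it.
        return not self.live_flowfiles


def disk_held(claims) -> int:
    """Bytes that cannot be freed yet because some flowfile is still live."""
    return sum(c.size_bytes for c in claims if not c.reclaimable())


claim = ResourceClaim()
for i in range(100):
    claim.append(f"ff-{i}", 100 * 1024)  # 100 flowfiles of 100 KB each
for i in range(99):
    claim.release(f"ff-{i}")             # 99 finish; ff-99 is still queued

held_with_one_left = disk_held([claim])  # the whole file is still pinned
claim.release("ff-99")
held_after_all = disk_held([claim])      # nothing is pinned any more
```

In this model a single 100 KB flowfile that is still in progress keeps the
full shared file on disk, which is how the content repository could grow far
beyond the size of the data that is actually queued.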

This is happening again (which reminded me of the email). Right now our
content repository is 905GB and our flowfile repository is 238MB. The
queue/size on that node is 99,069 / 20.19 GB.
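
One way to track the gap between what is on disk and what is queued is a
small script along these lines. This is only a sketch: dir_size_bytes and
amplification are hypothetical helper names, and the directory to pass in is
whatever the nifi.content.repository.directory.* properties point at.

```python
import os


def dir_size_bytes(root: str) -> int:
    """Sum the sizes of all regular files under root (similar to `du`)."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                total += os.lstat(path).st_size
            except OSError:
                # Files can disappear while the repository is being cleaned.
                pass
    return total


def amplification(repo_size: float, queued_size: float) -> float:
    """How many times larger the repository is than the queued data."""
    return repo_size / queued_size
```

With the numbers above, amplification(905, 20.19) comes out to roughly 45,
meaning the content repository holds about 45 times more data than the queue
reports.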

I'll let you know if I find anything interesting as I try to clean it up.

Thanks,
Alan

On Sat, Sep 2, 2017 at 3:16 PM, Mark Payne <markap14@hotmail.com> wrote:

> Hey Alan,
>
> Any chance that you're seeing any OutOfMemory errors or anything like
> that? Any error logs about FlowFile Repository?
>
> Can you check the total size of your FlowFile Repository?
>
> The reason I ask is that someone reported a problem the other day in
> which an OOME prevented the FlowFile Repo from checkpointing. If that
> were to happen here, Content Claims would not be properly cleaned up.
>
> Thanks
> -Mark
>
> Sent from my iPhone
>
> > On Sep 1, 2017, at 5:24 PM, Alan Jackoway <alanj@cloudera.com> wrote:
> >
> > Hello,
> >
> > We have had issues with a few nifi instances recently where the content
> > repository grew too large and we couldn't get nifi to clean it up.
> >
> > In our current instance of this, we added a second disk to keep nifi
> > from consuming 100% of the available space, and now it is close to
> > consuming all space on both disks.
> >
> > Our nifi.properties looks like this:
> > # Content Repository
> > nifi.content.repository.implementation=org.apache.nifi.controller.repository.FileSystemRepository
> > nifi.content.claim.max.appendable.size=10 MB
> > nifi.content.claim.max.flow.files=100
> > nifi.content.repository.directory.default=/data/3/nifi_storage/edh-production-ingestion/content_repository
> > nifi.content.repository.directory.content2=/data/5/nifi_storage/edh-production-ingestion/content_repository
> > nifi.content.repository.archive.max.retention.period=1 hours
> > nifi.content.repository.archive.max.usage.percentage=50%
> > nifi.content.repository.archive.enabled=false
> > nifi.content.repository.always.sync=false
> > nifi.content.viewer.url=/nifi-content-viewer/
> >
> > Each of those content_repository directories is now consuming about
> > 750GB on its disk, and each disk is 1TB. The nifi's queue/size is
> > 53,084 / 3.42 MB but it does process a fairly large amount of data.
> >
> > How do we keep nifi from consuming our entire disks with content
> > repository? We turned off archive enabled in the hope that it would bring
> > the content repository size down to close to 0, but that has not worked
> > for us.
> >
> > Thanks,
> > Alan
> >
> > PS. Sorry for the Friday afternoon email. That's when disks always want
> > to get full.
>
