nifi-users mailing list archives

From James McMahon <jsmcmah...@gmail.com>
Subject Re: Cannot Restart Nifi
Date Tue, 28 Mar 2017 12:08:54 GMT
Thank you Joe. The command ulimit -a tells me that my open files are
limited to 1024, so I am not currently allowing the larger value you
indicated is needed. Aldrin referred me to the sys admin "best practices",
which seem to call for 50000. So I need to get with the sys administrator
of this box this morning and address this deficiency.
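
For reference, the same open-files limit can also be read from inside a
process. This is only a minimal sketch, assuming a Unix-like host and a
regular CPython interpreter run as the same user that starts NiFi; the
function names and the 50000 default are illustrative (the figure is the
one mentioned above), not anything NiFi itself provides.

```python
import resource


def open_file_limits():
    """Return the (soft, hard) open-file limits for the current process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def meets_minimum(minimum=50000):
    """True if the soft open-file limit is unlimited or at least `minimum`."""
    soft, _hard = open_file_limits()
    return soft == resource.RLIM_INFINITY or soft >= minimum


if __name__ == "__main__":
    soft, hard = open_file_limits()
    print("soft limit: %s, hard limit: %s" % (soft, hard))
    print("meets 50000: %s" % meets_minimum())
```

The soft limit is the value that ulimit -n reports by default. A process
inherits its limits from whatever launched it, so the check is only
meaningful when run the same way NiFi is started.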

If I crank that up to 50000 and then restart my box, is it possible I will
then be able to start NiFi *without* having to blow away my flowfile
repository? Because if that is the case, then that is the much preferred
path I will follow.  -Jim

On Tue, Mar 28, 2017 at 7:50 AM, Joe Witt <joe.witt@gmail.com> wrote:

> It would mean lost data.  It should not be necessary.
>
> As far as system config changes and specifically open file handles this
> one is very important.  Run 'ulimit -a' and see what it says for open file
> handles.  It must be larger than 1024.
>
> On Mar 28, 2017 7:46 AM, "James McMahon" <jsmcmahon3@gmail.com> wrote:
>
> Hi Aldrin. Yes sir, of course: my environment is NiFi v0.7. I have my
> content, flowfile, and provenance repositories on separate independent disk
> devices. In my nifi.properties file, nifi.flowfile.repository.partitions
> equals 256, and always.sync is false. My nifi.queue.swap.threshold is
> 20000. Since I am currently in development and so this is not a production
> process, I have set nifi.flowcontroller.autoResumeState to false.
> In conf/bootstrap.conf, my JVM memory settings are -Xms1024m and -Xmx4096m.
>
> In fact I have not yet applied the best practices from the Sys Admin
> Guide. I will speak with them about doing this today. I am a little
> hesitant to just jump into making the seven system changes you detail. NiFi
> does run on this box, but so do other processes that may be impacted.
> What's good for NiFi may not be good for these other processes, and so I
> want to ask first.
>
> My scripts employ a Python stream callback to grab values from select
> attributes, populate those into a Python dictionary object, generate a JSON
> object from that dictionary object, and replace the flowfile contents with
> that JSON object. These scripts are called by ExecuteScript
> processors. Similar scripts are used at various points throughout my
> workflow, near the end of each branch. Those had been working without any
> problems until I tried to introduce Python logging yesterday. I suspect I
> am not releasing file handler resources and logger objects as flowfiles
> flow through these ExecuteScript processors - maybe? I really am only
> making educated guesses at this stage. My first objective today is to get
> NiFi to come back up.
>
> Please tell me: while I am in a dev state right now, had I been in a
> production state what would have been the repercussions of deleting in its
> entirety the flowfile_repository, which includes all its journal files?
>
> Thanks very much in advance for your help.
>
> Jim
>
> On Tue, Mar 28, 2017 at 6:57 AM, Aldrin Piri <aldrinpiri@gmail.com> wrote:
>
>> Hi Jim,
>>
>> In getting to the root cause, could you please provide information on
>> your environment?  Did you apply the best practices listed in the System
>> Administrator's guide?  Could you provide some details on what your scripts
>> are doing?
>>
>> If the data is not of importance, removing the Flowfile Repo should get
>> you going. You can additionally remove the content repo, but this should be
>> cleaned up by the framework as no flowfiles will point to said content.
>>
>>
>> Aldrin Piri
>> Sent from my mobile device.
>>
>> On Mar 28, 2017, at 06:12, James McMahon <jsmcmahon3@gmail.com> wrote:
>>
>> I noticed, too, that I have many partitions, partition-0 to partition-255
>> to be exact. These all have journal files in them. So I suspect that the
>> journal file I cited is not specifically the problem in and of itself, but
>> instead is the point where the allowable open files threshold is reached.
>> I'm wondering if I have to recover by deleting all these partitions? -Jim
>>
>> On Tue, Mar 28, 2017 at 5:58 AM, James McMahon <jsmcmahon3@gmail.com>
>> wrote:
>>
>>> While trying to use Python logging from two scripts I call via two
>>> independent ExecuteScript processors, I seem to have inadvertently created
>>> a condition where I have too many files open. This is causing a serious
>>> challenge for me, because when I attempt to start nifi (v0.7.1) it fails.
>>>
>>> The log indicates that the flow controller cannot be started, and it
>>> cites the cause as this:
>>> org.apache.nifi.web.NiFiCoreException: Unable to start Flow Controller
>>> .
>>> . (many stack trace entries)
>>> .
>>> Caused by: java.nio.file.FileSystemException:
>>> /mnt/flow_repo/flowfile_repository/partition-86/83856.journal: Too many
>>> files open
>>>
>>> In a situation like this, what is the best practice for recovery? Is it
>>> permissible to simply delete this journal file? What are the negative
>>> repercussions of doing that?
>>>
>>> I did already try deleting my provenance_repository, but that did not
>>> allow nifi to restart. (NiFi did re-establish my provenance_repository at
>>> restart).
>>>
>>> Thanks very much in advance for your help. -Jim
>>>
>>
>>
>
>
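
On the suspicion raised earlier in the thread that the scripts are not
releasing file handlers: one common way Python logging leaks open files is
attaching a new FileHandler to the same named logger every time a script
runs. The sketch below shows a guard against that. It assumes the logging
module's state persists between script invocations, which has not been
confirmed for ExecuteScript; the function names, logger name, and log path
are hypothetical.

```python
import logging


def get_script_logger(name, log_path):
    """Return a named logger, attaching a file handler only on first use."""
    logger = logging.getLogger(name)
    # Loggers are shared per name within an interpreter, so adding a handler
    # on every call would open one more file each time.
    if not logger.handlers:
        handler = logging.FileHandler(log_path)
        handler.setFormatter(
            logging.Formatter("%(asctime)s %(levelname)s %(message)s"))
        logger.addHandler(handler)
        logger.setLevel(logging.INFO)
    return logger


def release_script_logger(name):
    """Detach and close every handler on the named logger."""
    logger = logging.getLogger(name)
    for handler in list(logger.handlers):
        logger.removeHandler(handler)
        handler.close()  # closes the underlying log file
```

Calling get_script_logger("my_script", "/tmp/my_script.log") repeatedly
leaves a single open log file, and release_script_logger("my_script")
closes it when the script is done with it.
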
