nifi-dev mailing list archives

From Mark Payne <marka...@hotmail.com>
Subject Re: Recovery failure
Date Wed, 06 Sep 2017 17:09:03 GMT
Joe,

If you wanted to go the route of truncating it, I would recommend starting with the
nifi-toolkit-flowfile-repo module and updating that. It already has all the dependencies
in place to read the repository and update it. You would want to just read each
transaction from a partition and write it to a new file until you hit the EOFException,
and then just discard that transaction.
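
A minimal sketch of that copy-until-EOF idea, for illustration only. It assumes a made-up record layout (a long transaction ID, an int payload length, then the payload bytes), which is NOT the real WALI partition format, and the class and method names (TruncatePartition, copyCompleteRecords) are hypothetical. A real tool would read and write transactions through the serialization code already available to the nifi-toolkit-flowfile-repo module.

```java
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

// Hypothetical helper: copies whole records and drops a trailing partial one.
final class TruncatePartition {

    /**
     * Copies every complete record from 'in' to 'out' and returns the count.
     * Assumed (hypothetical) record layout, not the real WALI format:
     *   long transactionId, int payloadLength, payloadLength bytes of payload.
     */
    static int copyCompleteRecords(InputStream in, OutputStream out) throws IOException {
        DataInputStream dis = new DataInputStream(in);
        DataOutputStream dos = new DataOutputStream(out);
        int copied = 0;
        while (true) {
            long txId;
            byte[] payload;
            try {
                txId = dis.readLong();
                int length = dis.readInt();
                if (length < 0) {
                    throw new IOException("Corrupt record: negative length " + length);
                }
                payload = new byte[length];
                dis.readFully(payload); // throws EOFException if the record is cut short
            } catch (EOFException e) {
                // Clean end of input or a record cut off mid-write:
                // nothing from this record has been written, so just stop.
                break;
            }
            // Only write a record once it has been read in full.
            dos.writeLong(txId);
            dos.writeInt(payload.length);
            dos.write(payload);
            copied++;
        }
        dos.flush();
        return copied;
    }
}
```

The record is buffered in full before anything is written, so the output never contains a partial transaction. Anything that follows the truncated record in the source is dropped as well, which matches the "discard that transaction" approach only if the bad record really is the last one in the file.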

The other option, not assuming that an EOFException implies the partition is out of data,
would mean updating MinimalLockingWriteAheadLog (in the nifi-commons/nifi-write-ahead-log
module). Around lines 472-479, you would update the logic so that if an Exception is
caught there, we call nextPartition.getNextRecoverableTransactionId() again if the
partition does actually have more data (this may require adding some sort of
isRecoveryDataAvailable() method or something like that on the Partition class).
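
A rough sketch of what that retry logic could look like, under stated assumptions. The RecoverablePartition interface below is a hypothetical stand-in, not the real Partition class: its method names are borrowed from this message, its signatures are guesses, and isRecoveryDataAvailable() does not exist in WALI today. The loop is not the actual code around lines 472-479.

```java
import java.io.EOFException;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

final class RecoveryLoopSketch {

    /** Hypothetical stand-in for the Partition class; signatures are assumptions. */
    interface RecoverablePartition {
        /** Returns the next transaction ID, or null when nothing is left to recover. */
        Long getNextRecoverableTransactionId() throws IOException;

        /** Proposed (not yet existing) check for remaining recovery data. */
        boolean isRecoveryDataAvailable();
    }

    /** Collects recoverable transaction IDs, skipping a transaction that ends in EOF. */
    static List<Long> recoverTransactionIds(RecoverablePartition partition) throws IOException {
        List<Long> recovered = new ArrayList<>();
        while (true) {
            Long txId;
            try {
                txId = partition.getNextRecoverableTransactionId();
            } catch (EOFException e) {
                // Do not assume EOF means "no more data": ask the partition first.
                // The partition must advance past the bad transaction, or this loops forever.
                if (partition.isRecoveryDataAvailable()) {
                    continue;
                }
                break;
            }
            if (txId == null) {
                break; // partition exhausted normally
            }
            recovered.add(txId);
        }
        return recovered;
    }
}
```

The key design point is that the EOFException path consults the partition before giving up, so a truncated transaction in the middle of the recovery files is skipped rather than treated as the end of the partition.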

Does this help?

Thanks
-Mark


On Sep 6, 2017, at 1:01 PM, Joe Gresock <jgresock@gmail.com> wrote:

Sorry, 144 was a typo.. there are 14 files.

Yes, it appears to have run out of disk space, so that's probably the root
cause.  Can you give me any ideas on how to carry out your two ideas?  How
would I look for the end of a record, so as to truncate it?

On Wed, Sep 6, 2017 at 4:55 PM, Mark Payne <markap14@hotmail.com> wrote:

Hmmm ok interesting... once it hits an EOFException it is assuming that
there is no more data in the partition. Clearly, there is, because it then
fails when calling endRecovery(). Did you perhaps run out of disk space on
your FlowFile Repo while it was running, or hit an OutOfMemoryError? Perhaps
that would leave a partial record behind (hence the EOFException), with
writing then continuing after it.

The fact that there are 144 files in that directory is also very odd...
there are generally only 1-2 files in that directory. Do all of your
partitions have that many files? Any errors before the restart about not
being able to checkpoint the FlowFile Repo?

At this point, I'm not entirely sure what can be done, other than to
perhaps try to manually truncate that last record in the Partition
that is causing the EOFException. Or perhaps the
MinimalLockingWriteAheadLog could be updated to not assume that EOFException
implies that the partition no longer has data in it. Unfortunately,
though, I'm not seeing any easy workaround.

On Sep 6, 2017, at 12:37 PM, Joe Gresock <jgresock@gmail.com> wrote:

Yes, I do see:
ERROR [main] org.wali.MinimalLockingWriteAheadLog
org.wali.MinimalLockingWriteAheadLog@1e620fe7 unexpectedly reached
End-of-File when reading from Partition-214 for Transaction ID 1918212626;
assuming crash and ignoring this transaction.

In that directory, I see 144 files, totalling ~120MB.  The first two files
are multi-megabyte files, and the other 12 are all either 7K or 4K.

On Wed, Sep 6, 2017 at 4:30 PM, Mark Payne <markap14@hotmail.com> wrote:

Joe,

Any other errors in the logs? Specifically, looking for errors that
contain the text:
unexpectedly reached End-of-File when reading from

or:
unexpectedly found End-of-File when reading from

This is not something that I've ever run into personally, but I'm looking
through the code, trying to understand what may cause this.

Also, if you look at the files in /data/nifi/flowfile_repository/partition-8,
how many files are there, and how large are they?

Thanks
-Mark



On Sep 6, 2017, at 12:22 PM, Joe Gresock <jgresock@gmail.com> wrote:

1.1.0. It's not on a system I can copy/paste from, but here's part of the
stack trace:

at org.wali.MinimalLockingWriteAheadLog$Partition.endRecovery(MinimalLockingWriteAheadLog.java:1047) ~[nifi-write-ahead-log-1.1.0.jar:1.1.0]
at org.wali.MinimalLockingWriteAheadLog.recoverFromEdits(MinimalLockingWriteAheadLog.java:487) ~[nifi-write-ahead-log-1.1.0.jar:1.1.0]
at org.wali.MinimalLockingWriteAheadLog.recoverRecords(MinimalLockingWriteAheadLog.java:301) ~[nifi-write-ahead-log-1.1.0.jar:1.1.0]

On Wed, Sep 6, 2017 at 4:13 PM, Mark Payne <markap14@hotmail.com> wrote:

Joe,

What version of NiFi are you running? Do you have a stack trace?

Thanks
-Mark


On Sep 6, 2017, at 11:59 AM, Joe Gresock <jgresock@gmail.com> wrote:

I'm wondering if there is a way to recover from this scenario:

ERROR [main] o.a.nifi.controller.StandardFlowService Failed to load flow
from cluster due to: org.apache.nifi.cluster.ConnectionException: Failed to
connect node to cluster due to: java.lang.IllegalStateException: Signaled
end to recovery, but there are more recovery files for Partition in
directory /data/nifi/flowfile_repository/partition-8

I have nearly a TB of files in my content_repository, so I'd really like to
be able to salvage this node, but I'm not sure how to proceed, as the node
won't start up.

--
I know what it is to be in need, and I know what it is to have plenty.
I
have learned the secret of being content in any and every situation,
whether well fed or hungry, whether living in plenty or in want.  I can
do
all this through him who gives me strength.    *-Philippians 4:12-13*





