manifoldcf-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Karl Wright <daddy...@gmail.com>
Subject Re: Error: Unexpected jobqueue status - record id X, expecting active status, saw 4 (MySQL compatible Database)
Date Thu, 06 Jun 2019 20:28:54 GMT
I just looked at this briefly.

For
https://gist.githubusercontent.com/schuch/16f617eed064b004cb69a2f929b08ca8/raw/46b9585bf73bb7eaf5e3b28aeb87610525afc9ca/Unexpected%2520jobqueue%2520status%2520-%2520record%2520id%25201522147023170,%2520expecting%2520active%2520status,%2520saw%25204,
look for the following record ID:

1522147023170

This is the one when we look at it after the document is processed
that it has status 4:

>>>>>>

manifoldcf_1     | 2019-06-06T02:01:05.380401275Z
2019-06-06T04:01:05,380 DEBUG [Worker thread '37']
org.apache.manifoldcf.diagnostics: ==== Forensics for record
1522147023170, current status: 4 ====

manifoldcf_1     | 2019-06-06T02:01:05.380512360Z
2019-06-06T04:01:05,380 DEBUG [Worker thread '37']
org.apache.manifoldcf.diagnostics: === Current stack trace ===
manifoldcf_1     | 2019-06-06T02:01:05.380544192Z java.lang.Exception:
Forensics stack trace
manifoldcf_1     | 2019-06-06T02:01:05.380548494Z 	at
org.apache.manifoldcf.crawler.jobs.TrackerClass.printForensics(TrackerClass.java:216)
[mcf-pull-agent.jar:?]
manifoldcf_1     | 2019-06-06T02:01:05.380552564Z 	at
org.apache.manifoldcf.crawler.jobs.JobQueue.updateCompletedRecord(JobQueue.java:1028)
[mcf-pull-agent.jar:?]
manifoldcf_1     | 2019-06-06T02:01:05.380568210Z 	at
org.apache.manifoldcf.crawler.jobs.JobManager.markDocumentCompletedMultiple(JobManager.java:3414)
[mcf-pull-agent.jar:?]

<<<<<<

It expects it to have status 5, though, because the only way the
document could be being worked on at all is if the document had been
put into status 5 by the stuffer thread:

>>>>>>

manifoldcf_1     | 2019-06-06T02:01:05.387210733Z
2019-06-06T04:01:05,381 DEBUG [Worker thread '37']
org.apache.manifoldcf.diagnostics: == About to commit transaction by
'Stuffer thread' at 1559786464972 ==
manifoldcf_1     | 2019-06-06T02:01:05.387250827Z java.lang.Exception:
Precommit stack trace
manifoldcf_1     | 2019-06-06T02:01:05.387256464Z 	at
org.apache.manifoldcf.crawler.jobs.TrackerClass.notePrecommit(TrackerClass.java:108)
~[mcf-pull-agent.jar:?]
manifoldcf_1     | 2019-06-06T02:01:05.387260987Z 	at
org.apache.manifoldcf.crawler.jobs.JobManager.fetchAndProcessDocuments(JobManager.java:3295)
~[mcf-pull-agent.jar:?]
manifoldcf_1     | 2019-06-06T02:01:05.387265034Z 	at
org.apache.manifoldcf.crawler.jobs.JobManager.getNextDocuments(JobManager.java:3014)
~[mcf-pull-agent.jar:?]
manifoldcf_1     | 2019-06-06T02:01:05.387269073Z 	at
org.apache.manifoldcf.crawler.system.StufferThread.run(StufferThread.java:186)
~[mcf-pull-agent.jar:?]
manifoldcf_1     | 2019-06-06T02:01:05.387285663Z
2019-06-06T04:01:05,387 DEBUG [Worker thread '37']
org.apache.manifoldcf.diagnostics:     Transaction includes:
manifoldcf_1     | 2019-06-06T02:01:05.387290117Z
2019-06-06T04:01:05,387 DEBUG [Worker thread '37']
org.apache.manifoldcf.diagnostics:       Record 1522147023170 status
modified to 5: Make active

...

<<<<<<

The actual commit follows a short while later to actually set the
status, and only AFTER that are the documents queued to be handed to
the worker threads.

So what this is saying is that the stuffer thread sets the document
status to 5 and then a worker thread gets it and when the worker
thread is done it tries to set the status back to 4 -- but it's
ALREADY at 4 and that is not possible.

Now, it may be worth examining the possibility that some other job
gets to the document while it is waiting to be queued, and does stuff
to the document's jobqueue record.  There are a couple of background
threads which (for example) reprioritize documents that might
conceivably change the jobqueue record's status.  They generally work
by changing the record status in a transaction to some other value
that would not make the record eligible for queuing, and then changing
it back when done.  But unless completely overlooked, these would,
however, be visible in the diagnostics, and furthermore it should not
be possible for one thread in a transaction to change the status at
the same time as another thread in a transaction changes the status.

So either way I'm forced to conclude that a record gets changed and
committed, but the change is not visible to another thread later
trying to reset the status.

Karl




On Thu, Jun 6, 2019 at 5:58 AM Karl Wright <daddywri@gmail.com> wrote:

> I can look at the log output but not until this weekend.
>
> Karl
>
>
> On Thu, Jun 6, 2019 at 3:59 AM Markus Schuch <
> markus.schuch@deutschebahn.com> wrote:
>
>> Hi Olivier,
>>
>>
>>
>> we were not able to fix this yet.
>>
>>
>>
>> But now we have new diagnostics log data, after the error occurred again
>> yesterday:
>>
>>
>>
>> Unexpected jobqueue status - record id 1522147023170, expecting active
>> status, saw 4
>>
>>
>> https://gist.githubusercontent.com/schuch/16f617eed064b004cb69a2f929b08ca8/raw/46b9585bf73bb7eaf5e3b28aeb87610525afc9ca/Unexpected%2520jobqueue%2520status%2520-%2520record%2520id%25201522147023170,%2520expecting%2520active%2520status,%2520saw%25204
>>
>>
>>
>> Unexpected jobqueue status - record id 1541529864711, expecting active
>> status, saw 4
>>
>>
>> https://gist.githubusercontent.com/schuch/d73672e44b0edd5a9184170c9ee417a5/raw/f55b1c023056ef8e310d99489a0b6f9dfcaaa585/Unexpected%2520jobqueue%2520status%2520-%2520record%2520id%25201541529864711,%2520expecting%2520active%2520status,%2520saw%25204
>>
>>
>>
>> @Karl, maybe you can get some ideas from that log data?
>>
>>
>>
>> Cheers,
>>
>> Markus
>>
>>
>>
>> *Von:* Olivier Tavard <olivier.tavard@francelabs.com>
>> *Gesendet:* Dienstag, 21. Mai 2019 11:00
>> *An:* user@manifoldcf.apache.org
>> *Betreff:* Re: Error: Unexpected jobqueue status - record id X,
>> expecting active status, saw 4 (MySQL compatible Database)
>>
>>
>>
>> Hi Markus,
>>
>>
>>
>> We have the same error (with postgresql database). Did the error occur
>> again since your last mail ?
>>
>> Did you change something on your MCF configuration to fix this ?
>>
>>
>>
>> Thanks,
>>
>> Best regards,
>>
>>
>>
>> Olivier
>>
>>
>>
>>
>>
>> Le 13 févr. 2019 à 13:58, Markus Schuch <markus.schuch@deutschebahn.com>
>> a écrit :
>>
>>
>>
>> Hi Karl,
>>
>>
>>
>> we set the diagnostigs logger to level debug.
>>
>>
>>
>> I will get back when the error occurs again.
>>
>>
>>
>> Cheers,
>>
>> Markus
>>
>>
>> ------------------------------
>>
>>
>> Pflichtangaben anzeigen
>> <http://www.deutschebahn.com/pflichtangaben/20190211>
>>
>> Nähere Informationen zur Datenverarbeitung im DB-Konzern finden Sie hier:
>>  http://www.deutschebahn.com/de/konzern/datenschutz
>>
>>
>>
>> Le 12 févr. 2019 à 17:41, Karl Wright <daddywri@gmail.com> a écrit :
>>
>>
>>
>> Hi Marcus,
>>
>>
>>
>> There's a properties.xml debugging logger you can enable that will keep
>> track of what's happening with transactions, so that when an error of this
>> kinds is reported, information about why the situation is unexpected is
>> dumped to the log.  The logger is called "diagnostics" e.g.
>> "org.apache.manifoldcf.diagnostics".
>>
>>
>>
>> Karl
>>
>>
>>
>>
>>
>> On Tue, Feb 12, 2019 at 10:53 AM Markus Schuch <
>> markus.schuch@deutschebahn.com> wrote:
>>
>> Hi,
>>
>> we are seeing "Error: Unexpected jobqueue status - record id
>> 1484612513829, expecting active status, saw 4" from time to time.
>> I didn’t report it, because we were running on an old MCF version, and
>> there were some bugreports relating to this error that are resolved in
>> newer versions.
>>
>> Now we have upgraded to latest and greatest and the error still occurred.
>> So want to start to track this down.
>>
>> Out setup is:
>>
>>    - ManifoldCF 2.12, running in a Docker Container based on Redhat
>>    Linux, OpenJDK 8
>>    - AWS RDS Database (Aurora MySQL -> 5.6 compatible)
>>    - Single Process Setup
>>
>>
>> We run several Jobs (run once mode) over night with schedule windows and
>> max runtime settings. Some of the time windows are overlapping.
>> At normal days some Jobs finish during their time window, some go to
>> WAITING and will go on in the next night.
>>
>> The error mostly hits at a Sharepoint Repository Job.
>> We have the impression, that the error is somehow related to situations,
>> when service interruptions (e.g. connection issues) occur in other jobs.
>>
>>
>>
>> Maybe all this is related to fundamental problems with the database or
>> the JDBC driver
>>
>> (Karl expressed his concerns about that in the maybe-related
>> https://issues.apache.org/jira/browse/CONNECTORS-1581 ticket)
>>
>> On the other hand, there were similar bugreports in the past concerning
>> unexpected jobqueue status and there were resolved.
>>
>> Mayby there is a chance this can also be analyzed and fixed, but I do not
>> really know where to start.
>>
>>
>> Has anybody ideas how we can track this down / debug this to get to the
>> bottom of the problem?
>> What could be the first step in the analysis?
>>
>> Many thanks in advance
>> Markus
>>
>> May be related bugreports:
>>
>>
>>
>> [0] https://issues.apache.org/jira/browse/CONNECTORS-1395
>> [1] https://issues.apache.org/jira/browse/CONNECTORS-1180
>> [2] https://issues.apache.org/jira/browse/CONNECTORS-590
>> [3] https://issues.apache.org/jira/browse/CONNECTORS-246
>>
>>
>>
>> --
>>
>> Markus Schuch
>>
>> Web Business (T.IPB 26)
>>
>>
>>
>> DB Systel GmbH
>>
>> Jürgen-Ponto-Platz 1, 60329 Frankfurt a. Main
>>
>>
>> ------------------------------
>>
>>
>> Pflichtangaben anzeigen
>> <http://www.deutschebahn.com/pflichtangaben/20190211>
>>
>> Nähere Informationen zur Datenverarbeitung im DB-Konzern finden Sie hier:
>> http://www.deutschebahn.com/de/konzern/datenschutz
>>
>>
>>
>>
>> ------------------------------
>>
>> Pflichtangaben anzeigen
>> <http://www.deutschebahn.com/pflichtangaben/20190529>
>>
>> Nähere Informationen zur Datenverarbeitung im DB-Konzern finden Sie hier:
>> http://www.deutschebahn.com/de/konzern/datenschutz
>>
>

Mime
View raw message