Hi Ahmet,

Sorry, Googlemail has a bug and keeps sending my mail before I am ready.

First, the following error indicates that a transaction should be retried:

org.apache.manifoldcf.core.interfaces.ManifoldCFException: Database exception: SQLException doing query (40001): ERROR: could not serialize access due to read/write dependencies among transactions

The code to retry is already there, as is the code in the DBInterfacePostgreSQL.java class to catch the exception.  But the failure here actually happens while trying to print out the EXPLAIN for a long-running query, and I don't think we've ever seen an EXPLAIN take such a long time before.
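
To illustrate what "should be retried" means for a 40001 error, here is a minimal sketch of a retry wrapper. This is not the ManifoldCF code; the class name SerializationRetrySketch, the method withRetry, and the backoff values are all made up for the example. The only real facts it relies on are that PostgreSQL reports serialization failures with SQLSTATE 40001 and that JDBC exposes this through SQLException.getSQLState().

```java
import java.sql.SQLException;
import java.util.concurrent.Callable;

// Hypothetical sketch only; this is NOT ManifoldCF's actual retry logic.
final class SerializationRetrySketch {
    // SQLSTATE that PostgreSQL reports for serialization failures.
    static final String SERIALIZATION_FAILURE = "40001";

    // Runs the work, retrying only when it fails with SQLSTATE 40001.
    // Any other failure, or running out of attempts, is rethrown to the caller.
    static <T> T withRetry(Callable<T> work, int maxAttempts) throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        for (int attempt = 1; ; attempt++) {
            try {
                return work.call();
            } catch (SQLException e) {
                boolean retryable = SERIALIZATION_FAILURE.equals(e.getSQLState());
                if (!retryable || attempt >= maxAttempts) {
                    throw e;
                }
                // A real caller would roll back the failed transaction here
                // before starting over; the sleep is an arbitrary backoff.
                Thread.sleep(10L * attempt);
            }
        }
    }
}
```

The point is that whatever runs the EXPLAIN would need to go through the same kind of wrapper as the query itself, rather than treating a 40001 as a hard failure.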

The second error occurs because the transaction has been aborted by PostgreSQL but ManifoldCF isn't yet aware of it.  When ManifoldCF sees a database error it does not recognize, it tries to reset all connections.  This logic may or may not work properly; I have seen it hang before, however.
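
As a rough illustration of the distinction between the two errors, the sketch below sorts a SQLException by its SQLSTATE. It is an assumption about how one might structure this, not what ManifoldCF currently does; the class SqlStateSketch, the Action enum, and classify are hypothetical names. The two codes themselves are the ones in the log: 40001 (serialization failure) and 25P02 (current transaction is aborted).

```java
import java.sql.SQLException;

// Hypothetical sketch only; names and structure are invented for illustration.
final class SqlStateSketch {
    enum Action {
        RETRY_TRANSACTION,   // 40001: the whole transaction can be run again
        ROLLBACK_FIRST,      // 25P02: the transaction is already dead, roll it back
        RESET_CONNECTIONS    // anything unrecognized: fall back to a full reset
    }

    // Maps the SQLSTATE carried by the exception to a recovery action.
    static Action classify(SQLException e) {
        String state = e.getSQLState();
        if ("40001".equals(state)) {
            return Action.RETRY_TRANSACTION;
        }
        if ("25P02".equals(state)) {
            return Action.ROLLBACK_FIRST;
        }
        return Action.RESET_CONNECTIONS;
    }
}
```

In the log below, the 25P02 error is the one that ends up on the reset path, which is where the hang can come from.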

So I think what has happened is: (a) you had a really long-running "addDocuments()" transaction, and (b) it was so long that it tried to print an EXPLAIN for it, and (c) that failed.  Then the reset logic hung ManifoldCF.

So there are two bugs here:
- Reset logic hangs ManifoldCF sometimes
- EXPLAIN may require retry

Can you create tickets for both of these?

Thanks,

Karl



On Mon, Jun 24, 2013 at 8:05 AM, Karl Wright <daddywri@gmail.com> wrote:
Hi Ahmet,

Several things are happening here.

First, the following error indicates that a transaction should be retried:

What is happening is that the database connections are being pooled, and they are


On Mon, Jun 24, 2013 at 7:59 AM, Ahmet Arslan <iorixxx@yahoo.com> wrote:
Hello All,

I have an MCF 1.2 setup (with postgresql-9.2) where I crawl some newspaper sites using Web connectors.

I use the following settings for jobs:

Maximum hop count for link type 'link': 1
Maximum hop count for link type 'redirect': Unlimited
Hop count mode: No deletes, forever

Start method: Start at beginning of schedule window
Schedule type: Scan every document once
Maximum run time: 90 minutes

I scheduled jobs to run every two hours. However, after some time the crawl hangs. I found these exceptions in the log.

What could be wrong? Any suggestions?

Thanks,
Ahmet

ERROR 2013-06-24 10:39:34,999 (Worker thread '1') - Worker thread aborting and restarting due to database connection reset: Database exception: SQLException doing query (25P02): ERROR: current transaction is aborted, commands ignored until end of transaction block
org.apache.manifoldcf.core.interfaces.ManifoldCFException: Database exception: SQLException doing query (25P02): ERROR: current transaction is aborted, commands ignored until end of transaction block
at org.apache.manifoldcf.core.database.Database.executeViaThread(Database.java:717)
at org.apache.manifoldcf.core.database.Database.executeUncachedQuery(Database.java:745)
at org.apache.manifoldcf.core.database.Database$QueryCacheExecutor.create(Database.java:1430)
at org.apache.manifoldcf.core.cachemanager.CacheManager.findObjectsAndExecute(CacheManager.java:144)
at org.apache.manifoldcf.core.database.Database.executeQuery(Database.java:186)
at org.apache.manifoldcf.core.database.DBInterfacePostgreSQL.performQuery(DBInterfacePostgreSQL.java:822)
at org.apache.manifoldcf.crawler.jobs.JobManager.addDocuments(JobManager.java:4148)
at org.apache.manifoldcf.crawler.system.WorkerThread$ProcessActivity.processDocumentReferences(WorkerThread.java:2017)
at org.apache.manifoldcf.crawler.system.WorkerThread$ProcessActivity.flush(WorkerThread.java:1948)
at org.apache.manifoldcf.crawler.system.WorkerThread.run(WorkerThread.java:562)
Caused by: org.postgresql.util.PSQLException: ERROR: current transaction is aborted, commands ignored until end of transaction block
at org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse(QueryExecutorImpl.java:2102)
at org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:1835)
at org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:257)
at org.postgresql.jdbc2.AbstractJdbc2Statement.execute(AbstractJdbc2Statement.java:500)
at org.postgresql.jdbc2.AbstractJdbc2Statement.executeWithFlags(AbstractJdbc2Statement.java:388)
at org.postgresql.jdbc2.AbstractJdbc2Statement.executeQuery(AbstractJdbc2Statement.java:273)
at org.apache.manifoldcf.core.database.Database.execute(Database.java:862)
at org.apache.manifoldcf.core.database.Database$ExecuteQueryThread.run(Database.java:677)
ERROR 2013-06-24 10:39:33,473 (Worker thread '1') - Explain failed with error Database exception: SQLException doing query (40001): ERROR: could not serialize access due to read/write dependencies among transactions
  Detail: Reason code: Canceled on identification as a pivot, during conflict out checking.
  Hint: The transaction might succeed if retried.
org.apache.manifoldcf.core.interfaces.ManifoldCFException: Database exception: SQLException doing query (40001): ERROR: could not serialize access due to read/write dependencies among transactions
  Detail: Reason code: Canceled on identification as a pivot, during conflict out checking.
  Hint: The transaction might succeed if retried.
at org.apache.manifoldcf.core.database.Database.executeViaThread(Database.java:717)
at org.apache.manifoldcf.core.database.Database.executeUncachedQuery(Database.java:745)
at org.apache.manifoldcf.core.database.DBInterfacePostgreSQL.explainQuery(DBInterfacePostgreSQL.java:1233)
at org.apache.manifoldcf.core.database.Database$QueryCacheExecutor.create(Database.java:1449)
at org.apache.manifoldcf.core.cachemanager.CacheManager.findObjectsAndExecute(CacheManager.java:144)
at org.apache.manifoldcf.core.database.Database.executeQuery(Database.java:186)
at org.apache.manifoldcf.core.database.DBInterfacePostgreSQL.performQuery(DBInterfacePostgreSQL.java:822)
at org.apache.manifoldcf.crawler.jobs.JobManager.addDocuments(JobManager.java:4148)
at org.apache.manifoldcf.crawler.system.WorkerThread$ProcessActivity.processDocumentReferences(WorkerThread.java:2017)
at org.apache.manifoldcf.crawler.system.WorkerThread$ProcessActivity.flush(WorkerThread.java:1948)
at org.apache.manifoldcf.crawler.system.WorkerThread.run(WorkerThread.java:562)
Caused by: org.postgresql.util.PSQLException: ERROR: could not serialize access due to read/write dependencies among transactions
  Detail: Reason code: Canceled on identification as a pivot, during conflict out checking.
  Hint: The transaction might succeed if retried.
at org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse(QueryExecutorImpl.java:2102)
at org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:1835)
at org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:257)
at org.postgresql.jdbc2.AbstractJdbc2Statement.execute(AbstractJdbc2Statement.java:500)
at org.postgresql.jdbc2.AbstractJdbc2Statement.executeWithFlags(AbstractJdbc2Statement.java:388)
at org.postgresql.jdbc2.AbstractJdbc2Statement.executeQuery(AbstractJdbc2Statement.java:273)
at org.apache.manifoldcf.core.database.Database.execute(Database.java:862)
at org.apache.manifoldcf.core.database.Database$ExecuteQueryThread.run(Database.java:677)