1/ The database we use is PostgreSQL version 9.6.
2/ I will look in the logs to see what is happening with the
queries.
3/ We run VACUUM FULL ANALYZE every 24 hours, and
for each table we set the reindex threshold to 5000000 (in
properties.xml) with the line:
Is there a setting that makes it possible to
disable the reindex requested by ManifoldCF?
(1) What database are you using for this? Some databases
require maintenance periodically or have other heavy usage
(2) Every time a query takes more than a minute to
execute, it is logged, along with the query plan. You need to
look at the ManifoldCF log to see which queries are
problematic before concluding anything.
(3) For every database table, you can individually
configure how many table operations approximately occur before
MCF re-analyzes the table. However, it's likely that you have
the opposite problem: a bad query plan for the query that
queues documents for processing. That may mean you need more frequent
analysis to prevent it. But we cannot tell until we
understand which queries are taking a long time.
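As a starting point for point (2), here is a minimal Python sketch that pulls slow-query entries out of a ManifoldCF log and lists them slowest first. It assumes each such warning contains the phrase "long-running query" followed by the elapsed time in milliseconds and the query text in square brackets; that wording is an assumption, so check it against your own log and adjust the pattern. The function name `slow_queries` and the `min_ms` parameter are made up for this example.

```python
import re
import sys
from typing import Iterable, List, Tuple

# Assumed wording of the warning line; adjust to match your actual log.
LONG_QUERY = re.compile(r"long-running query \((\d+) ms\): \[(.*)\]")


def slow_queries(lines: Iterable[str], min_ms: int = 60_000) -> List[Tuple[int, str]]:
    """Return (elapsed_ms, query) pairs at or above min_ms, slowest first."""
    hits = []
    for line in lines:
        match = LONG_QUERY.search(line)
        if match and int(match.group(1)) >= min_ms:
            hits.append((int(match.group(1)), match.group(2)))
    return sorted(hits, key=lambda hit: hit[0], reverse=True)


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python slow_queries.py <path to the ManifoldCF log>
    with open(sys.argv[1], encoding="utf-8", errors="replace") as log:
        for elapsed_ms, query in slow_queries(log):
            print(f"{elapsed_ms / 1000:.0f}s  {query}")
```

If the same query keeps showing up at the top of the list, that is the one whose plan is worth looking at first.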
We use ManifoldCF v2.10 with PostgreSQL
(9.6) to crawl our websites.
This represents approximately 1.2 million
We split the crawl into 4 jobs that
distribute their results across 3 Solr collections.
The crawl performs well up to 500000
documents (25000 to 30000 docs/hour), then
performance drops sharply as it progresses. We observe
very long freezes; you might think that the crawl
We suspect reindexing, notably of the
intrinsiclink table, which is very large: 85 million
Is it possible to disable the re-indexing
controlled by ManifoldCF?
Any other ideas?