manifoldcf-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Markus Schuch <markus.sch...@deutschebahn.com>
Subject AW: Amazon RDS for PostgreSQL Support
Date Thu, 30 Nov 2017 07:33:20 GMT
Hi Karl,

we disabled autovacuum
The stats table show there was no autovac since then.

The long running queries still occur.
There are no other apps using the database and no other jobs running.

But there is another long running query to the jobs table between the carrydown queries.
This query seem to happen at the same time with the long running carrydown query.

The plan output says “Sec Scan on jobs…”.

2017-11-30 08:16:16,008 WARN  [Finisher thread] org.apache.manifoldcf.db: Found a long-running
query (169057 ms): [SELECT id FROM jobs WHERE  (status=? OR status=? OR status=?) FOR UPDATE]
2017-11-30 08:16:16,009 WARN  [Finisher thread] org.apache.manifoldcf.db:   Parameter 0: 'A'
2017-11-30 08:16:16,009 WARN  [Finisher thread] org.apache.manifoldcf.db:   Parameter 1: 'W'
2017-11-30 08:16:16,009 WARN  [Finisher thread] org.apache.manifoldcf.db:   Parameter 2: 'R'
2017-11-30 08:16:16,011 WARN  [Finisher thread] org.apache.manifoldcf.db:  Plan: LockRows
 (cost=0.00..4.43 rows=3 width=14) (actual time=0.022..0.024 rows=1 loops=1)
2017-11-30 08:16:16,011 WARN  [Finisher thread] org.apache.manifoldcf.db:  Plan:   -> 
Seq Scan on jobs  (cost=0.00..4.40 rows=3 width=14) (actual time=0.021..0.022 rows=1 loops=1)
2017-11-30 08:16:16,011 WARN  [Finisher thread] org.apache.manifoldcf.db:  Plan:         Filter:
((status = 'A'::bpchar) OR (status = 'W'::bpchar) OR (status = 'R'::bpchar))
2017-11-30 08:16:16,011 WARN  [Finisher thread] org.apache.manifoldcf.db:  Plan:         Rows
Removed by Filter: 22
2017-11-30 08:16:16,011 WARN  [Finisher thread] org.apache.manifoldcf.db:  Plan: Planning
time: 0.093 ms
2017-11-30 08:16:16,011 WARN  [Finisher thread] org.apache.manifoldcf.db:  Plan: Execution
time: 0.041 ms
2017-11-30 08:16:16,011 WARN  [Finisher thread] org.apache.manifoldcf.db:

Jobs table indices:
public    jobs       jobs_pkey                          CREATE UNIQUE INDEX jobs_pkey ON jobs
USING btree (id)
public    jobs       i1511464305264                               CREATE INDEX i1511464305264
ON jobs USING btree (status, id, priority)
public    jobs       i1511464305263                               CREATE INDEX i1511464305263
ON jobs USING btree (status, processid)
public    jobs       i1511464305262                               CREATE INDEX i1511464305262
ON jobs USING btree (connectionname)
public    jobs       i1511464305261                               CREATE INDEX i1511464305261
ON jobs USING btree (failtime)

Shouldn’t be i1511464305264 be used?

Many thanks in advance
Markus

Von: Karl Wright [mailto:daddywri@gmail.com]
Gesendet: Mittwoch, 29. November 2017 23:28
An: user@manifoldcf.apache.org
Betreff: Re: Amazon RDS for PostgreSQL Support

The plans look good for the carrydown execution, so I have to conclude that the long-running
queries are due to other considerations -- perhaps concurrent vacuuming, perhaps other queries
pounding the database.

Sharepoint is especially hard on the carrydown table -- it gets huge.

Karl


On Wed, Nov 29, 2017 at 4:46 PM, Markus Schuch <markus.schuch@deutschebahn.com<mailto:markus.schuch@deutschebahn.com>>
wrote:
Hi,

since nobody responded, we started to experiment.

Setup:
AWS RDS Postgres 9.6.3
db.m4.xlarge (4 cores, 16 GB RAM)
provisioned iops 4000

Indexing Performance for most jobs is pretty good, except a for a bigger sharepoint crawl
(~200.000 docs).

We are seeing a lot of long running queries for the tables carrydown here.

Logfile:
https://gist.githubusercontent.com/schuch/b553dc3f825e0c640a52ce3f530ec3e0/raw/34e493ef520d2272b25f7d9d8acc3f8417550c17/gistfile1.txt

Settings:
https://gist.githubusercontent.com/schuch/44fcbb312df37ab89a910024a56ff835/raw/eba6ff9bc82026f181d5687584e9dd99cdffb242/postgresql%2520RDS%2520settings
(due to this discussion we left autovac on: https://www.mail-archive.com/user@manifoldcf.apache.org/msg03512.html)

CPU Usage of the Database is between 15% and 35%

Are we hitting bad auto optimiziations of the newer postgresql version?

Many thanks in advance,
Markus

Mime
View raw message