manifoldcf-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Karl Wright <daddy...@gmail.com>
Subject Re: Amazon RDS for PostgreSQL Support
Date Sat, 16 Dec 2017 16:40:42 GMT
Yes, that looks right.

Karl


On Sat, Dec 16, 2017 at 10:21 AM, Markus Schuch <markus_schuch@web.de>
wrote:

> would this be the right place?
> https://cwiki.apache.org/confluence/display/CONNECTORS/FAQ
>
> Am 16.12.2017 um 15:08 schrieb Karl Wright:
> > We should write up your findings in a FAQ if you find MySQL to perform
> > better than postgresql.
> >
> > My guess is that what you are seeing is less "bad plan", more
> > "contention and backoff", but that would take some effort to tease out.
> >
> > Karl
> >
> > On Fri, Dec 15, 2017 at 3:06 PM, Karl Wright <daddywri@gmail.com
> > <mailto:daddywri@gmail.com>> wrote:
> >
> >     Interesting.  Thanks for the update.
> >     Karl
> >
> >     On Fri, Dec 15, 2017 at 2:31 PM, Markus Schuch
> >     <markus.schuch@deutschebahn.com
> >     <mailto:markus.schuch@deutschebahn.com>> wrote:
> >
> >         We were no able to resolve the performance issue with the
> >         carrydown table.____
> >
> >         __ __
> >
> >         We switched to RDS Aurora MySQL, which performs similar to the
> >         MariaDB database in our on premise datacenter. No problems with
> >         the carrydown table queries of the sharepoint connector.____
> >
> >         __ __
> >
> >         Regards____
> >
> >         Markus____
> >
> >         __ __
> >
> >         *Von:* Karl Wright [mailto:daddywri@gmail.com
> >         <mailto:daddywri@gmail.com>]
> >         *Gesendet:* Donnerstag, 30. November 2017 12:23
> >
> >
> >         *An:* user@manifoldcf.apache.org <mailto:user@manifoldcf.
> apache.org>
> >         *Betreff:* Re: Amazon RDS for PostgreSQL Support____
> >
> >         __ __
> >
> >         Typically Jobs table is short and sequential scans are faster
> >         than index joins.  Postgres optimizes for that.____
> >
> >         __ __
> >
> >         No, the plans look fine.  Another reason for the long-running
> >         queries might well be contention and locking -- many threads
> >         will be trying to do similar things at the same time.  You will
> >         note that multiple records get updated in one query; this is
> >         usually helpful but when each update is expensive you could wind
> >         up with locking causing delays.____
> >
> >         __ __
> >
> >         Karl____
> >
> >         __ __
> >
> >         __ __
> >
> >         On Thu, Nov 30, 2017 at 2:33 AM, Markus Schuch
> >         <markus.schuch@deutschebahn.com
> >         <mailto:markus.schuch@deutschebahn.com>> wrote:____
> >
> >             Hi Karl,____
> >
> >              ____
> >
> >             we disabled autovacuum____
> >
> >             The stats table show there was no autovac since then.____
> >
> >              ____
> >
> >             The long running queries still occur.____
> >
> >             There are no other apps using the database and no other jobs
> >             running.____
> >
> >              ____
> >
> >             But there is another long running query to the jobs table
> >             between the carrydown queries.____
> >
> >             This query seem to happen at the same time with the long
> >             running carrydown query.____
> >
> >              ____
> >
> >             The plan output says “Sec Scan on jobs…”. ____
> >
> >              ____
> >
> >             2017-11-30 08:16:16,008 WARN  [Finisher thread]
> >             org.apache.manifoldcf.db: Found a long-running query (169057
> >             ms): [SELECT id FROM jobs WHERE  (status=? OR status=? OR
> >             status=?) FOR UPDATE]____
> >
> >             2017-11-30 08:16:16,009 WARN  [Finisher thread]
> >             org.apache.manifoldcf.db:   Parameter 0: 'A'____
> >
> >             2017-11-30 08:16:16,009 WARN  [Finisher thread]
> >             org.apache.manifoldcf.db:   Parameter 1: 'W'____
> >
> >             2017-11-30 08:16:16,009 WARN  [Finisher thread]
> >             org.apache.manifoldcf.db:   Parameter 2: 'R'____
> >
> >             2017-11-30 08:16:16,011 WARN  [Finisher thread]
> >             org.apache.manifoldcf.db:  Plan: LockRows  (cost=0.00..4.43
> >             rows=3 width=14) (actual time=0.022..0.024 rows=1
> loops=1)____
> >
> >             2017-11-30 08:16:16,011 WARN  [Finisher thread]
> >             org.apache.manifoldcf.db:  Plan:   ->  Seq Scan on jobs
> >             (cost=0.00..4.40 rows=3 width=14) (actual time=0.021..0.022
> >             rows=1 loops=1)____
> >
> >             2017-11-30 08:16:16,011 WARN  [Finisher thread]
> >             org.apache.manifoldcf.db:  Plan:         Filter: ((status =
> >             'A'::bpchar) OR (status = 'W'::bpchar) OR (status =
> >             'R'::bpchar))____
> >
> >             2017-11-30 08:16:16,011 WARN  [Finisher thread]
> >             org.apache.manifoldcf.db:  Plan:         Rows Removed by
> >             Filter: 22____
> >
> >             2017-11-30 08:16:16,011 WARN  [Finisher thread]
> >             org.apache.manifoldcf.db:  Plan: Planning time: 0.093 ms____
> >
> >             2017-11-30 08:16:16,011 WARN  [Finisher thread]
> >             org.apache.manifoldcf.db:  Plan: Execution time: 0.041 ms____
> >
> >             2017-11-30 08:16:16,011 WARN  [Finisher thread]
> >             org.apache.manifoldcf.db:____
> >
> >              ____
> >
> >             Jobs table indices:____
> >
> >             public    jobs       jobs_pkey
> >             CREATE UNIQUE INDEX jobs_pkey ON jobs USING btree (id)____
> >
> >             public    jobs
> >             i1511464305264                               CREATE INDEX
> >             i1511464305264 ON jobs USING btree (status, id, priority)____
> >
> >             public    jobs
> >             i1511464305263                               CREATE INDEX
> >             i1511464305263 ON jobs USING btree (status, processid)____
> >
> >             public    jobs
> >             i1511464305262                               CREATE INDEX
> >             i1511464305262 ON jobs USING btree (connectionname)____
> >
> >             public    jobs
> >             i1511464305261                               CREATE INDEX
> >             i1511464305261 ON jobs USING btree (failtime)____
> >
> >              ____
> >
> >             Shouldn’t be i1511464305264 be used?____
> >
> >              ____
> >
> >             Many thanks in advance____
> >
> >             Markus____
> >
> >              ____
> >
> >             *Von:* Karl Wright [mailto:daddywri@gmail.com
> >             <mailto:daddywri@gmail.com>]
> >             *Gesendet:* Mittwoch, 29. November 2017 23:28
> >             *An:* user@manifoldcf.apache.org
> >             <mailto:user@manifoldcf.apache.org>
> >             *Betreff:* Re: Amazon RDS for PostgreSQL Support____
> >
> >              ____
> >
> >             The plans look good for the carrydown execution, so I have
> >             to conclude that the long-running queries are due to other
> >             considerations -- perhaps concurrent vacuuming, perhaps
> >             other queries pounding the database.____
> >
> >              ____
> >
> >             Sharepoint is especially hard on the carrydown table -- it
> >             gets huge.____
> >
> >              ____
> >
> >             Karl____
> >
> >              ____
> >
> >              ____
> >
> >             On Wed, Nov 29, 2017 at 4:46 PM, Markus Schuch
> >             <markus.schuch@deutschebahn.com
> >             <mailto:markus.schuch@deutschebahn.com>> wrote:____
> >
> >                 Hi,____
> >
> >                  ____
> >
> >                 since nobody responded, we started to experiment.____
> >
> >                  ____
> >
> >                 Setup:____
> >
> >                 AWS RDS Postgres 9.6.3____
> >
> >                 db.m4.xlarge (4 cores, 16 GB RAM)____
> >
> >                 provisioned iops 4000____
> >
> >                  ____
> >
> >                 Indexing Performance for most jobs is pretty good,
> >                 except a for a bigger sharepoint crawl (~200.000
> docs).____
> >
> >                  ____
> >
> >                 We are seeing a lot of long running queries for the
> >                 tables carrydown here.____
> >
> >                  ____
> >
> >                 Logfile:____
> >
> >                 https://gist.githubusercontent.com/schuch/
> b553dc3f825e0c640a52ce3f530ec3e0/raw/34e493ef520d2272b25f7d9d8acc3f
> 8417550c17/gistfile1.txt
> >                 <https://gist.githubusercontent.com/schuch/
> b553dc3f825e0c640a52ce3f530ec3e0/raw/34e493ef520d2272b25f7d9d8acc3f
> 8417550c17/gistfile1.txt>____
> >
> >                  ____
> >
> >                 Settings:____
> >
> >                 https://gist.githubusercontent.com/schuch/
> 44fcbb312df37ab89a910024a56ff835/raw/eba6ff9bc82026f181d5687584e9dd
> 99cdffb242/postgresql%2520RDS%2520settings
> >                 <https://gist.githubusercontent.com/schuch/
> 44fcbb312df37ab89a910024a56ff835/raw/eba6ff9bc82026f181d5687584e9dd
> 99cdffb242/postgresql%2520RDS%2520settings>____
> >
> >                 (due to this discussion we left autovac on:
> >                 https://www.mail-archive.com/user@manifoldcf.apache.org/
> msg03512.html
> >                 <https://www.mail-archive.com/
> user@manifoldcf.apache.org/msg03512.html>)____
> >
> >                  ____
> >
> >                 CPU Usage of the Database is between 15% and 35%____
> >
> >                  ____
> >
> >                 Are we hitting bad auto optimiziations of the newer
> >                 postgresql version?____
> >
> >                  ____
> >
> >                 Many thanks in advance,____
> >
> >                 Markus____
> >
> >              ____
> >
> >         __ __
> >
> >
> >
>

Mime
View raw message