manifoldcf-user mailing list archives

From Karl Wright <daddy...@gmail.com>
Subject Re: PostgreSQL version to support MCF v2.10
Date Tue, 04 Sep 2018 17:22:29 GMT
Thanks for the update.
Lower-casing the ID would be fine, except that some connectors care about
case.  The web connector is one of them: it's up to the web service to
decide whether case matters, so the web connector does not treat URLs that
differ only in case as the same document.  Other connectors will likely
care as well.  So I don't think lower-casing the document ID is a smart
thing to do.

You could add lower-casing as a configuration option to the web connector,
if that's what you are using, or to whatever other connector constructs the
ID.
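
To illustrate what I mean by making it optional, here is a minimal sketch.
It is in Python for brevity rather than connector Java, and the function
name and both flags are made up for illustration; nothing like this exists
in MCF today.  Both behaviours default to off, so connectors that care
about case see the ID unchanged.

```python
import re
from urllib.parse import urlsplit, urlunsplit


def normalize_document_id(doc_id, lowercase=False, collapse_slashes=False):
    """Return doc_id with optional, opt-in normalization applied.

    Hypothetical helper: both flags default to False so that the ID is
    passed through untouched unless the connector is configured otherwise.
    """
    if not lowercase and not collapse_slashes:
        return doc_id

    parts = urlsplit(doc_id)
    path = parts.path
    if collapse_slashes:
        # Collapse runs of "/" in the path only; the "//" after the
        # scheme belongs to the authority part and is left alone.
        path = re.sub(r"/{2,}", "/", path)

    normalized = urlunsplit(
        (parts.scheme, parts.netloc, path, parts.query, parts.fragment)
    )
    if lowercase:
        normalized = normalized.lower()
    return normalized


print(normalize_document_id("http://Example.com//Docs//Page.ASPX",
                            lowercase=True, collapse_slashes=True))
```

With both flags left at their defaults the call returns its input as is,
which is the behaviour the web connector needs.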

Karl



On Tue, Sep 4, 2018 at 12:04 PM Steph van Schalkwyk <steph@remcam.net>
wrote:

> Thanks Karl.
>
> I'll look into that.
>
> Another note:
> Regarding the ES connector - I have made three additions to it and should
> probably diff them for inclusion after approval:
> 1. Lowercased the _id (the doc URI).
> 2. Removed doubled "/" (i.e. "//") in the _id (I have sloppy sources,
> particularly IIS...)
> 3. Added a "url" metadata field to the ES connector (as ES 6.x does not
> allow access to _id in the schema anymore, so no copy_field etc. from _id).
> Hence "url".
>
> Regards,
> Steph
>
>
>
>
> *Steph van Schalkwyk*
> Principal, Remcam Search Engines
> +1.314.452.2896    steph@remcam.net   http://remcam.net
> Skype: svanschalkwyk
> http://linkedin.com/in/vanschalkwyk
>
> On Tue, Sep 4, 2018 at 10:50 AM, Karl Wright <daddywri@gmail.com> wrote:
>
>> Hi Steph, I suspect that Jetty is leaking some resource, and we may need
>> to upgrade it.
>>
>> Karl
>>
>>
>> On Tue, Sep 4, 2018 at 11:26 AM Steph van Schalkwyk <steph@remcam.net>
>> wrote:
>>
>>> Olivier
>>> By all means.
>>> The only issue I have seen (totally unrelated) is with Jetty, which has
>>> to be restarted about once a week. Still trying to find the issue.
>>> I may be overly sensitive, but I suspect MCF 2.10 with Postgres 10 may be
>>> a bit slower. I have no empirical evidence at the moment as I'm still
>>> delivering the project to UAT. Will keep you posted.
>>> Regards,
>>> Steph
>>>
>>>
>>>
>>> *Steph van Schalkwyk*
>>> Principal, Remcam Search Engines
>>> +1.314.452.2896    steph@remcam.net   http://remcam.net
>>> Skype: svanschalkwyk
>>> http://linkedin.com/in/vanschalkwyk
>>>
>>> On Tue, Sep 4, 2018 at 9:59 AM, Olivier Tavard <
>>> olivier.tavard@francelabs.com> wrote:
>>>
>>>> Hello,
>>>>
>>>> Thanks a lot for sharing your PostgreSQL configuration (sorry for the
>>>> late answer). I will test it soon.
>>>>
>>>> Best regards,
>>>>
>>>>
>>>> Olivier TAVARD
>>>>
>>>>
>>>> On 23 August 2018 at 19:20, Steph van Schalkwyk <steph@remcam.net>
>>>> wrote:
>>>>
>>>>
>>>>
>>>> These are the rpm installs:
>>>> - file:///tmp/postgres10/postgresql10-libs-10.4-1PGDG.rhel7.x86_64.rpm
>>>> - file:///tmp/postgres10/postgresql10-10.4-1PGDG.rhel7.x86_64.rpm
>>>> - file:///tmp/postgres10/postgresql10-contrib-10.4-1PGDG.rhel7.x86_64.rpm
>>>> - file:///tmp/postgres10/postgresql10-devel-10.4-1PGDG.rhel7.x86_64.rpm
>>>> - file:///tmp/postgres10/postgresql10-server-10.4-1PGDG.rhel7.x86_64.rpm
>>>>
>>>> postgresql_version: 10
>>>> postgresql_data_dir: /var/lib/pgsql/10/data
>>>> postgresql_bin_path: /usr/pgsql-10/bin
>>>> postgresql_config_path: /var/lib/pgsql/10/data
>>>> postgresql_daemon: postgresql-10.service
>>>> postgresql_packages:
>>>> - postgresql10-libs
>>>> - postgresql10
>>>> - postgresql10-server
>>>> - postgresql10-contrib
>>>> # - postgresql10-devel
>>>>
>>>> postgresql_hba_entries:
>>>>   - { type: local, database: all, user: postgres, auth_method: peer }
>>>>   - { type: local, database: all, user: all, auth_method: peer }
>>>>   - { type: host, database: all, user: all, address: '127.0.0.1/32', auth_method: md5 }
>>>>   - { type: host, database: all, user: all, address: '::1/128', auth_method: md5 }
>>>>   - { type: host, database: all, user: all, address: '0.0.0.0/0', auth_method: md5 }
>>>>   - { type: host, database: all, user: all, address: '::0/0', auth_method: md5 }
>>>>
>>>> postgresql_global_config_options:
>>>>   - option: unix_socket_directories
>>>>     value: '{{ postgresql_unix_socket_directories | join(",") }}'
>>>>
>>>>   - option: standard_conforming_strings
>>>>     value: 'on'
>>>>
>>>>   - option: shared_buffers
>>>>     value: '1024MB'
>>>>
>>>>   # max_wal_size = (3 * checkpoint_segments) * 16MB
>>>>   # checkpoint_segments=300
>>>>   - option: max_wal_size
>>>>     value: '14400MB'
>>>>
>>>>   - option: min_wal_size
>>>>     value: '80MB'
>>>>
>>>>   - option: maintenance_work_mem
>>>>     value: '2MB'
>>>>
>>>>   - option: listen_addresses
>>>>     value: '*'
>>>>
>>>>   - option: max_connections
>>>>     value: '400'
>>>>
>>>>   - option: checkpoint_timeout
>>>>     value: '900'
>>>>
>>>>   - option: datestyle
>>>>     value: "iso, mdy"
>>>>
>>>>   - option: autovacuum
>>>>     value: 'off'
>>>>
>>>> # vacuum all databases every night (full vacuum on Sunday night, lazy
>>>> # vacuum every night)
>>>> - name: add postgresql cron lazy vacuum
>>>>   cron:
>>>>     name: lazy_vacuum
>>>>     hour: 8
>>>>     minute: 0
>>>>     job: "su - postgres -c 'vacuumdb --all --analyze --quiet'"
>>>> - name: add postgresql cron full vacuum
>>>>   cron:
>>>>     name: full_vacuum
>>>>     weekday: 0
>>>>     hour: 10
>>>>     minute: 0
>>>>     job: "su - postgres -c 'vacuumdb --all --full --analyze --quiet'"
>>>> # re-index all databases once a week
>>>> - name: add postgresql cron reindex
>>>>   cron:
>>>>     name: reindex
>>>>     weekday: 0
>>>>     hour: 12
>>>>     minute: 0
>>>>     job: "su - postgres -c 'psql -t -c \"select datname from pg_database order by datname;\" | xargs -n 1 -I\"{}\" -- psql -U postgres {} -c \"reindex database {};\"' "
>>>>
>>>>
>>>> This is how I run 2.10.
>>>> Been running fine for some weeks without user intervention.
>>>> @Karl: Any comments please?
>>>> Steph
>>>>
>>>>
>>>>
>>>>
>>>
>
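
For anyone checking the max_wal_size value in the quoted configuration: it
follows from the checkpoint_segments conversion given in the comment next
to it.  A small sketch of that arithmetic, assuming the default 16MB WAL
segment size (the function name is only for illustration):

```python
WAL_SEGMENT_MB = 16  # assumed default WAL segment size, in MB


def max_wal_size_mb(checkpoint_segments, segment_mb=WAL_SEGMENT_MB):
    """Convert an old checkpoint_segments setting to max_wal_size in MB.

    Uses the formula from the config comment:
    max_wal_size = (3 * checkpoint_segments) * 16MB
    """
    return 3 * checkpoint_segments * segment_mb


# checkpoint_segments=300, as in the quoted configuration
print(f"{max_wal_size_mb(300)}MB")  # 14400MB
```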
