manifoldcf-user mailing list archives

From jetnet <jet...@gmail.com>
Subject Session-based authentication
Date Thu, 07 Jul 2016 08:35:12 GMT
Hi All,

I've been trying to set up a session-based auth sequence for a forked
MediaWiki site (the Wiki connector does not work with this version), but
I got stuck on the configuration.
The idea is to index the site using its sitemap.xml with hops=1. The
"public" version (user not logged in) of the sitemap.xml contains a
different set of links than the "authenticated" one (user logged in).
The current auth sequence looks like this (the job's seeding
URL=http://wikisite/sitemap.xml):

1) the first call to the seeding URL should be redirected to the login page
Login URL regexp: sitemap.xml
Page type: content
Identification regular expression: <some content from the "public" version>
Override target URL: /Special:UserLogin

2) enter user's credentials on the login page
Login URL regexp: Special:UserLogin
Page type: form
Override form parameters: username=someuser, password=******,
returntourl=http://wikisite/sitemap.xml

3) the login page ***should*** redirect back to the seeding URL with
the authorized content
Login URL regexp: /Special:UserLogin
Page type: redirection
Identification regular expression: /sitemap.xml
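
For comparison, the same three steps can be replayed outside ManifoldCF to
see whether the site itself returns the authenticated sitemap once the
cookies are set. This is only a rough sketch using Python's standard
library. The host name and form field names are the ones from the sequence
above; the function names are made up for illustration, and a real login
form may also require hidden fields that are not handled here.

```python
import http.cookiejar
import urllib.parse
import urllib.request

BASE = "http://wikisite"  # host name as used in this post
SITEMAP = BASE + "/sitemap.xml"
LOGIN = BASE + "/Special:UserLogin"


def build_login_request(username, password):
    """Build the POST for step 2 from the overridden form parameters."""
    form = {
        "username": username,
        "password": password,
        "returntourl": SITEMAP,
    }
    data = urllib.parse.urlencode(form).encode("ascii")
    return urllib.request.Request(LOGIN, data=data, method="POST")


def fetch_sitemap_after_login(username, password):
    """Run steps 1-3 with one shared cookie jar; return the final sitemap body."""
    jar = http.cookiejar.CookieJar()
    opener = urllib.request.build_opener(
        urllib.request.HTTPCookieProcessor(jar)
    )
    # Step 1: fetch the public sitemap (this may already set a session cookie).
    opener.open(SITEMAP).read()
    # Step 2: post the credentials; urllib follows the redirect on its own.
    opener.open(build_login_request(username, password)).read()
    # Step 3: fetch the sitemap again with whatever cookies the jar now holds.
    with opener.open(SITEMAP) as resp:
        return resp.read().decode("utf-8", errors="replace")
```

If fetch_sitemap_after_login() also returns the "public" links, the problem
is more likely in the form parameters than in the ManifoldCF sequence.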

From the log file I can see that the first two steps work fine: the
public content is recognized, the form data is sent, and the session
cookies are set. But the 3rd step returns the "public" version of the
sitemap.xml again, and the login process gets stuck in a loop.
Am I on the right track, or did I miss something?

Here is the log for the 3rd step:

 INFO 2016-07-06 22:52:27,285 (Worker thread '43') - WEB: FETCH
LOGIN|http://wikisite/Special:UserLogin|1467838347082+203|302|153|
DEBUG 2016-07-06 22:52:27,285 (Worker thread '43') - WEB: Tried to
match raw url 'http://wikisite/sitemap.xml'
DEBUG 2016-07-06 22:52:27,285 (Worker thread '43') - WEB: Tried to
match cooked url 'http://wikisite/sitemap.xml'
DEBUG 2016-07-06 22:52:27,285 (Worker thread '43') - WEB: Redirection
link lookup matched 'http://wikisite/sitemap.xml'
DEBUG 2016-07-06 22:52:27,285 (Worker thread '43') - WEB: Document
'http://wikisite/Special:UserLogin' matches preferred redirection, so
determined to be login page for sequence 'wikisite'
DEBUG 2016-07-06 22:52:27,394 (Worker thread '43') - WEB: Waiting for
an HttpClient object
DEBUG 2016-07-06 22:52:27,394 (Worker thread '43') - WEB: For
http://wikisite/sitemap.xml, setting virtual host to wikisite
DEBUG 2016-07-06 22:52:27,394 (Worker thread '43') - WEB: Got an
HttpClient object after 0 ms.
DEBUG 2016-07-06 22:52:27,394 (Worker thread '43') - WEB: Get method
for '/sitemap.xml'
DEBUG 2016-07-06 22:52:27,394 (Worker thread '43') - WEB: Adding 2
cookies for '/sitemap.xml'
DEBUG 2016-07-06 22:52:27,394 (Worker thread '43') - WEB:  Cookie
'[version: 0][name: PHPSESSID][value:
1vnhgi0f84dc9pi6eaoj0nau45][domain: wikisite][path: /][expiry: null]'
added
DEBUG 2016-07-06 22:52:27,394 (Worker thread '43') - WEB:  Cookie
'[version: 0][name: authtoken][value:
920_636034351472613318_616a5fd45ce4d5fed6c5318d73b38070][domain:
wikisite][path: /][expiry: Wed Jul 13 22:52:27 CEST 2016]' added
DEBUG 2016-07-06 22:52:35,660 (Worker thread '43') - WEB: Retrieving cookies...
DEBUG 2016-07-06 22:52:35,660 (Worker thread '43') - WEB:   Cookie
'[version: 0][name: PHPSESSID][value:
vqfpr88pqa6d62nl6h4lp03nu1][domain: wikisite][path: /][expiry: null]'
DEBUG 2016-07-06 22:52:35,660 (Worker thread '43') - WEB:   Cookie
'[version: 0][name: authtoken][value:
920_636034351472613318_616a5fd45ce4d5fed6c5318d73b38070][domain:
wikisite][path: /][expiry: Wed Jul 13 22:52:27 CEST 2016]'
 INFO 2016-07-06 22:52:37,004 (Worker thread '43') - WEB: FETCH
LOGIN|http://wikisite/sitemap.xml|1467838347394+9610|200|683773|
DEBUG 2016-07-06 22:52:37,004 (Worker thread '43') - WEB: Document
'http://wikisite/sitemap.xml' is text, with encoding 'utf-8'; link
extraction starting
DEBUG 2016-07-06 22:52:37,019 (Worker thread '43') - WEB: Document
'http://wikisite/sitemap.xml' matches content, so determined to be
login page for sequence 'wikisite'
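
One detail in the log above that may or may not be related: the PHPSESSID
value sent with the request differs from the one retrieved afterwards,
while authtoken stays the same. A small Python sketch that compares the two
sets of cookie entries (copied from the log with the mail line wraps
removed; the helper names are made up for illustration):

```python
import re

# Cookie entries added to the request for '/sitemap.xml'.
SENT = [
    "[version: 0][name: PHPSESSID][value: 1vnhgi0f84dc9pi6eaoj0nau45]"
    "[domain: wikisite][path: /][expiry: null]",
    "[version: 0][name: authtoken]"
    "[value: 920_636034351472613318_616a5fd45ce4d5fed6c5318d73b38070]"
    "[domain: wikisite][path: /][expiry: Wed Jul 13 22:52:27 CEST 2016]",
]

# Cookie entries logged after "Retrieving cookies...".
RECEIVED = [
    "[version: 0][name: PHPSESSID][value: vqfpr88pqa6d62nl6h4lp03nu1]"
    "[domain: wikisite][path: /][expiry: null]",
    "[version: 0][name: authtoken]"
    "[value: 920_636034351472613318_616a5fd45ce4d5fed6c5318d73b38070]"
    "[domain: wikisite][path: /][expiry: Wed Jul 13 22:52:27 CEST 2016]",
]

FIELD = re.compile(r"\[name: ([^\]]+)\]\[value: ([^\]]+)\]")


def cookie_values(entries):
    """Map cookie name to value for a list of log cookie entries."""
    values = {}
    for entry in entries:
        match = FIELD.search(entry)
        if match:
            values[match.group(1)] = match.group(2)
    return values


def changed_cookies(sent, received):
    """Return the names of cookies whose value differs between the two sets."""
    before = cookie_values(sent)
    after = cookie_values(received)
    return sorted(
        name for name in before
        if name in after and before[name] != after[name]
    )


print(changed_cookies(SENT, RECEIVED))  # ['PHPSESSID']
```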


Thank you!
regards, Konstantin
