manifoldcf-user mailing list archives

From Karl Wright <daddy...@gmail.com>
Subject Re: Session-based authentication
Date Thu, 07 Jul 2016 11:17:43 GMT
Hi Konstantin,

There is an advanced Web Connector integration test, which currently
passes, that tests session login and cookie transmission.  I'll look over
the test to be sure it is complete, but if so you should really be looking
at your login sequence and verifying that the cookie set takes place in a
request that is part of the login sequence.

Thanks,
Karl
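
A minimal sketch of the header check described further down the thread: scan the HttpClient header log for a request `Cookie:` header. It assumes the log entries have been unwrapped to one per line and use the `http-outgoing-N >>` / `<<` form seen in the quoted log. The helper names `split_headers` and `cookie_header_sent` are hypothetical and not part of ManifoldCF or HttpClient.

```python
import re

# Header lines look like "... - http-outgoing-0 >> Host: wikisite".
# ">>" marks data sent by the client, "<<" data received from the server.
HEADER_LINE = re.compile(r"http-outgoing-\d+ (>>|<<) ([A-Za-z-]+): (.*)$")


def split_headers(log_lines):
    """Return (request_headers, response_headers) as lists of (name, value)."""
    sent, received = [], []
    for line in log_lines:
        match = HEADER_LINE.search(line)
        if match is None:
            continue  # request line, status line, or unrelated log output
        direction, name, value = match.groups()
        target = sent if direction == ">>" else received
        target.append((name, value.strip()))
    return sent, received


def cookie_header_sent(log_lines):
    """True if any outgoing header in the excerpt is a Cookie header."""
    sent, _ = split_headers(log_lines)
    return any(name.lower() == "cookie" for name, _ in sent)


# Excerpt taken from the quoted log, unwrapped by hand.
sample = [
    "DEBUG 2016-07-07 12:49:26,155 (Thread-1214) - http-outgoing-0 >> GET /sitemap.xml HTTP/1.1",
    "DEBUG 2016-07-07 12:49:26,155 (Thread-1214) - http-outgoing-0 >> Accept: */*",
    "DEBUG 2016-07-07 12:49:26,155 (Thread-1214) - http-outgoing-0 >> Host: wikisite",
    "DEBUG 2016-07-07 12:49:37,768 (Thread-1214) - http-outgoing-0 << HTTP/1.1 200 OK",
    "DEBUG 2016-07-07 12:49:37,768 (Thread-1214) - http-outgoing-0 << "
    "Set-Cookie: PHPSESSID=bk9487elppchvshc38c7pfnv01; path=/; HttpOnly",
]

print(cookie_header_sent(sample))
```

For this excerpt the check reports that no `Cookie:` header went out, while the response carries a fresh `Set-Cookie: PHPSESSID=...`, which matches what Konstantin observed.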


On Thu, Jul 7, 2016 at 6:58 AM, jetnet <jetnet@gmail.com> wrote:

> Thanks for the hint regarding the httpclient logging!
> It turns out the cookies do NOT get added to the request:
>
> DEBUG 2016-07-07 12:49:26,015 (Worker thread '4') - WEB: Get method
> for '/sitemap.xml'
> DEBUG 2016-07-07 12:49:26,015 (Worker thread '4') - WEB: Adding 2
> cookies for '/sitemap.xml'
> DEBUG 2016-07-07 12:49:26,015 (Worker thread '4') - WEB:  Cookie
> '[version: 0][name: PHPSESSID][value:
> 8jegbs2dqb6r9oc3mb4pt0q777][domain: wikisite][path: /][expiry: null]'
> added
> DEBUG 2016-07-07 12:49:26,015 (Worker thread '4') - WEB:  Cookie
> '[version: 0][name: authtoken][value:
> 920_636034784213249598_d2f40072be60b4de7bee72d74fc04400][domain:
> wikisite][path: /][expiry: Thu Jul 14 10:53:41 CEST 2016]' added
> DEBUG 2016-07-07 12:49:26,030 (Thread-1214) - CookieSpec selected: standard
> DEBUG 2016-07-07 12:49:26,093 (Thread-1214) - Auth cache not set in the
> context
> DEBUG 2016-07-07 12:49:26,093 (Thread-1214) - Connection request:
> [route: {}->http://wikisite:80][total kept alive: 0; route allocated:
> 0 of 1; total allocated: 0 of 20]
> DEBUG 2016-07-07 12:49:26,140 (Thread-1214) - Connection leased: [id:
> 0][route: {}->http://wikisite:80][total kept alive: 0; route
> allocated: 1 of 1; total allocated: 1 of 20]
> DEBUG 2016-07-07 12:49:26,140 (Thread-1214) - Opening connection
> {}->http://wikisite:80
> DEBUG 2016-07-07 12:49:26,155 (Thread-1214) - Connecting to
> wikisite/10.0.0.100:80
> DEBUG 2016-07-07 12:49:26,155 (Thread-1214) - Connection established
> 10.0.0.184:58501<->10.0.0.100:80
> DEBUG 2016-07-07 12:49:26,155 (Thread-1214) - http-outgoing-0: set
> socket timeout to 300000
> DEBUG 2016-07-07 12:49:26,155 (Thread-1214) - Executing request GET
> /sitemap.xml HTTP/1.1
> DEBUG 2016-07-07 12:49:26,155 (Thread-1214) - Target auth state:
> UNCHALLENGED
> DEBUG 2016-07-07 12:49:26,155 (Thread-1214) - Proxy auth state:
> UNCHALLENGED
> DEBUG 2016-07-07 12:49:26,155 (Thread-1214) - http-outgoing-0 >> GET
> /sitemap.xml HTTP/1.1
> DEBUG 2016-07-07 12:49:26,155 (Thread-1214) - http-outgoing-0 >>
> User-Agent: Mozilla/5.0 (ApacheManifoldCFWebCrawler;
> email@wikisite.com)
> DEBUG 2016-07-07 12:49:26,155 (Thread-1214) - http-outgoing-0 >> From:
> email@wikisite.com
> DEBUG 2016-07-07 12:49:26,155 (Thread-1214) - http-outgoing-0 >> Accept:
> */*
> DEBUG 2016-07-07 12:49:26,155 (Thread-1214) - http-outgoing-0 >>
> Accept-Encoding: gzip,deflate
> DEBUG 2016-07-07 12:49:26,155 (Thread-1214) - http-outgoing-0 >> Host:
> wikisite
> DEBUG 2016-07-07 12:49:26,155 (Thread-1214) - http-outgoing-0 >>
> Connection: Keep-Alive
> DEBUG 2016-07-07 12:49:37,768 (Thread-1214) - http-outgoing-0 << HTTP/1.1
> 200 OK
> DEBUG 2016-07-07 12:49:37,768 (Thread-1214) - http-outgoing-0 <<
> Content-Type: application/xml; charset=utf-8
> DEBUG 2016-07-07 12:49:37,768 (Thread-1214) - http-outgoing-0 <<
> Server: Microsoft-IIS/7.5
> DEBUG 2016-07-07 12:49:37,768 (Thread-1214) - http-outgoing-0 <<
> X-Powered-By: PHP/5.2.14
> DEBUG 2016-07-07 12:49:37,768 (Thread-1214) - http-outgoing-0 <<
> Set-Cookie: PHPSESSID=bk9487elppchvshc38c7pfnv01; path=/; HttpOnly
> DEBUG 2016-07-07 12:49:37,768 (Thread-1214) - http-outgoing-0 <<
> X-Powered-By: ASP.NET
> DEBUG 2016-07-07 12:49:37,768 (Thread-1214) - http-outgoing-0 << Date:
> Thu, 07 Jul 2016 10:49:38 GMT
> DEBUG 2016-07-07 12:49:37,768 (Thread-1214) - http-outgoing-0 <<
> Content-Length: 684207
>
>
> Jira ticket? :)
>
> Thanks,
> Konstantin
>
>
> 2016-07-07 12:37 GMT+02:00 Karl Wright <daddywri@gmail.com>:
> > It really does add cookies as stated.
> >
> > That doesn't mean, however, that the cookies being sent correspond to a
> > session that is correctly logged in.  There's no way to tell this from
> > the logs.
> >
> > You can possibly get more information about the back-and-forth by
> > enabling httpcomponents/httpclient wire logging.  Headers only should be
> > sufficient.  You should see the exact cookies and be able to verify that
> > the cookies sent are the ones that were returned.  You still won't be
> > able to tell if the login was successful or not.
> >
> > Karl
> >
> >
> >
> > On Thu, Jul 7, 2016 at 6:25 AM, jetnet <jetnet@gmail.com> wrote:
> >>
> >> OK, so does that mean I do not need the 3rd stage at all? The
> >> second stage (form authentication) records the cookies and redirects
> >> back:
> >>
> >> the second stage:
> >>
> >> DEBUG 2016-07-07 10:52:48,231 (Worker thread '79') - WEB: Post method
> >> for '/Special:UserLogin'
> >> DEBUG 2016-07-07 10:52:48,231 (Worker thread '79') - WEB: Post
> >> parameter name 'username' value 'someuser' for '/Special:UserLogin'
> >> DEBUG 2016-07-07 10:52:48,231 (Worker thread '79') - WEB: Post
> >> parameter name 'returntourl' value 'http://wikisite/sitemap.xml' for
> >> '/Special:UserLogin'
> >> DEBUG 2016-07-07 10:52:48,231 (Worker thread '79') - WEB: Post
> >> parameter name 'password' value 'XXXXXX' for '/Special:UserLogin'
> >> DEBUG 2016-07-07 10:52:48,231 (Worker thread '79') - WEB: Adding 2
> >> cookies for '/Special:UserLogin'
> >> DEBUG 2016-07-07 10:52:48,231 (Worker thread '79') - WEB:  Cookie
> >> '[version: 0][name: PHPSESSID][value:
> >> bughgf8fbjkkevk79ot4ef2vj1][domain: wikisite][path: /][expiry: null]'
> >> added
> >> DEBUG 2016-07-07 10:52:48,231 (Worker thread '79') - WEB:  Cookie
> >> '[version: 0][name: authtoken][value:
> >> 920_636034352097041592_136c71f2ac1fc2dd1ba72de805fcd1b5][domain:
> >> wikisite][path: /][expiry: Wed Jul 13 22:53:29 CEST 2016]' added
> >> DEBUG 2016-07-07 10:52:48,434 (Worker thread '79') - WEB: Retrieving
> >> cookies...
> >> DEBUG 2016-07-07 10:52:48,434 (Worker thread '79') - WEB:   Cookie
> >> '[version: 0][name: PHPSESSID][value:
> >> 589h3f20tjndhkc391nu5u0u51][domain: wikisite][path: /][expiry: null]'
> >> DEBUG 2016-07-07 10:52:48,434 (Worker thread '79') - WEB:   Cookie
> >> '[version: 0][name: authtoken][value:
> >> 920_636034783686256706_585415102d050458acfd91a9d1f223d5][domain:
> >> wikisite][path: /][expiry: Thu Jul 14 10:52:48 CEST 2016]'
> >>  INFO 2016-07-07 10:52:48,449 (Worker thread '79') - WEB: FETCH
> >> LOGIN|http://wikisite/Special:UserLogin|1467881568231+218|302|153|
> >> DEBUG 2016-07-07 10:52:48,449 (Worker thread '79') - WEB: Document
> >> 'http://wikisite/Special:UserLogin' did not match expected form, link,
> >> redirection, or content for sequence 'wikisite'
> >>
> >> So the last message means that nothing matches in the sequence anymore -
> >> logon end.
> >> The last two cookies are then used for the next fetch of the sitemap,
> >> but its content still matches the public pattern.
> >>
> >> Strange things happen... I just tried using the authtoken cookie from
> >> the log directly in the browser - and it gets authenticated without
> >> problems: I get the "private" content. But ManifoldCF does not...
> >> weird...
> >>
> >> DEBUG 2016-07-07 10:52:48,543 (Worker thread '79') - WEB: Adding 2
> >> cookies for '/sitemap.xml'
> >> DEBUG 2016-07-07 10:52:48,543 (Worker thread '79') - WEB:  Cookie
> >> '[version: 0][name: PHPSESSID][value:
> >> 589h3f20tjndhkc391nu5u0u51][domain: wikisite][path: /][expiry: null]'
> >> added
> >> DEBUG 2016-07-07 10:52:48,543 (Worker thread '79') - WEB:  Cookie
> >> '[version: 0][name: authtoken][value:
> >> 920_636034783686256706_585415102d050458acfd91a9d1f223d5][domain:
> >> wikisite][path: /][expiry: Thu Jul 14 10:52:48 CEST 2016]' added
> >>  INFO 2016-07-07 10:52:58,500 (Worker thread '79') - WEB: FETCH
> >> URL|http://wikisite/sitemap.xml|1467881568543+9957|200|684072|
> >>
> >> A size of 684072 means it is the public content.
> >>
> >> Does it **really** add the cookies to the request? :)
> >>
> >> Thanks!
> >> Konstantin
> >>
> >> 2016-07-07 11:44 GMT+02:00 Karl Wright <daddywri@gmail.com>:
> >> > "I thought, when the auth sequence is done
> >> > (exit login mode), the redirect to the original page happens
> >> > automatically (which is the case here, but somehow the content is
> >> > still "public")."
> >> >
> >> > That is correct BUT if the final redirection is what sets the cookies
> >> > THEN the cookies will only be recorded by the web connector if the
> >> > final redirection is part of the login sequence.
> >> >
> >> > Thanks,
> >> > Karl
> >> >
> >> >
> >> > On Thu, Jul 7, 2016 at 5:33 AM, jetnet <jetnet@gmail.com> wrote:
> >> >>
> >> >> hi Karl,
> >> >> thank you for the very prompt feedback!
> >> >>
> >> >> > 1) Have you made sure to include the redirection back to the
> >> >> > content?
> >> >> This is the step I don't quite understand - could you please clarify
> >> >> how that could be done? I thought, when the auth sequence is done
> >> >> (exit login mode), the redirect to the original page happens
> >> >> automatically (which is the case here, but somehow the content is
> >> >> still "public").
> >> >>
> >> >> > 2) your check for *entering* the login sequence is too broad and
> >> >> > fires again even though the private sitemap page is being returned.
> >> >> Totally agree, that's why the first step is to look into the content
> >> >> of the page, to check whether there is a pattern which appears in the
> >> >> public version ONLY.
> >> >> This is the only solution I can imagine so far, but any ideas are very
> >> >> welcome!
> >> >>
> >> >> The simple history shows basically the same - the process never
> >> >> leaves the login stage.
> >> >>
> >> >> If I remove the 3rd step, then I see that the login stage is over
> >> >> (logon end), but as the content of the sitemap.xml is still "public",
> >> >> the login process kicks in again.
> >> >>
> >> >> Thanks!
> >> >> Konstantin
> >> >>
> >> >> 2016-07-07 11:07 GMT+02:00 Karl Wright <daddywri@gmail.com>:
> >> >> > Hi Konstantin,
> >> >> >
> >> >> > There are two possibilities:
> >> >> >
> >> >> > (1) You have missed one stage when specifying the login sequence.
> >> >> > The cookies are getting set, but not during a step that's part of
> >> >> > the login sequence.  Have you made sure to include the redirection
> >> >> > back to the content?
> >> >> > (2) You really are logging in but your check for *entering* the
> >> >> > login sequence is too broad and fires again even though the private
> >> >> > sitemap page is being returned.
> >> >> >
> >> >> > You can also look at the simple history to get an idea of what MCF
> >> >> > is doing for your job for session handling.
> >> >> >
> >> >> > Thanks,
> >> >> > Karl
> >> >> >
> >> >> >
> >> >> > On Thu, Jul 7, 2016 at 4:35 AM, jetnet <jetnet@gmail.com> wrote:
> >> >> >>
> >> >> >> Hi All,
> >> >> >>
> >> >> >> I've been trying to set up a session-based auth sequence for a
> >> >> >> forked MediaWiki site (the Wiki connector does not work with this
> >> >> >> version), but somehow got stuck with the configuration.
> >> >> >> The idea is to index the site using its sitemap.xml with hops=1.
> >> >> >> The "public" version (user not logged in) of the sitemap.xml
> >> >> >> contains a different set of links than the "authenticated" one
> >> >> >> (user logged in).
> >> >> >> The current auth sequence looks like this (the job's seeding
> >> >> >> URL=http://wikisite/sitemap.xml):
> >> >> >>
> >> >> >> 1) the first call to the seeding URL should be redirected to the
> >> >> >> login page
> >> >> >> Login URL regexp: sitemap.xml
> >> >> >> Page type: content
> >> >> >> Identification regular expression: <some content from the "public"
> >> >> >> version>
> >> >> >> Override target URL: /Special:UserLogin
> >> >> >>
> >> >> >> 2) enter user's credentials on the login page
> >> >> >> Login URL regexp: Special:UserLogin
> >> >> >> Page type: form
> >> >> >> Override form parameters: username=someuser, password=******,
> >> >> >> returntourl=http://wikisite/sitemap.xml
> >> >> >>
> >> >> >> 3) the login page ***should*** redirect back to the seeding URL
> >> >> >> with the authorized content
> >> >> >> Login URL regexp: /Special:UserLogin
> >> >> >> Page type: redirection
> >> >> >> Identification regular expression: /sitemap.xml
> >> >> >>
> >> >> >> From the log file I can see that the first 2 steps work fine - the
> >> >> >> public content gets recognized, the form data gets sent, the
> >> >> >> session's cookies get set. But the 3rd step returns the "public"
> >> >> >> version of the sitemap.xml again, and the login process gets stuck
> >> >> >> in a loop.
> >> >> >> Am I on the right track or did I miss something?
> >> >> >>
> >> >> >> here is the log for the 3rd step:
> >> >> >>
> >> >> >>  INFO 2016-07-06 22:52:27,285 (Worker thread '43') - WEB: FETCH
> >> >> >> LOGIN|http://wikisite/Special:UserLogin|1467838347082+203|302|153|
> >> >> >> DEBUG 2016-07-06 22:52:27,285 (Worker thread '43') - WEB: Tried to
> >> >> >> match raw url 'http://wikisite/sitemap.xml'
> >> >> >> DEBUG 2016-07-06 22:52:27,285 (Worker thread '43') - WEB: Tried to
> >> >> >> match cooked url 'http://wikisite/sitemap.xml'
> >> >> >> DEBUG 2016-07-06 22:52:27,285 (Worker thread '43') - WEB:
> >> >> >> Redirection link lookup matched 'http://wikisite/sitemap.xml'
> >> >> >> DEBUG 2016-07-06 22:52:27,285 (Worker thread '43') - WEB: Document
> >> >> >> 'http://wikisite/Special:UserLogin' matches preferred redirection,
> >> >> >> so determined to be login page for sequence 'wikisite'
> >> >> >> DEBUG 2016-07-06 22:52:27,394 (Worker thread '43') - WEB: Waiting
> >> >> >> for an HttpClient object
> >> >> >> DEBUG 2016-07-06 22:52:27,394 (Worker thread '43') - WEB: For
> >> >> >> http://wikisite/sitemap.xml, setting virtual host to wikisite
> >> >> >> DEBUG 2016-07-06 22:52:27,394 (Worker thread '43') - WEB: Got an
> >> >> >> HttpClient object after 0 ms.
> >> >> >> DEBUG 2016-07-06 22:52:27,394 (Worker thread '43') - WEB: Get
> >> >> >> method for '/sitemap.xml'
> >> >> >> DEBUG 2016-07-06 22:52:27,394 (Worker thread '43') - WEB: Adding 2
> >> >> >> cookies for '/sitemap.xml'
> >> >> >> DEBUG 2016-07-06 22:52:27,394 (Worker thread '43') - WEB:  Cookie
> >> >> >> '[version: 0][name: PHPSESSID][value:
> >> >> >> 1vnhgi0f84dc9pi6eaoj0nau45][domain: wikisite][path: /][expiry:
> >> >> >> null]' added
> >> >> >> DEBUG 2016-07-06 22:52:27,394 (Worker thread '43') - WEB:  Cookie
> >> >> >> '[version: 0][name: authtoken][value:
> >> >> >> 920_636034351472613318_616a5fd45ce4d5fed6c5318d73b38070][domain:
> >> >> >> wikisite][path: /][expiry: Wed Jul 13 22:52:27 CEST 2016]' added
> >> >> >> DEBUG 2016-07-06 22:52:35,660 (Worker thread '43') - WEB:
> >> >> >> Retrieving cookies...
> >> >> >> DEBUG 2016-07-06 22:52:35,660 (Worker thread '43') - WEB:   Cookie
> >> >> >> '[version: 0][name: PHPSESSID][value:
> >> >> >> vqfpr88pqa6d62nl6h4lp03nu1][domain: wikisite][path: /][expiry:
> >> >> >> null]'
> >> >> >> DEBUG 2016-07-06 22:52:35,660 (Worker thread '43') - WEB:   Cookie
> >> >> >> '[version: 0][name: authtoken][value:
> >> >> >> 920_636034351472613318_616a5fd45ce4d5fed6c5318d73b38070][domain:
> >> >> >> wikisite][path: /][expiry: Wed Jul 13 22:52:27 CEST 2016]'
> >> >> >>  INFO 2016-07-06 22:52:37,004 (Worker thread '43') - WEB: FETCH
> >> >> >> LOGIN|http://wikisite/sitemap.xml|1467838347394+9610|200|683773|
> >> >> >> DEBUG 2016-07-06 22:52:37,004 (Worker thread '43') - WEB: Document
> >> >> >> 'http://wikisite/sitemap.xml' is text, with encoding 'utf-8'; link
> >> >> >> extraction starting
> >> >> >> DEBUG 2016-07-06 22:52:37,019 (Worker thread '43') - WEB: Document
> >> >> >> 'http://wikisite/sitemap.xml' matches content, so determined to be
> >> >> >> login page for sequence 'wikisite'
> >> >> >>
> >> >> >>
> >> >> >> Thank you!
> >> >> >> regards, Konstantin
> >> >> >
> >> >> >
> >> >
> >> >
> >
> >
>
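
The check suggested in the quoted thread, that the cookies sent are the ones that were returned, can also be sketched against the connector's own DEBUG lines. This is a rough sketch that assumes each "Cookie '[...]'" entry has been unwrapped onto a single line. The helper names `cookies_in` and `mismatches` are hypothetical; the cookie values are copied from the form-login stage and the following sitemap fetch quoted above.

```python
import re

# Pulls the name and value out of entries such as
# "[version: 0][name: PHPSESSID][value: abc][domain: wikisite]..."
COOKIE_FIELDS = re.compile(r"\[name: ([^\]]+)\]\[value: ([^\]]+)\]")


def cookies_in(log_text):
    """Map cookie name to value for every cookie entry found in log_text."""
    return dict(COOKIE_FIELDS.findall(log_text))


def mismatches(returned, sent_next):
    """Names whose value differs (or is missing) between the two cookie sets."""
    names = sorted(set(returned) | set(sent_next))
    return [n for n in names if returned.get(n) != sent_next.get(n)]


# "Retrieving cookies..." entries after the POST to /Special:UserLogin.
returned_by_login = cookies_in(
    "WEB:   Cookie '[version: 0][name: PHPSESSID][value: 589h3f20tjndhkc391nu5u0u51]"
    "[domain: wikisite][path: /][expiry: null]'\n"
    "WEB:   Cookie '[version: 0][name: authtoken]"
    "[value: 920_636034783686256706_585415102d050458acfd91a9d1f223d5]"
    "[domain: wikisite][path: /][expiry: Thu Jul 14 10:52:48 CEST 2016]'"
)

# "Adding 2 cookies for '/sitemap.xml'" entries for the next fetch.
sent_for_sitemap = cookies_in(
    "WEB:  Cookie '[version: 0][name: PHPSESSID][value: 589h3f20tjndhkc391nu5u0u51]"
    "[domain: wikisite][path: /][expiry: null]' added\n"
    "WEB:  Cookie '[version: 0][name: authtoken]"
    "[value: 920_636034783686256706_585415102d050458acfd91a9d1f223d5]"
    "[domain: wikisite][path: /][expiry: Thu Jul 14 10:52:48 CEST 2016]' added"
)

print(mismatches(returned_by_login, sent_for_sitemap))  # []
```

An empty list only shows that the connector intended to send the cookies it received. Whether they actually reach the server still has to be confirmed from the HttpClient header log.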
