manifoldcf-user mailing list archives

From Karl Wright <daddy...@gmail.com>
Subject Re: Question about WebcrawlerConnector
Date Wed, 09 Apr 2014 02:52:18 GMT
Hi Hiroshi,

Are both cookies being set at the same time?

The ManifoldCF web connector records *all* the cookies that have been set
at the time the login sequence ends.  So there are two possibilities:

(1) You did not specify all of the login sequence.  You may have missed,
for instance, a last redirection, which sets the second cookie.
(2) There is some problem with HttpClient, or with how we configure it,
that prevents it from accepting one of the cookies.  HttpClient has many
different cookie policies; we may need to change the one we use.

If both cookies are set at the same time on the same response, then we know
that the problem is not (1).  So please let me know.
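
To make the domain question concrete, here is a rough sketch of the
domain-matching rule from RFC 6265, which decides whether a stored cookie is
eligible to be sent to a given host.  It is a simplified illustration only,
not the code HttpClient or ManifoldCF actually runs; the class and method
names are hypothetical, and it ignores IP addresses, host-only cookies, paths
and the secure flag.

```java
import java.util.Locale;

// Hypothetical helper, for illustration only.
class CookieDomainMatch {

    // Simplified RFC 6265 domain match: the request host must equal the
    // cookie domain or be a subdomain of it. A leading dot on the cookie
    // domain is ignored, as RFC 6265 prescribes.
    static boolean domainMatches(String requestHost, String cookieDomain) {
        String host = requestHost.toLowerCase(Locale.ROOT);
        String domain = cookieDomain.toLowerCase(Locale.ROOT);
        if (domain.startsWith(".")) {
            domain = domain.substring(1);
        }
        return host.equals(domain) || host.endsWith("." + domain);
    }

    public static void main(String[] args) {
        String targetHost = "foo.bar-network.jp";
        // comauth-req cookie, domain .bar-network.jp
        System.out.println(domainMatches(targetHost, ".bar-network.jp"));     // true
        // JSESSIONID cookie, domain hoge.bar-network.jp
        System.out.println(domainMatches(targetHost, "hoge.bar-network.jp")); // false
    }
}
```

Under this rule a cookie scoped to .bar-network.jp is eligible for
foo.bar-network.jp, so if it is missing from the request it was most likely
never stored, which is what the two possibilities above are meant to
distinguish.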

For debugging this on 1.5, the best thing to do is to turn on HttpClient
logging.  You do this through the ManifoldCF logging.ini file.  See the
section on log4j at:
http://hc.apache.org/httpcomponents-client-4.2.x/logging.html
Wire debugging is very helpful for determining when a cookie has been
transmitted.  If you need to know why a cookie has been rejected, context
logging is helpful.

I'd be happy to look at the logging output and let you know what I think,
if you want to send it to me.

Thanks,
Karl



On Tue, Apr 8, 2014 at 10:31 PM, Hiroshi Tatsumi <
honekichi19@comet.ocn.ne.jp> wrote:

> Hi,
>
> I'm using MCF 1.5.1 and Solr 4.6.1.
> I have a question about WebcrawlerConnector.
>
>
> [Question] WebcrawlerConnector - Session based access credentials
> Can CookieManager handle cookies from multiple domains?
> This is the case on my company's intranet web site.
> When I access the web site below, I need to send two cookies.
>
> Procedure
> access the target URL -> automatic redirect to the login page -> if login
> succeeds, automatic redirect back to the target URL (two cookies needed)
>
> Target URL
> https://foo.bar-network.jp/trac/repository/
>
> Cookie domain/name
> (1) .bar-network.jp/comauth-req
> (2) hoge.bar-network.jp/JSESSIONID
>
> "Session based access credentials" can simulate this login process.
> But in the last automatic redirect, only one cookie is sent.
> So the login fails, and I cannot crawl the target web site.
>
> (1) .bar-network.jp/comauth-req  -> not sent
> (2) hoge.bar-network.jp/JSESSIONID  -> sent
>
> Do you have any idea how to make this login procedure succeed?
> Or should I modify the MCF source code?
>
> Regards,
> Hiroshi Tatsumi
>
