manifoldcf-user mailing list archives

From Gustavo Beneitez <gustavo.benei...@gmail.com>
Subject Re: web crawler not sharing cookies
Date Fri, 20 Jul 2018 06:12:39 GMT
Hi,

Thanks a lot. I will check the documentation for an example of that.

Regards!

On Thu, Jul 19, 2018 at 21:54, Karl Wright (<daddywri@gmail.com>)
wrote:

> You are correct that cookies are not shared among threads.  That is by
> design.
>
> The only way to set cookies for the WebConnector is to define a
> "login sequence".  The login sequence sets cookies that are then used by
> all subsequent fetches.
>
> Thanks,
> Karl
>
>
> On Thu, Jul 19, 2018 at 3:38 PM Gustavo Beneitez <
> gustavo.beneitez@gmail.com> wrote:
>
>> Hi everyone,
>>
>> I tried to find an answer before writing this email, with no luck.
>> Sorry for the inconvenience if this has already been answered.
>>
>> I need to set a cookie at the beginning of the web crawl. The cookie
>> determines the language in which content is served; there are several
>> choices, and if no cookie is found a "default language" is used.
>>
>> I made a JSP that sets the cookie and contains several links (href), and
>> pointed ManifoldCF to this page as the repository seed. I expected the
>> crawling engine to fetch the links in the language indicated by the
>> cookie, but what I actually got was a lot of content in the default
>> language.
>>
>> My guess is that cookies are not shared between spider threads, so a
>> cookie cannot persist from one link to the next. The cookie domain is
>> correct, and so is the cookie expiration.
>>
>> I would really appreciate it if you could help me with this.
>>
>> Thanks in advance!
>>
>>
>>
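
To illustrate the behaviour discussed in this thread, here is a minimal
sketch in plain Java. It is not ManifoldCF code and does not use any
ManifoldCF API: the class CookieSharingSketch, the Worker type, the "lang"
cookie name and the example.com URLs are all hypothetical. It only assumes
that each worker keeps its own private cookie jar (modelled here with
java.net.CookieManager), and contrasts a cookie set by a seed page with
cookies handed to every worker up front, which stands in for what a login
sequence provides.

```java
import java.net.CookieManager;
import java.net.HttpCookie;
import java.net.URI;
import java.util.Collections;
import java.util.List;

public class CookieSharingSketch {
    // Hypothetical site and pages, for illustration only.
    static final URI SEED = URI.create("http://example.com/seed.jsp");
    static final URI LINK = URI.create("http://example.com/page1");

    // Hypothetical worker with a private cookie jar, like one crawler thread.
    static final class Worker {
        private final CookieManager jar = new CookieManager();

        // Start with cookies collected up front (stand-in for a login sequence).
        Worker(List<HttpCookie> seedCookies) {
            for (HttpCookie c : seedCookies) {
                jar.getCookieStore().add(SEED, c);
            }
        }

        // Simulate a response that sets a cookie in this worker's jar only.
        void receive(URI uri, HttpCookie cookie) {
            jar.getCookieStore().add(uri, cookie);
        }

        // Language this worker would request for the URI, or "default".
        String languageFor(URI uri) {
            for (HttpCookie c : jar.getCookieStore().get(uri)) {
                if (c.getName().equals("lang")) {
                    return c.getValue();
                }
            }
            return "default";
        }
    }

    // Case 1: the seed page sets the cookie, but another worker fetches the link.
    static String unsharedResult() {
        Worker a = new Worker(Collections.<HttpCookie>emptyList());
        Worker b = new Worker(Collections.<HttpCookie>emptyList());
        a.receive(SEED, new HttpCookie("lang", "es"));
        return b.languageFor(LINK);
    }

    // Case 2: every worker is created with the cookies gathered up front.
    static String seededResult() {
        List<HttpCookie> fromLogin =
            Collections.singletonList(new HttpCookie("lang", "es"));
        Worker b = new Worker(fromLogin);
        return b.languageFor(LINK);
    }

    public static void main(String[] args) {
        System.out.println("cookie set by seed page: " + unsharedResult());
        System.out.println("cookie from up-front set: " + seededResult());
    }
}
```

In the first case the second worker never sees the cookie, which matches
the default-language content described in the original question. In the
second case the cookie is available to every fetch because it is part of
the state each worker starts from.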
