manifoldcf-user mailing list archives

From Erlend Garåsen <e.f.gara...@usit.uio.no>
Subject Re: Ingestion API socket timeout exception waiting for response code
Date Mon, 07 May 2012 11:15:40 GMT

Document deletion works perfectly after I reinstalled the SSL 
certificate and reentered the username and password to our Solr server. 
So I think this issue has been solved.

Erlend

On 27.04.12 12.11, Erlend Garåsen wrote:
>
> Many thanks for your suggestions and help, Karl. Using a filesystem
> crawl was actually a good idea for debugging/testing. Installing a new
> version of Solr is not that easy on our test server for many reasons,
> mainly because it is under the control of another division dealing with
> servers at the uni, even though I can get root access. Anyway, according
> to the logs on our Solr 3.2 server, it seems that MCF successfully
> managed to delete one test document I removed:
> [2012-04-27 11:18:33.092] {delete=[file:/tmp/mcf/docs/app_lasso.pdf]} 0 7
> [2012-04-27 11:18:33.092] [] webapp=/solr path=/update params={} status=0 QTime=7
>
> The result code is 200 according to Simple History in MCF.
>
> I entered the passwords once again for the Solr servers into the Solr
> output configuration, deleted and uploaded our SSL certificate once
> again before I did the filesystem test. I should have performed the
> tests prior to the password updates.
>
> The crawl will start again later today at 6 pm on our production server,
> so I will try to figure out whether we still have problems later. I'm
> going to Scotland later this evening for some days without my laptop, so
> I cannot check the status of my crawl before I'm back, but I'll let my
> colleague watch the logs.
>
> Erlend
>
> On 26.04.12 21.14, Karl Wright wrote:
>> Hi Erlend,
>>
>> I had some time today and was able to verify that everything worked
>> fine against what I have currently on my laptop, which is Solr 3.2.
>> The second job run looks like this:
>>
>> 04-26-2012 15:11:44.154 job end 1335467343879(test) 0 1
>> 04-26-2012 15:11:34.159 document deletion (solr) file:/C:/testcrawl/there.txt 200 0 117
>> 04-26-2012 15:11:24.690 read document C:\testcrawl OK 0 1
>> 04-26-2012 15:11:24.494 job start 1335467343879(test) 0 1
>>
>> So it appears that either something changed in Solr, or SSL support is
>> broken, or your network is not permitting a valid HTTP response for
>> some reason.
>>
>> Karl
>>
>>
>> On Thu, Apr 26, 2012 at 11:10 AM, Karl Wright <daddywri@gmail.com> wrote:
>>> Hi Erlend,
>>>
>>> Can you try the following:
>>>
>>> (1) Make a fresh Solr checkout of 3.6 or whatever Solr version you are
>>> using, and build it
>>> (2) Start it
>>> (3) Run a simple filesystem crawl using a Solr connection that is
>>> created with the default values
>>> (4) Delete a file in your filesystem that was crawled
>>> (5) Crawl again
>>>
>>> Does the deletion happen OK?
>>>
>>> AFAIK, nothing has changed in the Solr connector that should affect
>>> the ability to delete. This test will confirm that it is still
>>> working.
>>>
>>> Thanks,
>>> Karl
>>>
>>>
>>> On Thu, Apr 26, 2012 at 10:19 AM, Erlend Garåsen
>>> <e.f.garasen@usit.uio.no> wrote:
>>>> It seems that MCF cannot delete documents from Solr. A timeout
>>>> occurs, and the job stops after a while.
>>>>
>>>> This is what I can see from the log:
>>>> WARN 2012-04-20 18:24:30,373 (Worker thread '16') - Service interruption
>>>> reported for job 1327930125433 connection 'Web crawler': Ingestion API
>>>> socket timeout exception waiting for response code: Read timed out;
>>>> ingestion will be retried again later
>>>>
>>>> If I take a further look in Simple History, it seems that this error is
>>>> related to document deletion.
>>>>
>>>> I have tried to delete the document manually by using curl from the
>>>> same server MCF is installed on, in case we have some access
>>>> restrictions, but curl succeeded.
>>>>
>>>> We do not have any problems with adding; the timeout only occurs while
>>>> deleting documents.
>>>>
>>>> I have checked our Solr configuration. MCF does use the correct path
>>>> for document deletion, i.e. /update.
>>>>
>>>> The realm, username and password for our Solr server are entered
>>>> correctly, and the SSL certificate is valid as well.
>>>>
>>>> Erlend
>>>>
>>>> --
>>>> Erlend Garåsen
>>>> Center for Information Technology Services
>>>> University of Oslo
>>>> P.O. Box 1086 Blindern, N-0317 OSLO, Norway
>>>> Ph: (+47) 22840193, Fax: (+47) 22852970, Mobile: (+47) 91380968,
>>>> VIP: 31050
>
>


-- 
Erlend Garåsen
Center for Information Technology Services
University of Oslo
P.O. Box 1086 Blindern, N-0317 OSLO, Norway
Ph: (+47) 22840193, Fax: (+47) 22852970, Mobile: (+47) 91380968, VIP: 31050
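
For anyone reproducing the manual deletion check described in the thread (the curl test against /update), the following is a minimal Python sketch of an equivalent request. It is not what MCF or the original poster ran. It assumes Solr's XML delete-by-id message, HTTP basic authentication, and a hypothetical host name (solr.example.org); the function names, the 30-second timeout, and the credentials are placeholders for illustration only. The explicit timeout is there so that a server which accepts the request but never answers shows up as a read timeout, similar to the one in the MCF log.

```python
import base64
import urllib.request
from xml.sax.saxutils import escape


def build_delete_request(update_url, doc_id, username=None, password=None):
    """Build a POST request asking Solr to delete one document by id."""
    # Solr's XML update format for a delete-by-id; the id is XML-escaped.
    body = "<delete><id>{}</id></delete>".format(escape(doc_id)).encode("utf-8")
    request = urllib.request.Request(update_url, data=body, method="POST")
    request.add_header("Content-Type", "text/xml; charset=utf-8")
    if username is not None:
        # Basic auth is an assumption; the thread only mentions realm,
        # username and password.
        token = base64.b64encode("{}:{}".format(username, password).encode("utf-8"))
        request.add_header("Authorization", "Basic " + token.decode("ascii"))
    return request


def send(request, timeout_seconds=30):
    """Send the request and return the HTTP status code.

    If the server does not respond within timeout_seconds, urlopen raises
    an exception instead of blocking forever.
    """
    with urllib.request.urlopen(request, timeout=timeout_seconds) as response:
        return response.status


if __name__ == "__main__":
    # Hypothetical URL and credentials; the document id is the one that
    # appears in the Solr log quoted in the thread.
    req = build_delete_request(
        "https://solr.example.org/solr/update?commit=true",
        "file:/tmp/mcf/docs/app_lasso.pdf",
        username="user",
        password="secret",
    )
    # Only print the request here; call send(req) to actually contact Solr.
    print(req.get_method(), req.full_url)
    print(req.data.decode("utf-8"))
```

Calling send(req) against a real server should return 200 when the deletion is accepted, which corresponds to the result code shown in Simple History. A failure at the TLS or authentication layer would point toward the certificate and credentials, which is where the problem turned out to be in this thread.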
