manifoldcf-user mailing list archives

From Karl Wright <daddy...@gmail.com>
Subject Re: Error with webcrawler
Date Tue, 04 Jun 2013 20:43:32 GMT
Any news on this?
Karl


On Tue, Jun 4, 2013 at 1:10 PM, Stephane Gamard <stephane@gamard.net> wrote:

> Hi Karl, I'll try ASAP and will let you know, probably in 2h or so.
>
> Sent from my iPhone
>
> On Jun 4, 2013, at 7:07 PM, Karl Wright <daddywri@gmail.com> wrote:
>
> Hi Stephane,
>
> I just committed a change that may well fix this: r1489521.  Please synch
> up and let me know.  If it doesn't, I will be happy to disable the feature
> until I have a fix.
>
> Karl
>
>
>
> On Tue, Jun 4, 2013 at 1:02 PM, Karl Wright <daddywri@gmail.com> wrote:
>
>> I'm pretty sure this is related to changes that were made for
>> CONNECTORS-693.  If I can't get any further shortly, I will disable those
>> changes until I can figure out what is wrong.
>>
>> Karl
>>
>>
>>
>> On Tue, Jun 4, 2013 at 1:00 PM, Stephane Gamard <stephane@gamard.net> wrote:
>>
>>> Hi Karl,
>>>
>>> I've looked into the simpleHistory, and unfortunately the message is the
>>> same as in the log:
>>>
>>>  06-04-2013 18:55:12.876 fetch http://wiki.apache.org/solr/
>>> -1042 765 Interrupted: IO exception reading response stream:
>>> org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher$ThrottledInputstream.read()
>>> returned value out of range -1..255: -117
>>>
>>>
>>> On Tue, Jun 4, 2013 at 6:54 PM, Karl Wright <daddywri@gmail.com> wrote:
>>>
>>>> Hi Stephane,
>>>>
>>>> I'll look into the problem, but it would be great if you could have a
>>>> look at the Simple History and tell me if you see a stack trace there.
>>>> I've not seen this issue before and having a line number would be really
>>>> helpful.
>>>>
>>>> Karl
>>>>
>>>>
>>>> On Tue, Jun 4, 2013 at 12:45 PM, Stephane Gamard <stephane@gamard.net> wrote:
>>>>
>>>>> Hi All,
>>>>>
>>>>>
>>>>> Just checked out the trunk to test CONNECTORS-700 and glad it works
>>>>> (thanks a whole bunch!). Just wondering about a new bug I have. The
>>>>> previously running web crawler is now broken. I've dropped it and created a
>>>>> new one and I have the following error:
>>>>>
>>>>>
>>>>>  WARN 2013-06-04 18:40:08,393 (Worker thread '1') - Pre-ingest service
>>>>> interruption reported for job 1370363902673 connection
>>>>> 'default-web-repository': IO exception reading response stream:
>>>>> org.apache.manifoldcf.crawler.connectors.webcrawler.ThrottledFetcher$ThrottledInputstream.read()
>>>>> returned value out of range -1..255: -117
>>>>>
>>>>>
>>>>> Then the job stays at that status:
>>>>>  **Restart**<http://781fa7e3fa00bc3b270c3411ed3bd1da.searchbox.com/mcf/showjobstatus.jsp>
>>>>>   **Restart minimal**<http://781fa7e3fa00bc3b270c3411ed3bd1da.searchbox.com/mcf/showjobstatus.jsp>
>>>>>   **Pause**<http://781fa7e3fa00bc3b270c3411ed3bd1da.searchbox.com/mcf/showjobstatus.jsp>
>>>>>   **Abort**<http://781fa7e3fa00bc3b270c3411ed3bd1da.searchbox.com/mcf/showjobstatus.jsp>
>>>>>   wiki-documentation  Running  Tue Jun 04 18:40:04 CEST 2013  1 11
>>>>>
>>>>>
>>>>> Any idea about why?
>>>>>
>>>>> Attached is the full log
>>>>>
>>>>> _Stephane
>>>>>
>>>>
>>>>
>>>
>>
>
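
The error quoted in this thread reports that a read() call returned -117, outside the -1..255 range that the java.io.InputStream contract requires. The thread does not show the actual cause inside ThrottledFetcher, so the following is only a sketch of one common way such a value can arise: returning a signed Java byte without masking it. The class names are hypothetical and are not taken from ManifoldCF.

```java
import java.io.IOException;
import java.io.InputStream;

public class SignedByteReadDemo {

    /** Hypothetical wrapper that returns the raw signed byte (breaks the contract). */
    static class BuggyStream extends InputStream {
        private final byte[] data;
        private int pos = 0;

        BuggyStream(byte[] data) {
            this.data = data;
        }

        @Override
        public int read() {
            if (pos >= data.length) {
                return -1; // end of stream
            }
            // The byte is sign-extended to int, so 0x8B becomes -117.
            return data[pos++];
        }
    }

    /** Same wrapper, but masking the byte so the result stays within 0..255. */
    static class FixedStream extends InputStream {
        private final byte[] data;
        private int pos = 0;

        FixedStream(byte[] data) {
            this.data = data;
        }

        @Override
        public int read() {
            if (pos >= data.length) {
                return -1; // end of stream
            }
            // Masking with 0xff yields the unsigned value of the byte.
            return data[pos++] & 0xff;
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] data = { (byte) 0x8B };
        System.out.println(new BuggyStream(data).read()); // prints -117
        System.out.println(new FixedStream(data).read()); // prints 139
    }
}
```

If the underlying cause is of this kind, the byte that triggered the message would have been 0x8B (139 unsigned); whether that matches the change committed in r1489521 is not stated in the thread.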
