manifoldcf-user mailing list archives

From Karl Wright <daddy...@gmail.com>
Subject Re: WEB: Illegal seed URL
Date Tue, 06 Dec 2011 19:34:26 GMT
On second thought, "illegal seed" can also mean that the seed is
excluded from the crawl due to your inclusion/exclusion regexp lists.
Might want to check that out too.
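
As a rough illustration of that kind of check, here is a minimal sketch
in Java. It is not ManifoldCF's actual code: the class and method names
are hypothetical, the sample regexp lists are invented, and it assumes a
seed is kept only if it matches at least one inclusion regexp and no
exclusion regexp, with each regexp allowed to match anywhere in the URL.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical sketch, not part of ManifoldCF.
class SeedFilterSketch {

    // True if any regexp in the list matches somewhere in the URL.
    // Assumption: substring matching (find), not whole-string matching.
    static boolean matchesAny(List<String> regexps, String url) {
        for (String regexp : regexps) {
            if (Pattern.compile(regexp).matcher(url).find()) {
                return true;
            }
        }
        return false;
    }

    // Assumed rule: at least one inclusion matches and no exclusion matches.
    static boolean isSeedAllowed(String url, List<String> inclusions, List<String> exclusions) {
        return matchesAny(inclusions, url) && !matchesAny(exclusions, url);
    }

    public static void main(String[] args) {
        String seed = "https://hostname.com/vwebv/search"
            + "?searchArg=dvd&searchCode=SALL&searchType=1&recCount=100";

        List<String> inclusions = Arrays.asList(".*");

        // An exclusion on "?" rejects every URL that has a query string.
        System.out.println(isSeedAllowed(seed, inclusions, Arrays.asList("\\?")));      // false

        // An exclusion on a ".pdf" suffix leaves this seed alone.
        System.out.println(isSeedAllowed(seed, inclusions, Arrays.asList("\\.pdf$")));  // true
    }
}
```

Running the seed URL through the job's real inclusion and exclusion
lists in this way can show whether one of them is filtering it out.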

Karl

On Tue, Dec 6, 2011 at 2:23 PM, Karl Wright <daddywri@gmail.com> wrote:
> The URL as stated is fine and is pretty standard.  I don't think
> there's a problem there, unless you inadvertently fixed something when
> you changed the hostname.
>
> Can you look at the log - there may well be a stack trace, especially
> if you have <property name="org.apache.manifoldcf.connectors"
> value="DEBUG"/> set.  I'd love to see what the trace is.
>
> Karl
>
> On Tue, Dec 6, 2011 at 1:52 PM, Michael Kelleher <mj.kelleher@gmail.com> wrote:
>> Here is my seed URL (minus the hostname):
>>  https://hostname.com/vwebv/search?searchArg=dvd&searchCode=SALL&searchType=1&recCount=100
>>
>> I am using a Web Crawler connection that has been tested with the
>> NullOutputConnector - so I don't think the issue can be here
>>
>>
>> I am also using the Solr Output Connector - this had been throwing an
>> Exception till I fixed the core name - this is the first time I have used
>> this.  So, maybe I don't have things configured correctly here.  However, there
>> are no exceptions in the log.  Also, I am not using authentication at all on
>> Solr.
>>
>>
>> I looked at the class:
>> connectors\webcrawler\connector\src\main\java\org\apache\manifoldcf\crawler\connectors\webcrawler\WebcrawlerConnector.java
>> and it was not obvious what the issue is.
>>
>> Also, in logging.ini - I changed the logging level to DEBUG and restarted
>> before I tested the crawl, which further obscures the logic to me in
>> WebcrawlerConnector.java
>>
>> Is there somewhere else I can set logging levels?  I am not sure my change
>> to logging.ini is having any effect.  Also, is there some other test you
>> might suggest?
>>
>> thanks.
>>
>> --mike
