nutch-dev mailing list archives

From Jack Tang
Subject RSS Parser Bug!?
Date Thu, 08 Sep 2005 05:58:35 GMT
Hi Guys

Has anyone installed parse-rss and tried to fetch RSS feeds?
It failed on my side. I enabled the plugin and the fetch succeeded, but the
RSS parser did not work.
My feed is

Here is the error:

org.apache.nutch.fetcher.Fetcher$FetcherThread [11] - fetch okay, but
can't parse, reason:
failed(2,203): Content-Type not text/html: application/xml;

The content-type is application/xml. Mattmann's code in the parser checks it like this:
        // check that contentType is one we can handle
        String contentType = content.getContentType();
        if (contentType != null
                && (!contentType.startsWith("text/xml") &&
                    !contentType.startsWith("application/rss+xml")))
            return new ParseStatus(ParseStatus.FAILED_INVALID_FORMAT,
                    "Content-Type not text/xml or application/rss+xml: "
                            + contentType).getEmptyParse();

So, it does not support the "application/xml" content type yet?
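
To make the question concrete, here is a minimal standalone sketch of what a relaxed check might look like if "application/xml" were also accepted. This is only an illustration, not the plugin's actual code: the class name ContentTypeCheck, the method isHandled and the ACCEPTED list are hypothetical, and it deliberately avoids any Nutch classes.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper for illustration only; not part of Nutch.
class ContentTypeCheck {
    // The first two prefixes come from the quoted check; "application/xml"
    // is the assumed addition being asked about.
    private static final List<String> ACCEPTED = Arrays.asList(
            "text/xml", "application/rss+xml", "application/xml");

    // Mirrors the quoted logic: a null content type is let through,
    // otherwise the value must start with one of the accepted prefixes.
    static boolean isHandled(String contentType) {
        if (contentType == null) {
            return true;
        }
        for (String prefix : ACCEPTED) {
            if (contentType.startsWith(prefix)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(isHandled("application/xml;")); // true
        System.out.println(isHandled("text/html"));        // false
    }
}
```

Using startsWith, as the quoted code does, means a value with a trailing ";" or a charset parameter would still match.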

Keep Discovering ... ...
