manifoldcf-user mailing list archives

From K McGonigal <kmcgon...@gmail.com>
Subject Field mapping for RSS feed
Date Tue, 02 Aug 2011 15:41:58 GMT
Hi,

I'm trying to use ManifoldCF to index an RSS feed into Solr.  It sort of
works, but my main problem at the moment is that the *channel* description
from the RSS feed is written to the "description" field in Solr when I would
really like the *item* description to be written instead.

I have a typical RSS feed with the general structure:

<rss>
    <channel>
        <title></title>
        <link></link>
        <description> *** the description I don't want *** </description>
        <item>
            <title></title>
            <link></link>
            <pubDate></pubDate>
            <description> *** the description I do want *** </description>
            <author></author>
            <category></category>
        </item>
    </channel>
</rss>

I tried setting up the field mapping on the job with the XPath address of
the item description, i.e. "/rss/channel/item/description", as the source,
but that did not work.
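
To make the two addresses concrete, here is a minimal Python sketch using
only the standard library. It is not ManifoldCF code and says nothing about
how the job's field mapping evaluates its source; the sample feed contents
and the helper name `descriptions` are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical feed with the same structure as above; the text values are
# placeholders, not data from the real feed.
SAMPLE_FEED = """<rss>
    <channel>
        <title>Example channel</title>
        <link>http://example.com/</link>
        <description>channel description</description>
        <item>
            <title>First item</title>
            <link>http://example.com/1</link>
            <description>item description</description>
        </item>
    </channel>
</rss>"""


def descriptions(feed_xml):
    """Return (channel description, list of item descriptions)."""
    root = ET.fromstring(feed_xml)  # root is the <rss> element
    # Equivalent of /rss/channel/description: the one I do not want.
    channel_desc = root.findtext("channel/description")
    # Equivalent of /rss/channel/item/description: the one I do want.
    item_descs = [d.text for d in root.findall("channel/item/description")]
    return channel_desc, item_descs


channel_desc, item_descs = descriptions(SAMPLE_FEED)
print(channel_desc)
print(item_descs)
```

The two paths select different elements, so the addressing itself is
unambiguous outside of ManifoldCF.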

I suspect I'm overlooking something simple, but I've spent 2 days trying to
solve it.  I would be grateful for any help.


Kate McGonigal
