manifoldcf-user mailing list archives

From Karl Wright <daddy...@gmail.com>
Subject Re: Trouble indexing a Twitter search in RSS format
Date Fri, 12 Aug 2011 20:58:36 GMT
When I open any of these URLs in my browser, I get redirected to a
login screen.  So it looks to me like Twitter uses some kind of
session-based login, tracked with cookies.  Supporting that would
require maintaining session cookies, which the RSS connector simply
does not do, and coding a login sequence as well.
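
If you want to confirm the redirect outside a browser, something along
these lines might do it.  This is only a rough Python sketch, not
anything from ManifoldCF: the function names and the list of login
markers are my own assumptions, and Twitter's actual login paths may
differ.  It fetches a URL with a cookie jar attached and checks
whether the final URL looks like a login page.

```python
import http.cookiejar
import urllib.request
from urllib.parse import urlsplit

# Hypothetical path fragments that suggest a login page (an assumption,
# not taken from Twitter's documentation).
LOGIN_MARKERS = ("login", "signin", "sessions")


def make_session_opener():
    """Build a URL opener that stores and resends cookies between requests."""
    jar = http.cookiejar.CookieJar()
    opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))
    return opener, jar


def looks_like_login_redirect(requested_url, final_url, markers=LOGIN_MARKERS):
    """Return True if the fetch ended on a different URL whose path looks like a login page."""
    if requested_url == final_url:
        return False
    path = urlsplit(final_url).path.lower()
    return any(marker in path for marker in markers)


def check_url(url, timeout=10):
    """Fetch url (needs network access) and report where it ended up."""
    opener, jar = make_session_opener()
    with opener.open(url, timeout=timeout) as response:
        final_url = response.geturl()  # URL after any HTTP redirects
    return final_url, looks_like_login_redirect(url, final_url), len(jar)
```

Calling check_url() on one of the status links would show the final
URL, whether it looks like a login page, and how many cookies the
server set along the way.  Note that a "#!" fragment never reaches the
server, so it would not show up as an HTTP redirect here.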

This is not a straightforward feature to add to the RSS connector, by any means.

The web connector does have support for login sequencing and cookie
session maintenance, and it does know how to chase RSS feeds, so that
might be an option for you to try.  The problem is that most login
sequences are non-trivial to set up and you will need a lot of
patience and web spelunking skills to get it right.  The documentation
is of some help but really could use a good example.


Hope this helps.
Karl

On Fri, Aug 12, 2011 at 4:42 PM, K McGonigal <kmcgoniga@gmail.com> wrote:
> Sorry to bother everyone again but I'm having trouble with an RSS connector
> job on a Twitter search. When I try to run a job on
> http://search.twitter.com/search.rss?q=Campylobacter the fetch appears to
> work OK, but the document ingestion does not occur.
>
> I was wondering if it is just my setup, or could it be the redirection that
> Twitter does on the links. For instance, a link shown in the RSS feed as
> http://twitter.com/VashinkaInuiel/statuses/101493222852923393 redirects to
> http://twitter.com/#!/VashinkaInuiel/statuses/101493222852923393 when it is
> followed.
>
> Any help is much appreciated.
>
>
>
