tika-dev mailing list archives

From "Nick Burch (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (TIKA-2744) rss+xml doesnt accept files with .xml extension
Date Wed, 17 Oct 2018 16:36:00 GMT

    [ https://issues.apache.org/jira/browse/TIKA-2744?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16653808#comment-16653808 ]

Nick Burch commented on TIKA-2744:
----------------------------------

I've added a test RSS 2.0 file to Tika's test documents, and it's correctly detected for me
whether it's named {{rsstest_20.rss}} or {{rsstest_20.rss.xml}}.

Can you give us some more details on how you're calling Tika, what file(s) you're having the
trouble with, and exactly what isn't working?
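
For illustration only, here is a minimal sketch of why a content-based check gives the same answer for both file names. It is not Tika's implementation: the function name {{sniff_xml_type}} and the sample feed are hypothetical, and it uses only the Python standard library to look at the XML root element, never at the file name.

```python
import io
import xml.etree.ElementTree as ET

# Hypothetical minimal RSS 2.0 document, used only for this illustration.
SAMPLE = (
    b'<?xml version="1.0"?>'
    b'<rss version="2.0"><channel><title>t</title></channel></rss>'
)


def sniff_xml_type(data: bytes) -> str:
    """Guess a media type from the root element; the file name is never consulted."""
    try:
        # With "start" events, the first element reported is the root.
        for _event, elem in ET.iterparse(io.BytesIO(data), events=("start",)):
            root = elem.tag
            break
        else:
            return "application/octet-stream"
    except ET.ParseError:
        # Not well-formed XML (or empty input).
        return "application/octet-stream"
    if root == "rss":
        return "application/rss+xml"
    return "application/xml"


if __name__ == "__main__":
    # The name is only printed; the result depends on the bytes alone.
    for name in ("rsstest_20.rss", "rsstest_20.rss.xml"):
        print(name, sniff_xml_type(SAMPLE))
```

If a caller relies on the file name alone, an .xml extension can only suggest generic XML, which may be what the report is describing. That is an assumption until we have the details asked for above.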

> rss+xml doesnt accept files with .xml extension
> -----------------------------------------------
>
>                 Key: TIKA-2744
>                 URL: https://issues.apache.org/jira/browse/TIKA-2744
>             Project: Tika
>          Issue Type: Bug
>            Reporter: Martin
>            Priority: Major
>
> Hello,
> if I try to validate an application/rss+xml file with an .xml extension, it fails.
> I would say that is a bug.
> I think the .rss extension applies only up to version 1.0. From 2.0 on, RSS is XML-based and
> should (could) have an .xml extension:
> Source:
> https://www.w3schools.com/xml/xml_rss.asp 
> "Get Your RSS Feed Up On The Web
> Having an RSS document is not useful if other people cannot reach it.
> Now it's time to get your RSS file up on the web. Here are the steps:
> 1. Name your RSS file. Notice that the file must have an .xml extension."
> or specification on Harvard university:
> https://cyber.harvard.edu/rss/rss.html
> there is example:
> "Its value is the name of the RSS channel that the item came from, derived from its <title>.
> It has one required attribute, url, which links to the XMLization of the source."
> Example of file:



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
