tika-dev mailing list archives

From "Ken Krugler (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (TIKA-1835) LinkContentHandler skips iframe and rel tags
Date Tue, 05 Apr 2016 20:27:25 GMT

    [ https://issues.apache.org/jira/browse/TIKA-1835?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15227078#comment-15227078 ]

Ken Krugler commented on TIKA-1835:

I’d rolled in Markus’s patch directly to support these other link types, but I wish I’d
remembered the old TIKA-503 discussion. It would have been better to make that support
conditional on using a different constructor, since it’s usually not a good idea to surprise
consumers of parse output with new types of data (links).

[~markus.jelsma@openindex.io] - would it be OK to make the above "extra links" support conditional
on a new constructor? Or do you think it doesn't matter? And what about adding the script
element to that set?
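
To make the question concrete, here is a minimal sketch of what "conditional on a new constructor" could look like. It is not Tika's actual LinkContentHandler: the class name OptInLinkHandler, the includeExtraLinks flag, the demo helper, and the example URLs are all hypothetical, and the default set is reduced to the a element for brevity. It uses only the SAX classes that ship with the JDK.

```java
import java.util.ArrayList;
import java.util.List;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.AttributesImpl;
import org.xml.sax.helpers.DefaultHandler;

/** Hypothetical sketch only; not the real Tika LinkContentHandler. */
class OptInLinkHandler extends DefaultHandler {
    // Assumed flag name: when false, only the long-standing link types are reported.
    private final boolean includeExtraLinks;
    private final List<String> links = new ArrayList<>();

    /** Existing constructor: behavior stays unchanged for current callers. */
    OptInLinkHandler() {
        this(false);
    }

    /** New constructor: callers opt in to the extra link types explicitly. */
    OptInLinkHandler(boolean includeExtraLinks) {
        this.includeExtraLinks = includeExtraLinks;
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        String name = (localName == null || localName.isEmpty()) ? qName : localName;
        if ("a".equals(name)) {
            record(name, atts.getValue("href"));
        } else if (includeExtraLinks && "link".equals(name)) {
            record(name, atts.getValue("href"));
        } else if (includeExtraLinks && "iframe".equals(name)) {
            record(name, atts.getValue("src"));
        }
    }

    private void record(String type, String target) {
        if (target != null) {
            links.add(type + ":" + target);
        }
    }

    List<String> getLinks() {
        return links;
    }

    /** Builds a single-attribute SAX attribute list for the demo below. */
    private static Attributes attr(String name, String value) {
        AttributesImpl atts = new AttributesImpl();
        atts.addAttribute("", name, name, "CDATA", value);
        return atts;
    }

    /** Feeds the same three start tags to a handler and returns what it collected. */
    static List<String> demo(boolean includeExtraLinks) {
        OptInLinkHandler handler = new OptInLinkHandler(includeExtraLinks);
        handler.startElement("", "a", "a", attr("href", "http://example.com/page"));
        handler.startElement("", "link", "link", attr("href", "http://example.com/style.css"));
        handler.startElement("", "iframe", "iframe", attr("src", "http://example.com/frame"));
        return handler.getLinks();
    }
}
```

With this shape, code that uses the no-argument constructor sees the same links as before, and only callers that pass true receive the link and iframe entries. The script element could be handled by the same flag if it is added to the set.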

> LinkContentHandler skips iframe and rel tags
> --------------------------------------------
>                 Key: TIKA-1835
>                 URL: https://issues.apache.org/jira/browse/TIKA-1835
>             Project: Tika
>          Issue Type: Bug
>          Components: core
>    Affects Versions: 1.11
>            Reporter: Markus Jelsma
>            Assignee: Ken Krugler
>             Fix For: 1.12
>         Attachments: TIKA-1835.patch
> As simple as it gets, link and iframe tags were never implemented in LinkContentHandler.
> NUTCH-1233 kind of requires it.

This message was sent by Atlassian JIRA
