tika-dev mailing list archives

From "Chris A. Mattmann (Resolved) (JIRA)" <j...@apache.org>
Subject [jira] [Resolved] (TIKA-824) Extract rel attr with LinkContentHandler
Date Tue, 03 Jan 2012 04:20:21 GMT

     [ https://issues.apache.org/jira/browse/TIKA-824?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chris A. Mattmann resolved TIKA-824.
------------------------------------

    Resolution: Fixed

- Patch applied in r1226629. Thanks, Markus!
                
> Extract rel attr with LinkContentHandler
> ----------------------------------------
>
>                 Key: TIKA-824
>                 URL: https://issues.apache.org/jira/browse/TIKA-824
>             Project: Tika
>          Issue Type: Improvement
>          Components: parser
>    Affects Versions: 1.0, 1.1
>            Reporter: Markus Jelsma
>            Assignee: Chris A. Mattmann
>            Priority: Minor
>             Fix For: 1.1
>
>         Attachments: TIKA-824-trunk-1.patch
>
>
> For Nutch we need to extract URLs, but we also need the rel attribute to check for the nofollow value. I've patched the code to return this information in the Link object. It has been tested, and I can read the rel in Nutch now.
> Thoughts?
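
To illustrate the idea in the description, here is a minimal sketch of a SAX handler that records the rel attribute alongside each link's href so a crawler can check for nofollow. It uses only the JDK's SAX parser. It is not the attached patch and not Tika's LinkContentHandler or Link classes; the names RelLinkSketch, FoundLink, RelCollectingHandler, extract and isNoFollow are hypothetical, and the input is assumed to be well-formed XHTML.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

public class RelLinkSketch {

    /** Hypothetical value type for this sketch; it is not Tika's Link class. */
    static final class FoundLink {
        final String uri;
        final String rel; // empty string when the anchor has no rel attribute

        FoundLink(String uri, String rel) {
            this.uri = uri;
            this.rel = rel;
        }

        /** rel is a space-separated token list, so compare token by token. */
        boolean isNoFollow() {
            for (String token : rel.trim().split("\\s+")) {
                if (token.equalsIgnoreCase("nofollow")) {
                    return true;
                }
            }
            return false;
        }
    }

    /** Collects href and rel from every <a> start tag that has an href. */
    static final class RelCollectingHandler extends DefaultHandler {
        final List<FoundLink> links = new ArrayList<>();

        @Override
        public void startElement(String uri, String localName, String qName,
                                 Attributes atts) {
            // The default SAXParserFactory is not namespace-aware,
            // so the element name is reported in qName.
            if ("a".equalsIgnoreCase(qName)) {
                String href = atts.getValue("href");
                if (href != null) {
                    String rel = atts.getValue("rel");
                    links.add(new FoundLink(href, rel == null ? "" : rel));
                }
            }
        }
    }

    /** Parses well-formed XHTML and returns the links in document order. */
    static List<FoundLink> extract(String xhtml) throws Exception {
        RelCollectingHandler handler = new RelCollectingHandler();
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xhtml)), handler);
        return handler.links;
    }

    public static void main(String[] args) throws Exception {
        String xhtml = "<html><body>"
                + "<a href=\"http://example.com/a\" rel=\"nofollow\">A</a>"
                + "<a href=\"http://example.com/b\">B</a>"
                + "</body></html>";
        for (FoundLink link : extract(xhtml)) {
            System.out.println(link.uri + " nofollow=" + link.isNoFollow());
        }
    }
}
```

A consumer such as a crawler could then skip any link for which isNoFollow() returns true. Keeping rel as the raw attribute string, rather than a boolean, leaves room for other rel values.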

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
