tika-dev mailing list archives

From "Tim Allison (JIRA)" <j...@apache.org>
Subject [jira] [Created] (TIKA-2049) Add parser for vcal
Date Fri, 05 Aug 2016 16:47:20 GMT
Tim Allison created TIKA-2049:

             Summary: Add parser for vcal
                 Key: TIKA-2049
                 URL: https://issues.apache.org/jira/browse/TIKA-2049
             Project: Tika
          Issue Type: Improvement
            Reporter: Tim Allison
            Priority: Trivial

vcal files can contain embedded HTML.  We used to detect them as HTML and extract content
roughly correctly.  Now they are correctly detected as vcal, but because that type is a
subclass of text/plain, we're getting HTML markup in the extracted text.
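
To make the problem concrete, here is a minimal sketch of what such a parser might need
to do: unfold the content lines, find the property that carries the HTML, and strip the
markup.  It is an illustration under stated assumptions, not Tika's implementation.  The
class name VcalHtmlSketch, its methods, and the choice of the DESCRIPTION property are
hypothetical and not taken from this issue; the regex-based tag stripping is a stand-in
for handing the value to a real HTML parser.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; names and the DESCRIPTION assumption are illustrative only.
public class VcalHtmlSketch {

    // Unfold content lines: a line that starts with a space or tab
    // continues the previous line, minus that one leading character.
    static List<String> unfold(String raw) {
        List<String> lines = new ArrayList<>();
        for (String line : raw.split("\r?\n")) {
            boolean continuation = line.startsWith(" ") || line.startsWith("\t");
            if (continuation && !lines.isEmpty()) {
                int last = lines.size() - 1;
                lines.set(last, lines.get(last) + line.substring(1));
            } else {
                lines.add(line);
            }
        }
        return lines;
    }

    // Naive markup removal: replace each tag with a space, then collapse
    // whitespace. Entities, comments, and scripts are not handled.
    static String stripMarkup(String html) {
        return html.replaceAll("<[^>]*>", " ").replaceAll("\\s+", " ").trim();
    }

    // Return the text of the first DESCRIPTION property, or "" if none.
    // Assumes no colon appears inside quoted property parameters.
    static String extractDescription(String raw) {
        for (String line : unfold(raw)) {
            int colon = line.indexOf(':');
            if (colon < 0) {
                continue;
            }
            String name = line.substring(0, colon);
            if (name.equals("DESCRIPTION") || name.startsWith("DESCRIPTION;")) {
                return stripMarkup(line.substring(colon + 1));
            }
        }
        return "";
    }

    public static void main(String[] args) {
        String vcal = "BEGIN:VCALENDAR\r\n"
                + "BEGIN:VEVENT\r\n"
                + "SUMMARY:Demo\r\n"
                + "DESCRIPTION:<html><body><p>Bring the <b>slides</b>\r\n"
                + "  to the meeting.</p></body></html>\r\n"
                + "END:VEVENT\r\n"
                + "END:VCALENDAR\r\n";
        // Prints the description with the markup removed.
        System.out.println(extractDescription(vcal));
    }
}
```

Treating the same input as text/plain would pass the tags through verbatim, which is the
behavior described above.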

This message was sent by Atlassian JIRA
