tika-dev mailing list archives

From "Nick Burch (JIRA)" <j...@apache.org>
Subject [jira] Commented: (TIKA-590) Create facility for deeper introspection of media files
Date Sat, 29 Jan 2011 19:11:43 GMT

    [ https://issues.apache.org/jira/browse/TIKA-590?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12988495#action_12988495 ]

Nick Burch commented on TIKA-590:

I'm not sure how that would fit into the current model. However, something similar that might
work is setting a value in the parse context to indicate how much extra work you'd like the
parsers to do.

A rough idea would be something like:
public enum ParserExtraWorkLevel { NONE, LIMITED, FULL }

parseContext.set(ParserExtraWorkLevel.class, ParserExtraWorkLevel.FULL);
parser.parse(stream, handler, metadata, parseContext);

Then inside the parser you could check for the extra work level, and do more if requested.
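
To make the parser-side check concrete, here is a minimal sketch. It does not use Tika
itself: SketchContext is a hypothetical stand-in for a class-keyed parse context, and
SketchAudioParser, the "title" and "duration" keys, and the placeholder values are all
made up for illustration. What LIMITED should mean is not defined in this proposal, so
the sketch treats only FULL as a request for the expensive step.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

final class ExtraWorkSketch {

    // Enum as proposed in this comment.
    enum ParserExtraWorkLevel { NONE, LIMITED, FULL }

    // Hypothetical stand-in for a parse context: values are keyed by their class.
    static final class SketchContext {
        private final Map<Class<?>, Object> values = new HashMap<>();

        <T> void set(Class<T> key, T value) {
            values.put(key, value);
        }

        <T> T get(Class<T> key, T defaultValue) {
            Object value = values.get(key);
            return value == null ? defaultValue : key.cast(value);
        }
    }

    // Hypothetical parser: cheap metadata is always set, costly metadata only on request.
    static final class SketchAudioParser {
        static Map<String, String> parse(SketchContext context) {
            Map<String, String> metadata = new LinkedHashMap<>();
            metadata.put("title", "read from tag"); // cheap: already present in the file

            // Callers that set nothing get the cheap behaviour (assumed default: NONE).
            ParserExtraWorkLevel level =
                    context.get(ParserExtraWorkLevel.class, ParserExtraWorkLevel.NONE);

            if (level == ParserExtraWorkLevel.FULL) {
                // Placeholder for the expensive step, e.g. scanning the whole stream.
                metadata.put("duration", "computed by scanning the stream");
            }
            return metadata;
        }
    }

    public static void main(String[] args) {
        SketchContext context = new SketchContext();
        System.out.println(SketchAudioParser.parse(context).keySet());

        context.set(ParserExtraWorkLevel.class, ParserExtraWorkLevel.FULL);
        System.out.println(SketchAudioParser.parse(context).keySet());
    }
}
```

Defaulting to NONE when nothing is set would keep existing callers on the fast path,
which seems the safer choice, but that default is an assumption of this sketch too.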

It's probably worth coming up with a concrete case first though, and when we have a patch
that introduces some optional "expensive" work to a parser we can decide on the best way forward.

> Create facility for deeper introspection of media files
> -------------------------------------------------------
>                 Key: TIKA-590
>                 URL: https://issues.apache.org/jira/browse/TIKA-590
>             Project: Tika
>          Issue Type: Wish
>          Components: metadata
>            Reporter: Andre-John Mas
> This feature would allow applications to dig deeper into files to define meta-data that
> is not presented as a tag in the file. For example a file that has no duration information
> could with a little more work provide this missing information. The idea is to let the API
> user make a difference between data that is quick to retrieve and data that is slower to retrieve
> because of the extra processing needed to get that information.

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
