tika-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Keith R. Bennett (JIRA)" <j...@apache.org>
Subject [jira] Created: (TIKA-18) "Office" interface should be renamed "MSOffice".
Date Thu, 13 Sep 2007 21:34:32 GMT
"Office" interface should be renamed "MSOffice".

                 Key: TIKA-18
                 URL: https://issues.apache.org/jira/browse/TIKA-18
             Project: Tika
          Issue Type: Improvement
          Components: general
    Affects Versions: 0.1-incubator
            Reporter: Keith R. Bennett
            Priority: Minor
             Fix For: 0.1-incubator

The org.apache.tika.metadata.Office interface should be renamed MSOffice to make clear to
the reader that it applies to Microsoft Office and not (necessarily) OpenOffice.

According to Chris Mattmann, the author of this interface:

"Basically it [the Office interface] applies to whatever office met fields
are made available via the Jakarta POI library, which is where we got the
list of met keys from while using the interface on Apache Nutch.

"Each one of the interfaces in the org.apache.tika.metadata package are
collections of met keys that we found used in commonplace throughout the
Nutch code, and that we tried to generalize into a canonical source. The
Office interface, were properties that we saw used in the parse-msword,
parse-msexcel/etc. plugins, which all rely on Jakarta POI I believe."

Patch coming...

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.

View raw message