tika-dev mailing list archives

From "Chris A. Mattmann (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (TIKA-1609) Leverage Google's LibPhonenumber for enhanced phone number extraction and metadata modeling
Date Sun, 21 May 2017 15:40:09 GMT

     [ https://issues.apache.org/jira/browse/TIKA-1609?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chris A. Mattmann updated TIKA-1609:
------------------------------------
    Fix Version/s:     (was: 1.15)
                   1.16

> Leverage Google's LibPhonenumber for enhanced phone number extraction and metadata modeling
> -------------------------------------------------------------------------------------------
>
>                 Key: TIKA-1609
>                 URL: https://issues.apache.org/jira/browse/TIKA-1609
>             Project: Tika
>          Issue Type: New Feature
>          Components: core
>            Reporter: Lewis John McGibbney
>            Assignee: Lewis John McGibbney
>             Fix For: 1.16
>
>
> Google's libphonenumber can provide us with comprehensive support for modeling phone number metadata properly in Tika.
> During the development of this patch I realized three things, namely
>  * This is not a parser as such, since phone numbers are not mapped to any particular MIME type
>  * In addition, there can be many phone numbers per document, so this is most likely a Content Handler of sorts
>  * Tika's Metadata support is currently too restrictive to allow us to persist many complex objects, e.g. String, Object. We need to expand Metadata support over and above String, String[].
> https://github.com/googlei18n/libphonenumber/
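
To illustrate the "Content Handler of sorts" idea from the description, here is a minimal sketch of a SAX handler that buffers character events and collects every phone-like string in a document. It is not the TIKA-1609 patch: the class name PhoneCollectingHandler is hypothetical, and the regular expression is a crude stand-in for libphonenumber, which a real implementation would presumably call instead (for example through PhoneNumberUtil.findNumbers). It uses only JDK classes so it can be run on its own.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import org.xml.sax.helpers.DefaultHandler;

/**
 * Hypothetical handler, not part of Tika: buffers the text of a document
 * and collects every phone-like substring once the document ends.
 */
class PhoneCollectingHandler extends DefaultHandler {

    // Assumption: a rough regex standing in for libphonenumber matching.
    // Optional "+", a digit, six or more digits/whitespace/()/./- characters,
    // and a closing digit.
    private static final Pattern PHONE_LIKE =
            Pattern.compile("\\+?\\d[\\d\\s().-]{6,}\\d");

    private final StringBuilder text = new StringBuilder();
    private final List<String> numbers = new ArrayList<>();

    @Override
    public void characters(char[] ch, int start, int length) {
        // SAX may deliver text in several chunks, so buffer all of it.
        text.append(ch, start, length);
    }

    @Override
    public void endDocument() {
        // A document can contain many numbers, so keep every match.
        Matcher m = PHONE_LIKE.matcher(text);
        while (m.find()) {
            numbers.add(m.group());
        }
    }

    public List<String> getNumbers() {
        return numbers;
    }
}
```

Because the handler returns a list rather than a single value, it also shows why a String, String[] metadata model becomes awkward once each match carries structured data instead of plain text.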



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
