tika-dev mailing list archives

From "Chris A. Mattmann (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (TIKA-1438) PhoneExtractingContentHandler to not add individual MD entries for individual phone numbers
Date Mon, 13 Oct 2014 05:14:33 GMT

    [ https://issues.apache.org/jira/browse/TIKA-1438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14168955#comment-14168955 ]

Chris A. Mattmann commented on TIKA-1438:
-----------------------------------------

+1, yes multi-valued single entry would be my vote.

> PhoneExtractingContentHandler to not add individual MD entries for individual phone numbers
> -------------------------------------------------------------------------------------------
>
>                 Key: TIKA-1438
>                 URL: https://issues.apache.org/jira/browse/TIKA-1438
>             Project: Tika
>          Issue Type: Bug
>            Reporter: Lewis John McGibbney
>            Assignee: Lewis John McGibbney
>            Priority: Minor
>             Fix For: 1.7
>
>         Attachments: TIKA-1438.patch
>
>
> Right now we have the PhoneExtractingContentHandler adding phone numbers as individual
> metadata entries. I feel that this is cumbersome.
> For example, if we have a webpage with phone numbers on it, we end up with many
> fields of the same type with different values!
> I propose we reverse this and have one field with multiple values.
> I would fully understand the current behaviour if we wished to augment the phone numbers
> further by associating dialing code, country, carrier, etc.; however, we are not currently
> doing this.
> Patch coming for trunk.
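
To illustrate the two layouts discussed in the issue, here is a minimal sketch that uses a plain Map in place of Tika's Metadata class. The field name "phonenumbers", the indexed key scheme, and the class and method names are all assumptions made for illustration; none of them are taken from the Tika source or from TIKA-1438.patch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a metadata store; not Tika's Metadata class.
class PhoneMetadataSketch {
    // Assumed field name, chosen for illustration only.
    static final String PHONE_FIELD = "phonenumbers";

    // One entry per phone number. The "_<index>" key suffix is an assumption
    // about how separate entries might be told apart.
    static Map<String, List<String>> onePerNumber(List<String> numbers) {
        Map<String, List<String>> md = new LinkedHashMap<>();
        for (int i = 0; i < numbers.size(); i++) {
            List<String> single = new ArrayList<>();
            single.add(numbers.get(i));
            md.put(PHONE_FIELD + "_" + i, single);
        }
        return md;
    }

    // The proposed layout: a single field holding every extracted number.
    static Map<String, List<String>> multiValued(List<String> numbers) {
        Map<String, List<String>> md = new LinkedHashMap<>();
        for (String number : numbers) {
            md.computeIfAbsent(PHONE_FIELD, k -> new ArrayList<>()).add(number);
        }
        return md;
    }

    public static void main(String[] args) {
        List<String> found = Arrays.asList("555-0100", "555-0199");
        System.out.println("entries, one per number: " + onePerNumber(found).size());
        System.out.println("entries, multi-valued:   " + multiValued(found).size());
        System.out.println("values in single field:  " + multiValued(found).get(PHONE_FIELD));
    }
}
```

With the multi-valued layout a consumer reads one known field and iterates over its values, rather than scanning for every entry that happens to hold a phone number.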



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
