tika-dev mailing list archives

From "Hudson (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (TIKA-2762) Capture short fields (<150 chars) in EnviParserHeader Metadata
Date Wed, 31 Oct 2018 04:10:00 GMT

    [ https://issues.apache.org/jira/browse/TIKA-2762?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16669574#comment-16669574 ]

Hudson commented on TIKA-2762:
------------------------------

FAILURE: Integrated in Jenkins build tika-2.x-windows #339 (See [https://builds.apache.org/job/tika-2.x-windows/339/])
TIKA-2762 Capture short fields (<150 chars) in EnviParserHeader Metadata (lewis.mcgibbney:
rev 68573d1a17315d134de6bec13666e02f3ec2aa45)
* (edit) tika-parsers/src/main/java/org/apache/tika/parser/envi/EnviHeaderParser.java
TIKA-2762 Capture short fields (<150 chars) in EnviParserHeader Metadata (lewis.mcgibbney:
rev d6eb8b9fbf6eb6f1198c5b6f54931b065519bf79)
* (edit) tika-core/src/main/java/org/apache/tika/config/TikaConfig.java
* (edit) tika-core/src/test/java/org/apache/tika/detect/MagicDetectorTest.java
* (edit) tika-core/src/test/java/org/apache/tika/detect/NameDetectorTest.java
* (add) tika-core/src/test/resources/test-documents/ang20150420t182050_corr_v1e_img.hdr
* (edit) tika-parsers/src/main/java/org/apache/tika/parser/envi/EnviHeaderParser.java


> Capture short fields (<150 chars) in EnviParserHeader Metadata
> --------------------------------------------------------------
>
>                 Key: TIKA-2762
>                 URL: https://issues.apache.org/jira/browse/TIKA-2762
>             Project: Tika
>          Issue Type: Improvement
>          Components: parser
>    Affects Versions: 1.19.1
>            Reporter: Lewis John McGibbney
>            Assignee: Lewis John McGibbney
>            Priority: Major
>             Fix For: 1.20
>
>
> I have always wanted to capture more metadata for ENVI header files. Right now everything
> is shoved into the record's content, and I think we could improve on that.
> I've implemented a rudimentary parser improvement which essentially captures any reasonably
> sized line items (<150 chars). These can then be populated up to the Metadata level, making
> faceted search over ENVI .hdr documents a much easier task.
> PR coming up. 
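
The idea in the description above can be sketched roughly as follows. This is a minimal
illustration and not the code committed to EnviHeaderParser.java: the class name
EnviShortFieldSketch, the method captureShortFields, the constant MAX_LINE_LENGTH, and the
sample header values are all hypothetical. It assumes ENVI header entries take the form
"key = value" and stores them in a plain Map where a Tika parser would populate its
Metadata object instead.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class EnviShortFieldSketch {

    // Threshold taken from the issue title; the committed parser may apply it differently.
    public static final int MAX_LINE_LENGTH = 150;

    /**
     * Collects "key = value" pairs from header lines shorter than MAX_LINE_LENGTH.
     * Lines without '=' (such as the leading "ENVI" marker), lines at or over the
     * limit, and lines that only open a multi-line "{ ..." block are skipped.
     */
    public static Map<String, String> captureShortFields(String header) {
        Map<String, String> fields = new LinkedHashMap<>();
        for (String line : header.split("\\R")) {
            int eq = line.indexOf('=');
            if (line.length() >= MAX_LINE_LENGTH || eq <= 0) {
                continue;
            }
            String key = line.substring(0, eq).trim();
            String value = line.substring(eq + 1).trim();
            // A value that opens a brace without closing it continues on later lines.
            boolean opensBlock = value.startsWith("{") && !value.endsWith("}");
            if (!key.isEmpty() && !value.isEmpty() && !opensBlock) {
                fields.put(key, value);
            }
        }
        return fields;
    }

    public static void main(String[] args) {
        // Illustrative header content only, not taken from the test document.
        String header = "ENVI\nsamples = 739\nlines = 3973\ninterleave = bil\n";
        captureShortFields(header).forEach((k, v) -> System.out.println(k + ": " + v));
    }
}
```

Keeping each captured entry as its own key is what would make faceted search possible: a
search index can then filter on a field such as "interleave" rather than scanning the body text.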



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
