Date: Wed, 31 Jul 2019 20:17:00 +0000 (UTC)
From: "Hudson (JIRA)"
To: dev@tika.apache.org
Reply-To: dev@tika.apache.org
Subject: [jira] [Commented] (TIKA-2917) Extract metadata from inline images in PDFs

    [ https://issues.apache.org/jira/browse/TIKA-2917?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16897492#comment-16897492 ]

Hudson commented on TIKA-2917:
------------------------------

SUCCESS: Integrated in Jenkins build tika-branch-1x #228 (See [https://builds.apache.org/job/tika-branch-1x/228/])
TIKA-2917 -- extract metadata that accompanies inline images (tallison: [https://github.com/apache/tika/commit/fd0eeb93a254de9320d04775f492287a716f5e92])
* (edit) tika-parsers/src/main/java/org/apache/tika/parser/pdf/PDFParser.java
* (add) tika-parsers/src/main/java/org/apache/tika/parser/pdf/PDMetadataExtractor.java
* (edit) tika-parsers/src/main/java/org/apache/tika/parser/image/xmp/JempboxExtractor.java
* (edit) tika-parsers/src/main/java/org/apache/tika/parser/pdf/AbstractPDF2XHTML.java
* (edit) tika-parsers/src/main/java/org/apache/tika/parser/pdf/PDF2XHTML.java


> Extract metadata from inline images in PDFs
> -------------------------------------------
>
>                 Key: TIKA-2917
>                 URL: https://issues.apache.org/jira/browse/TIKA-2917
>             Project: Tika
>          Issue Type: Improvement
>            Reporter: Tim Allison
>            Assignee: Tim Allison
>            Priority: Minor
>
> Inline images may have XMP associated with them. We are not currently extracting this metadata.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)