Date: Sun, 18 Oct 2015 19:44:07 +0000 (UTC)
From: "Chris A. Mattmann (JIRA)"
To: dev@tika.apache.org
Reply-To: dev@tika.apache.org
Subject: [jira] [Updated] (TIKA-819) Make Option to Exclude Embedded Files' Text for Text Content

     [ https://issues.apache.org/jira/browse/TIKA-819?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chris A. Mattmann updated TIKA-819:
-----------------------------------
    Fix Version/s:     (was: 1.11)
                   1.12

> Make Option to Exclude Embedded Files' Text for Text Content
> ------------------------------------------------------------
>
>                 Key: TIKA-819
>                 URL: https://issues.apache.org/jira/browse/TIKA-819
>             Project: Tika
>          Issue Type: New Feature
>          Components: general
>    Affects Versions: 1.0
>         Environment: Windows-7 + JDK 1.6 u26
>            Reporter: Albert L.
>             Fix For: 1.12
>
>
> It would be nice to be able to disable text content from embedded files.
> For example, if I have a DOCX with an embedded PPTX, then I would like the option to disable text from the PPTX from showing up when asking for the text content from DOCX. In other words, it would be nice to have the option to get text content *only* from the DOCX instead of the DOCX+PPTX.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
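
To make the requested behaviour concrete, here is a minimal, self-contained sketch of what such an option might look like. It does not use Tika's API: the class EmbeddedTextSketch, the Doc type, and the includeEmbedded flag are all hypothetical names chosen for illustration, and the "documents" are plain in-memory objects rather than real DOCX or PPTX files.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class EmbeddedTextSketch {

    /** Hypothetical stand-in for a parsed document; this is not a Tika class. */
    static final class Doc {
        final String name;
        final String text;
        final List<Doc> embedded;

        Doc(String name, String text, List<Doc> embedded) {
            this.name = name;
            this.text = text;
            this.embedded = embedded;
        }
    }

    /**
     * Returns the text of the container document. Text of embedded documents
     * is appended only when includeEmbedded (a hypothetical option) is true.
     */
    static String extractText(Doc doc, boolean includeEmbedded) {
        List<String> parts = new ArrayList<>();
        collect(doc, includeEmbedded, parts);
        return String.join("\n", parts);
    }

    private static void collect(Doc doc, boolean includeEmbedded, List<String> parts) {
        // The container's own text is always included.
        parts.add(doc.text);
        // Descend into embedded documents only when the option is enabled.
        if (includeEmbedded) {
            for (Doc child : doc.embedded) {
                collect(child, true, parts);
            }
        }
    }

    public static void main(String[] args) {
        Doc pptx = new Doc("slides.pptx", "Slide text", Collections.emptyList());
        Doc docx = new Doc("report.docx", "Report text", Collections.singletonList(pptx));

        // DOCX+PPTX: prints "Report text" and then "Slide text" on separate lines.
        System.out.println(extractText(docx, true));
        // DOCX only: prints "Report text".
        System.out.println(extractText(docx, false));
    }
}
```

With the flag set to false, only the DOCX's own text is returned, which is the "text content *only* from the DOCX" case described in the issue. How an actual option would be exposed in Tika is left open here.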