tika-dev mailing list archives

From "Sergiy Shyrkov (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (TIKA-2812) NPE when parsing text with write limit set on IBM JDK
Date Fri, 11 Jan 2019 11:23:00 GMT

     [ https://issues.apache.org/jira/browse/TIKA-2812?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sergiy Shyrkov updated TIKA-2812:
---------------------------------
    Description: 
We have updated Tika from version 1.14 to the recently released 1.20 and are now experiencing
an issue when parsing text with a write limit set (we are using {{WriteOutContentHandler}})
on IBM JDK 8.

Test class [^TikaTest.java] and test file [^test.txt] are attached.

The issue is present on IBM JDK 8 [^output-ibm-jdk-tika-1.20.txt], but not on Oracle JDK 8 [^output-oracle-jdk-tika-1.20.txt]
or OpenJDK 8 [^output-open-jdk-tika-1.20.txt].

With Tika 1.14 we did not have this issue [^output-ibm-jdk-tika-1.14.txt].

Analysis:
With the fix in TIKA-2668 ([https://github.com/apache/tika/commit/89a588e4d8d2aa44a9d3c965d514c18c7d3c134d#diff-5a28529cf32968d35a5036172cd8f74fL41]), a
line was removed from the constructor of the {{TaggedSAXException}} class:
{code:java}
initCause(original); // SAXException has it's own chaining mechanism!
{code}
Bringing the line back solves our issue on IBM JDK 8, but breaks things on JDK 11 [^output-oracle-jdk-11-tika-1.20.txt].
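
One possible way to reconcile the two behaviors is sketched below. It assumes that the JDK 11 breakage comes from {{initCause}} throwing {{IllegalStateException}} because the {{SAXException}} constructor has already set the cause there, which the attached output would need to confirm. The class name {{TaggedSaxExceptionSketch}} and its {{Object}} tag are hypothetical stand-ins, not the actual Tika class or the fix adopted by the project.

```java
import org.xml.sax.SAXException;

// Hypothetical stand-in for Tika's TaggedSAXException; not the project's code.
class TaggedSaxExceptionSketch extends SAXException {

    private static final long serialVersionUID = 1L;

    // Marker identifying the component that raised the exception (assumed shape).
    private final Object tag;

    TaggedSaxExceptionSketch(SAXException original, Object tag) {
        super(original.getMessage(), original);
        this.tag = tag;
        try {
            // Sets Throwable's cause on runtimes where the SAXException
            // constructor left it uninitialized.
            initCause(original);
        } catch (IllegalStateException causeAlreadySet) {
            // The superclass constructor already initialized the cause,
            // so there is nothing left to do.
        }
    }

    Object getTag() {
        return tag;
    }

    @Override
    public SAXException getCause() {
        // Whichever chaining mechanism the runtime uses, the wrapped
        // exception is the SAXException passed to the constructor.
        return (SAXException) super.getCause();
    }
}
```

With this shape, {{getCause()}} should return the wrapped exception whether the runtime's {{SAXException}} chains through its own field or through {{Throwable}}, at the cost of swallowing one expected {{IllegalStateException}} in the constructor.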

Is there any chance the class {{TaggedSAXException}} can be made compatible with both JDK 8 and
JDK 11 (Oracle/OpenJDK as well as IBM)?

Thank you in advance!

Kind regards
 Sergiy Shyrkov

  was:
We have updated Tika from version 1.14 to the recently released 1.20 and are now experiencing
an issue when parsing text with a write limit set (we are using {{WriteOutContentHandler}})
on IBM JDK 8.

Test class [^TikaTest.java] and test file [^test.txt] are attached.

The issue is present on IBM JDK 8 [^output-ibm-jdk-tika-1.20.txt], but not on Oracle JDK 8 [^output-oracle-jdk-tika-1.20.txt]
or OpenJDK 8 [^output-open-jdk-tika-1.20.txt].

With Tika 1.14 we did not have this issue [^output-ibm-jdk-tika-1.14.txt].

Analysis:
With the fix in TIKA-2668 ([https://github.com/apache/tika/commit/89a588e4d8d2aa44a9d3c965d514c18c7d3c134d#diff-5a28529cf32968d35a5036172cd8f74fL41]), a
line was removed from the constructor of the {{TaggedSAXException}} class:
{code}
initCause(original); // SAXException has it's own chaining mechanism!
{code}

Bringing the line back solves our issue on IBM JDK 8, but breaks things on JDK 11.

Is there any chance the class {{TaggedSAXException}} can be made compatible with both JDK 8 and
JDK 11 (Oracle/OpenJDK as well as IBM)?

Thank you in advance!

Kind regards
Sergiy Shyrkov


> NPE when parsing text with write limit set on IBM JDK
> -----------------------------------------------------
>
>                 Key: TIKA-2812
>                 URL: https://issues.apache.org/jira/browse/TIKA-2812
>             Project: Tika
>          Issue Type: Bug
>          Components: core
>    Affects Versions: 1.20
>         Environment: IBM JDK 8
>            Reporter: Sergiy Shyrkov
>            Priority: Major
>              Labels: regression
>         Attachments: TikaTest.java, output-ibm-jdk-tika-1.14.txt, output-ibm-jdk-tika-1.20.txt, output-open-jdk-tika-1.20.txt, output-oracle-jdk-11-tika-1.20.txt, output-oracle-jdk-tika-1.20.txt, test.txt



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
