tika-dev mailing list archives

From Oleg Tikhonov <olegtikho...@gmail.com>
Subject Re: [VOTE] Release Apache Tika 1.15 Candidate #1
Date Wed, 24 May 2017 19:25:50 GMT
I can no longer reproduce it after applying some workarounds.



On Wed, May 24, 2017 at 3:05 AM, Allison, Timothy B. <tallison@mitre.org>
wrote:

> Hi Oleg,
>   What's your error on that unit test?
>
> -----Original Message-----
> From: olegtikhonov@gmail.com [mailto:olegtikhonov@gmail.com] On Behalf Of
> Oleg Tikhonov
> Sent: Tuesday, May 23, 2017 4:33 PM
> To: dev@tika.apache.org
> Subject: Re: [VOTE] Release Apache Tika 1.15 Candidate #1
>
> I also marked
> ./tika-dl/src/test/java/org/apache/tika/dl/imagerec/DL4JInceptionV3NetTest.java
> with @Ignore because I do not have any DL installed on my computer.
>
>
> On Tue, May 23, 2017 at 11:00 PM, Oleg Tikhonov <oleg@apache.org> wrote:
>
> > Hi guys,
> > Something is wrong here:
> > <parent>
> >     <groupId>org.apache.tika</groupId>
> >     <artifactId>tika-parent</artifactId>
> >     <version>1.16-SNAPSHOT</version>
> >     <relativePath>tika-parent/pom.xml</relativePath>
> > </parent>
> >
> >
> > If you clone the project, the top-level pom contains this.
> > The fix is to change 1.16-SNAPSHOT to 1.15.
> >
> > What I did was:
> > git clone https://github.com/apache/tika.git
> >
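The version mismatch described above can be checked without opening the file by hand. The following is a minimal sketch, not part of the Tika build: the class name `ParentVersionCheck` and the helper `parentVersion` are hypothetical, and the sample POM string is a trimmed copy of the fragment quoted in this message. It uses only the JDK's built-in DOM parser.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper, for illustration only.
public class ParentVersionCheck {

    /** Returns the text of the first <parent>/<version> element, or null if there is none. */
    public static String parentVersion(String pomXml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(pomXml.getBytes(StandardCharsets.UTF_8)));
            NodeList parents = doc.getElementsByTagName("parent");
            if (parents.getLength() == 0) {
                return null;
            }
            Element parent = (Element) parents.item(0);
            NodeList versions = parent.getElementsByTagName("version");
            return versions.getLength() == 0
                    ? null
                    : versions.item(0).getTextContent().trim();
        } catch (Exception e) {
            // Wrap parser and I/O failures so callers need no checked-exception handling.
            throw new IllegalArgumentException("Could not parse POM", e);
        }
    }

    public static void main(String[] args) {
        // Trimmed copy of the fragment quoted in the thread.
        String pom = "<project><parent>"
                + "<groupId>org.apache.tika</groupId>"
                + "<artifactId>tika-parent</artifactId>"
                + "<version>1.16-SNAPSHOT</version>"
                + "<relativePath>tika-parent/pom.xml</relativePath>"
                + "</parent></project>";
        String expected = "1.15"; // the version under vote in this thread
        String found = parentVersion(pom);
        System.out.println("parent version: " + found
                + (expected.equals(found) ? " (matches)" : " (expected " + expected + ")"));
    }
}
```

In practice the POM text would be read from the cloned checkout rather than from a string literal.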
> > Any suggestions?
> >
> > BR,
> > Oleg
> >
> >
> >
> >
> > On Tue, May 23, 2017 at 3:01 PM, Allison, Timothy B.
> > <tallison@mitre.org>
> > wrote:
> >
> >> I _think_ it is included.  See below for the two options for parsing
> >> testZipEncrypted.zip.
> >>
> >> Are you not seeing this behavior?  Were you expecting different behavior?
> >>
> >>
> >> 1) RecursiveParserWrapper
> >>
> >>         List<Metadata> metadataList = getRecursiveMetadata("testZipEncrypted.zip");
> >>         debug(metadataList);
> >>
> >> yields:
> >>
> >> 0: X-Parsed-By : org.apache.tika.parser.DefaultParser
> >> 0: X-Parsed-By : org.apache.tika.parser.pkg.PackageParser
> >> 0: X-TIKA:EXCEPTION:embedded_stream_exception : org.apache.tika.exception.EncryptedDocumentException: stream (encrypted.txt) is encrypted
> >>         at org.apache.tika.parser.pkg.PackageParser.parseEntry(PackageParser.java:306)
> >>         at org.apache.tika.parser.pkg.PackageParser.parse(PackageParser.java:230)
> >>         at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:280)
> >>         at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:280)
> >>         at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:135)
> >>         at org.apache.tika.parser.RecursiveParserWrapper.parse(RecursiveParserWrapper.java:158)
> >>         at org.apache.tika.TikaTest.getRecursiveMetadata(TikaTest.java:221)
> >>         at org.apache.tika.TikaTest.getRecursiveMetadata(TikaTest.java:213)
> >>         at org.apache.tika.parser.pkg.ZipParserTest.testZipEncrypted(ZipParserTest.java:213)
> >>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> >>         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> >>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> >>         at java.lang.reflect.Method.invoke(Method.java:498)
> >>         at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
> >>         at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
> >>         at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
> >>         at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
> >>         at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
> >>         at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
> >>         at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
> >>         at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
> >>         at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
> >>         at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
> >>         at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
> >>         at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
> >>         at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
> >>         at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
> >>         at org.junit.runner.JUnitCore.run(JUnitCore.java:137)
> >>         at com.intellij.junit4.JUnit4IdeaTestRunner.startRunnerWithArgs(JUnit4IdeaTestRunner.java:68)
> >>         at com.intellij.rt.execution.junit.IdeaTestRunner$Repeater.startRunnerWithArgs(IdeaTestRunner.java:51)
> >>         at com.intellij.rt.execution.junit.JUnitStarter.prepareStreamsAndStart(JUnitStarter.java:242)
> >>         at com.intellij.rt.execution.junit.JUnitStarter.main(JUnitStarter.java:70)
> >>
> >> 0: X-TIKA:parse_time_millis : 34
> >> 0: X-TIKA:content : <html xmlns="http://www.w3.org/1999/xhtml">
> >> <head>
> >> <meta name="X-Parsed-By" content="org.apache.tika.parser.DefaultParser" />
> >> <meta name="X-Parsed-By" content="org.apache.tika.parser.pkg.PackageParser" />
> >> <meta name="Content-Type" content="application/zip" />
> >> <title></title> </head> <body><div class="embedded"
> >> id="unencrypted.txt" /> <div
> >> class="package-entry"><h1>unencrypted.txt</h1>
> >> </div>
> >> <p>encrypted.txt</p>
> >> </body></html>
> >> 0: Content-Type : application/zip
> >> 1: date : 2017-03-21T13:07:48Z
> >> 1: X-Parsed-By : org.apache.tika.parser.DefaultParser
> >> 1: X-Parsed-By : org.apache.tika.parser.txt.TXTParser
> >> 1: resourceName : unencrypted.txt
> >> 1: dcterms:modified : 2017-03-21T13:07:48Z
> >> 1: Last-Modified : 2017-03-21T13:07:48Z
> >> 1: Last-Save-Date : 2017-03-21T13:07:48Z
> >> 1: embeddedRelationshipId : unencrypted.txt
> >> 1: meta:save-date : 2017-03-21T13:07:48Z
> >> 1: Content-Encoding : windows-1252
> >> 1: X-TIKA:parse_time_millis : 3
> >> 1: modified : 2017-03-21T13:07:48Z
> >> 1: X-TIKA:content : <html xmlns="http://www.w3.org/1999/xhtml">
> >> <head>
> >> <meta name="date" content="2017-03-21T13:07:48Z" /> <meta
> >> name="X-Parsed-By" content="org.apache.tika.parser.DefaultParser"
> >> />
> >> <meta name="X-Parsed-By" content="org.apache.tika.parser.txt.TXTParser"
> >> />
> >> <meta name="resourceName" content="unencrypted.txt" /> <meta
> >> name="dcterms:modified" content="2017-03-21T13:07:48Z" /> <meta
> >> name="Last-Modified" content="2017-03-21T13:07:48Z" /> <meta
> >> name="Last-Save-Date" content="2017-03-21T13:07:48Z" /> <meta
> >> name="embeddedRelationshipId" content="unencrypted.txt" /> <meta
> >> name="meta:save-date" content="2017-03-21T13:07:48Z" /> <meta
> >> name="Content-Encoding" content="windows-1252" /> <meta
> >> name="modified" content="2017-03-21T13:07:48Z" /> <meta
> >> name="Content-Length" content="13" /> <meta
> >> name="X-TIKA:embedded_resource_path" content="/unencrypted.txt" />
> >> <meta name="Content-Type" content="text/plain; charset=windows-1252"
> >> /> <title></title> </head> <body><p>hello world
> >> </p> </body></html>
> >> 1: Content-Length : 13
> >> 1: X-TIKA:embedded_resource_path : /unencrypted.txt
> >> 1: Content-Type : text/plain; charset=windows-1252
> >>
> >> 2) Classic XML:
> >>
> >>         XMLResult r = getXML("testZipEncrypted.zip");
> >>         for (String n : r.metadata.names()) {
> >>             for (String v : r.metadata.getValues(n)) {
> >>                 System.out.println("meta: "+n + " : "+v);
> >>             }
> >>         }
> >>         System.out.println(r.xml);
> >>
> >> Yields:
> >> meta: X-Parsed-By : org.apache.tika.parser.DefaultParser
> >> meta: X-Parsed-By : org.apache.tika.parser.pkg.PackageParser
> >> meta: X-TIKA:EXCEPTION:embedded_stream_exception : org.apache.tika.exception.EncryptedDocumentException: stream (encrypted.txt) is encrypted
> >>         at org.apache.tika.parser.pkg.PackageParser.parseEntry(PackageParser.java:306)
> >>         at org.apache.tika.parser.pkg.PackageParser.parse(PackageParser.java:230)
> >>         at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:280)
> >>         at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:280)
> >>         at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:135)
> >>         at org.apache.tika.TikaTest.getXML(TikaTest.java:205)
> >>         at org.apache.tika.TikaTest.getXML(TikaTest.java:191)
> >>         at org.apache.tika.parser.pkg.ZipParserTest.testZipEncrypted(ZipParserTest.java:206)
> >>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> >>         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> >>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> >>         at java.lang.reflect.Method.invoke(Method.java:498)
> >>         at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
> >>         at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
> >>         at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
> >>         at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
> >>         at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
> >>         at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
> >>         at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
> >>         at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
> >>         at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
> >>         at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
> >>         at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
> >>         at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
> >>         at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
> >>         at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
> >>         at org.junit.runner.JUnitCore.run(JUnitCore.java:137)
> >>         at com.intellij.junit4.JUnit4IdeaTestRunner.startRunnerWithArgs(JUnit4IdeaTestRunner.java:68)
> >>         at com.intellij.rt.execution.junit.IdeaTestRunner$Repeater.startRunnerWithArgs(IdeaTestRunner.java:51)
> >>         at com.intellij.rt.execution.junit.JUnitStarter.prepareStreamsAndStart(JUnitStarter.java:242)
> >>         at com.intellij.rt.execution.junit.JUnitStarter.main(JUnitStarter.java:70)
> >>
> >> meta: Content-Type : application/zip
> >> <html xmlns="http://www.w3.org/1999/xhtml">
> >> <head>
> >> <meta name="X-Parsed-By" content="org.apache.tika.parser.DefaultParser" />
> >> <meta name="X-Parsed-By" content="org.apache.tika.parser.pkg.PackageParser" />
> >> <meta name="Content-Type" content="application/zip" />
> >> <title></title> </head> <body><div class="embedded"
> >> id="unencrypted.txt" /> <div
> >> class="package-entry"><h1>unencrypted.txt</h1>
> >> <p>hello world
> >> </p>
> >>
> >> </div>
> >> <p>encrypted.txt</p>
> >> </body></html>
> >>
> >> -----Original Message-----
> >> From: Aeham Abushwashi [mailto:aeham.abushwashi@exonar.com]
> >> Sent: Tuesday, May 23, 2017 3:47 AM
> >> To: user@tika.apache.org; Tim Allison <tallison@apache.org>
> >> Cc: dev@tika.apache.org
> >> Subject: Re: [VOTE] Release Apache Tika 1.15 Candidate #1
> >>
> >> Thanks Tim and apologies if this isn't the right thread to ask this
> >> question... any reason TIKA-2300 is not included despite
> >> FixVersions=1.15 on the ticket?
> >>
> >> On 22 May 2017 at 20:25, Tim Allison <tallison@apache.org> wrote:
> >>
> >> > A candidate for the Tika 1.15 release is available at:
> >> > https://dist.apache.org/repos/dist/dev/tika/
> >> >
> >> > The release candidate is a zip archive of the sources in:
> >> > https://github.com/apache/tika/tree/1.15-rc1
> >> >
> >> > The SHA1 checksum of the archive is
> >> > e82697a6804373367fbba98d47426ab74e036eb1.
> >> >
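One way to check a downloaded archive against the SHA1 checksum quoted above is sketched below. This is an illustration using only the JDK's `java.security.MessageDigest`; the class name `Sha1Check` and its helpers are hypothetical, and the archive path is taken from the command line because the thread does not name the file.

```java
import java.nio.file.Files;
import java.nio.file.Paths;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper, for illustration only.
public class Sha1Check {

    /** Returns the SHA-1 digest of the given bytes as lowercase hex. */
    public static String sha1Hex(byte[] data) {
        try {
            byte[] digest = MessageDigest.getInstance("SHA-1").digest(data);
            StringBuilder sb = new StringBuilder(digest.length * 2);
            for (byte b : digest) {
                sb.append(String.format("%02x", b));
            }
            return sb.toString();
        } catch (NoSuchAlgorithmException e) {
            // SHA-1 is a required algorithm on every Java platform.
            throw new IllegalStateException(e);
        }
    }

    /** Compares the digest of the data with an expected hex string, ignoring case and padding. */
    public static boolean matches(byte[] data, String expectedHex) {
        return sha1Hex(data).equalsIgnoreCase(expectedHex.trim());
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 2) {
            System.err.println("usage: java Sha1Check <archive> <expected-sha1>");
            return;
        }
        // Reads the whole file into memory, which is fine for a source archive.
        byte[] bytes = Files.readAllBytes(Paths.get(args[0]));
        System.out.println(matches(bytes, args[1]) ? "checksum OK" : "checksum MISMATCH");
    }
}
```

The second argument would be the checksum from the announcement, e82697a6804373367fbba98d47426ab74e036eb1.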
> >> > In addition, a staged maven repository is available here:
> >> > https://repository.apache.org/content/repositories/orgapachetika-1022
> >> >
> >> > Please vote on releasing this package as Apache Tika 1.15.
> >> > The vote is open for the next 72 hours and passes if a majority of
> >> > at least three +1 Tika PMC votes are cast.
> >> >
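The passing rule stated above can be written down as a small predicate. This sketch assumes one reading of the sentence, namely that the vote passes when at least three binding +1 votes are cast and the +1 votes outnumber the -1 votes; the class name `VoteTally` and the method `passes` are hypothetical.

```java
// Hypothetical helper, for illustration only.
public class VoteTally {

    /**
     * Assumed reading of the rule: at least three binding +1 votes,
     * and more +1 votes than -1 votes.
     */
    public static boolean passes(int plusOnes, int minusOnes) {
        return plusOnes >= 3 && plusOnes > minusOnes;
    }

    public static void main(String[] args) {
        System.out.println("3 x +1, 0 x -1: " + passes(3, 0));
        System.out.println("2 x +1, 0 x -1: " + passes(2, 0));
        System.out.println("3 x +1, 3 x -1: " + passes(3, 3));
    }
}
```

Only PMC votes count toward the tally in the announcement, so the inputs here are meant to be binding votes.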
> >> > [ ] +1 Release this package as Apache Tika 1.15
> >> > [ ] -1 Do not release this package because...
> >> >
> >> > ***This is my first time as release manager.  Please kick the tires
> >> > thoroughly.***
> >> >
> >> > This is my +1.
> >> >
> >> > Cheers,
> >> >
> >> > Tim
> >> >
> >>
> >>
> >>
> >> --
> >> Aeham Abushwashi
> >> Head of Engineering
> >> Exonar
> >>
> >>
> >
> >
>
