tika-dev mailing list archives

From "Allison, Timothy B." <talli...@mitre.org>
Subject RE: [VOTE] Release Apache Tika 1.15 Candidate #1
Date Tue, 23 May 2017 23:53:35 GMT
Ugh.  Thank you!  Will re-spin for RC2 shortly.

-----Original Message-----
From: olegtikhonov@gmail.com [mailto:olegtikhonov@gmail.com] On Behalf Of Oleg Tikhonov
Sent: Tuesday, May 23, 2017 4:00 PM
To: dev@tika.apache.org
Subject: Re: [VOTE] Release Apache Tika 1.15 Candidate #1

Hi guys,
This looks wrong:
<parent>
  <groupId>org.apache.tika</groupId>
  <artifactId>tika-parent</artifactId>
  <version>1.16-SNAPSHOT</version>
  <relativePath>tika-parent/pom.xml</relativePath>
</parent>


If you clone the project, the top-level pom contains this.
The fix is to change 1.16-SNAPSHOT to 1.15.
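
A check like this could be automated before cutting a release candidate. The following is only a minimal sketch using the JDK's built-in XML parser; the class name ParentVersionCheck and its helper methods are hypothetical and are not part of Tika or Maven. It reads the first <parent> element of a POM and reports whether its version still ends in -SNAPSHOT.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

// Hypothetical helper, not part of Tika or Maven.
public class ParentVersionCheck {

    /** Returns the <version> of the first <parent> element, or null if there is none. */
    public static String parentVersion(String pomXml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(pomXml.getBytes(StandardCharsets.UTF_8)));
        NodeList parents = doc.getElementsByTagName("parent");
        if (parents.getLength() == 0) {
            return null;
        }
        // Only look at direct children of <parent>.
        NodeList children = parents.item(0).getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node child = children.item(i);
            if (child.getNodeType() == Node.ELEMENT_NODE
                    && "version".equals(child.getNodeName())) {
                return child.getTextContent().trim();
            }
        }
        return null;
    }

    /** True if the version string is a Maven snapshot version. */
    public static boolean isSnapshot(String version) {
        return version != null && version.endsWith("-SNAPSHOT");
    }

    public static void main(String[] args) throws Exception {
        // Inline copy of the <parent> block quoted in this message.
        String pom = "<project><parent>"
                + "<groupId>org.apache.tika</groupId>"
                + "<artifactId>tika-parent</artifactId>"
                + "<version>1.16-SNAPSHOT</version>"
                + "<relativePath>tika-parent/pom.xml</relativePath>"
                + "</parent></project>";
        String version = parentVersion(pom);
        System.out.println("parent version: " + version
                + ", snapshot: " + isSnapshot(version));
    }
}
```

In a real check the POM text would be read from the cloned pom.xml rather than from an inline string.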

What I did was:
git clone https://github.com/apache/tika.git

Any suggestions?

BR,
Oleg




On Tue, May 23, 2017 at 3:01 PM, Allison, Timothy B. <tallison@mitre.org> wrote:

> I _think_ it is included.  See below for the two options for parsing 
> testZipEncrypted.zip.
>
> Are you not seeing this behavior?  Were you expecting different behavior?
>
>
> 1) RecursiveParserWrapper
>
>         List<Metadata> metadataList = getRecursiveMetadata("testZipEncrypted.zip");
>         debug(metadataList);
>
> yields:
>
> 0: X-Parsed-By : org.apache.tika.parser.DefaultParser
> 0: X-Parsed-By : org.apache.tika.parser.pkg.PackageParser
> 0: X-TIKA:EXCEPTION:embedded_stream_exception : org.apache.tika.exception.EncryptedDocumentException: stream (encrypted.txt) is encrypted
>         at org.apache.tika.parser.pkg.PackageParser.parseEntry(PackageParser.java:306)
>         at org.apache.tika.parser.pkg.PackageParser.parse(PackageParser.java:230)
>         at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:280)
>         at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:280)
>         at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:135)
>         at org.apache.tika.parser.RecursiveParserWrapper.parse(RecursiveParserWrapper.java:158)
>         at org.apache.tika.TikaTest.getRecursiveMetadata(TikaTest.java:221)
>         at org.apache.tika.TikaTest.getRecursiveMetadata(TikaTest.java:213)
>         at org.apache.tika.parser.pkg.ZipParserTest.testZipEncrypted(ZipParserTest.java:213)
>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>         at java.lang.reflect.Method.invoke(Method.java:498)
>         at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
>         at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>         at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
>         at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>         at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>         at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
>         at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
>         at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
>         at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
>         at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
>         at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
>         at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
>         at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
>         at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
>         at org.junit.runner.JUnitCore.run(JUnitCore.java:137)
>         at com.intellij.junit4.JUnit4IdeaTestRunner.startRunnerWithArgs(JUnit4IdeaTestRunner.java:68)
>         at com.intellij.rt.execution.junit.IdeaTestRunner$Repeater.startRunnerWithArgs(IdeaTestRunner.java:51)
>         at com.intellij.rt.execution.junit.JUnitStarter.prepareStreamsAndStart(JUnitStarter.java:242)
>         at com.intellij.rt.execution.junit.JUnitStarter.main(JUnitStarter.java:70)
>
> 0: X-TIKA:parse_time_millis : 34
> 0: X-TIKA:content : <html xmlns="http://www.w3.org/1999/xhtml">
> <head>
> <meta name="X-Parsed-By" content="org.apache.tika.parser.DefaultParser" />
> <meta name="X-Parsed-By" content="org.apache.tika.parser.pkg.PackageParser" />
> <meta name="Content-Type" content="application/zip" />
> <title></title>
> </head>
> <body><div class="embedded" id="unencrypted.txt" />
> <div class="package-entry"><h1>unencrypted.txt</h1>
> </div>
> <p>encrypted.txt</p>
> </body></html>
> 0: Content-Type : application/zip
> 1: date : 2017-03-21T13:07:48Z
> 1: X-Parsed-By : org.apache.tika.parser.DefaultParser
> 1: X-Parsed-By : org.apache.tika.parser.txt.TXTParser
> 1: resourceName : unencrypted.txt
> 1: dcterms:modified : 2017-03-21T13:07:48Z
> 1: Last-Modified : 2017-03-21T13:07:48Z
> 1: Last-Save-Date : 2017-03-21T13:07:48Z
> 1: embeddedRelationshipId : unencrypted.txt
> 1: meta:save-date : 2017-03-21T13:07:48Z
> 1: Content-Encoding : windows-1252
> 1: X-TIKA:parse_time_millis : 3
> 1: modified : 2017-03-21T13:07:48Z
> 1: X-TIKA:content : <html xmlns="http://www.w3.org/1999/xhtml">
> <head>
> <meta name="date" content="2017-03-21T13:07:48Z" />
> <meta name="X-Parsed-By" content="org.apache.tika.parser.DefaultParser" />
> <meta name="X-Parsed-By" content="org.apache.tika.parser.txt.TXTParser" />
> <meta name="resourceName" content="unencrypted.txt" />
> <meta name="dcterms:modified" content="2017-03-21T13:07:48Z" />
> <meta name="Last-Modified" content="2017-03-21T13:07:48Z" />
> <meta name="Last-Save-Date" content="2017-03-21T13:07:48Z" />
> <meta name="embeddedRelationshipId" content="unencrypted.txt" />
> <meta name="meta:save-date" content="2017-03-21T13:07:48Z" />
> <meta name="Content-Encoding" content="windows-1252" />
> <meta name="modified" content="2017-03-21T13:07:48Z" />
> <meta name="Content-Length" content="13" />
> <meta name="X-TIKA:embedded_resource_path" content="/unencrypted.txt" />
> <meta name="Content-Type" content="text/plain; charset=windows-1252" />
> <title></title>
> </head>
> <body><p>hello world </p>
> </body></html>
> 1: Content-Length : 13
> 1: X-TIKA:embedded_resource_path : /unencrypted.txt
> 1: Content-Type : text/plain; charset=windows-1252
>
> 2) Classic XML:
>
>         XMLResult r = getXML("testZipEncrypted.zip");
>         for (String n : r.metadata.names()) {
>             for (String v : r.metadata.getValues(n)) {
>                 System.out.println("meta: "+n + " : "+v);
>             }
>         }
>         System.out.println(r.xml);
>
> Yields:
> meta: X-Parsed-By : org.apache.tika.parser.DefaultParser
> meta: X-Parsed-By : org.apache.tika.parser.pkg.PackageParser
> meta: X-TIKA:EXCEPTION:embedded_stream_exception : org.apache.tika.exception.EncryptedDocumentException: stream (encrypted.txt) is encrypted
>         at org.apache.tika.parser.pkg.PackageParser.parseEntry(PackageParser.java:306)
>         at org.apache.tika.parser.pkg.PackageParser.parse(PackageParser.java:230)
>         at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:280)
>         at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:280)
>         at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:135)
>         at org.apache.tika.TikaTest.getXML(TikaTest.java:205)
>         at org.apache.tika.TikaTest.getXML(TikaTest.java:191)
>         at org.apache.tika.parser.pkg.ZipParserTest.testZipEncrypted(ZipParserTest.java:206)
>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>         at java.lang.reflect.Method.invoke(Method.java:498)
>         at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
>         at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>         at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
>         at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>         at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>         at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
>         at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
>         at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
>         at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
>         at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
>         at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
>         at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
>         at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
>         at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
>         at org.junit.runner.JUnitCore.run(JUnitCore.java:137)
>         at com.intellij.junit4.JUnit4IdeaTestRunner.startRunnerWithArgs(JUnit4IdeaTestRunner.java:68)
>         at com.intellij.rt.execution.junit.IdeaTestRunner$Repeater.startRunnerWithArgs(IdeaTestRunner.java:51)
>         at com.intellij.rt.execution.junit.JUnitStarter.prepareStreamsAndStart(JUnitStarter.java:242)
>         at com.intellij.rt.execution.junit.JUnitStarter.main(JUnitStarter.java:70)
>
> meta: Content-Type : application/zip
> <html xmlns="http://www.w3.org/1999/xhtml">
> <head>
> <meta name="X-Parsed-By" content="org.apache.tika.parser.DefaultParser" />
> <meta name="X-Parsed-By" content="org.apache.tika.parser.pkg.PackageParser" />
> <meta name="Content-Type" content="application/zip" />
> <title></title>
> </head>
> <body><div class="embedded" id="unencrypted.txt" />
> <div class="package-entry"><h1>unencrypted.txt</h1>
> <p>hello world
> </p>
>
> </div>
> <p>encrypted.txt</p>
> </body></html>
>
> -----Original Message-----
> From: Aeham Abushwashi [mailto:aeham.abushwashi@exonar.com]
> Sent: Tuesday, May 23, 2017 3:47 AM
> To: user@tika.apache.org; Tim Allison <tallison@apache.org>
> Cc: dev@tika.apache.org
> Subject: Re: [VOTE] Release Apache Tika 1.15 Candidate #1
>
> Thanks Tim and apologies if this isn't the right thread to ask this 
> question... any reason TIKA-2300 is not included despite 
> FixVersions=1.15 on the ticket?
>
> On 22 May 2017 at 20:25, Tim Allison <tallison@apache.org> wrote:
>
> > A candidate for the Tika 1.15 release is available at:
> > https://dist.apache.org/repos/dist/dev/tika/
> >
> > The release candidate is a zip archive of the sources in:
> > https://github.com/apache/tika/tree/1.15-rc1
> >
> > The SHA1 checksum of the archive is
> > e82697a6804373367fbba98d47426ab74e036eb1.
> >
> > In addition, a staged maven repository is available here:
> > https://repository.apache.org/content/repositories/orgapachetika-1022
> >
> > Please vote on releasing this package as Apache Tika 1.15.
> > The vote is open for the next 72 hours and passes if a majority of 
> > at least three +1 Tika PMC votes are cast.
> >
> > [ ] +1 Release this package as Apache Tika 1.15
> > [ ] -1 Do not release this package because...
> >
> > ***This is my first time as release manager.  Please kick the tires
> > thoroughly.***
> >
> > This is my +1.
> >
> > Cheers,
> >
> > Tim
> >
>
>
>
> --
> Aeham Abushwashi
> Head of Engineering
> Exonar
>
> v: video.exonar.com  |  w: exonar.com <http://www.exonar.com/> | twitter:
> @exonar <https://twitter.com/exonar>
>
> GDPR: Why It’s About More Than Regulation: Download the White Paper
> Here <https://goo.gl/1cSVzH>
>
> Trial <https://www.exonar.com/platform/> the capability on your own 
> organisation's data to understand what you've got, where it is and who 
> has access to it.
>
>
> Come and meet us for a chat at Infosecurity Europe
> <http://www.infosecurityeurope.com/> on stand S07 in the Cyber Innovation Zone
> <http://www.infosecurityeurope.com/visit/whats-on/uk-cyber-innovation-zone/>
>
>
> Exonar Limited, registered in the UK, registration number 06439969 at 
> 14 West Mills, Newbury, Berkshire, RG14 5HG. DISCLAIMER: This email 
> and any attachments to it may be confidential or private. If you have 
> received it in error, please notify us and delete it from your system.
>
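
The release announcement quoted above gives a SHA1 checksum for the source archive. One way to verify a downloaded archive against that value is sketched below with the JDK's MessageDigest. The class name Sha1Check is hypothetical, and the archive file name is not given in the thread, so it is passed as a command-line argument rather than assumed.

```java
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.security.MessageDigest;

// Hypothetical helper, not part of Tika.
public class Sha1Check {

    /** Computes the SHA-1 digest of a stream and returns it as lowercase hex. */
    public static String sha1Hex(InputStream in) throws Exception {
        MessageDigest md = MessageDigest.getInstance("SHA-1");
        byte[] buf = new byte[8192];
        int n;
        while ((n = in.read(buf)) != -1) {
            md.update(buf, 0, n);
        }
        StringBuilder sb = new StringBuilder();
        for (byte b : md.digest()) {
            sb.append(String.format("%02x", b));
        }
        return sb.toString();
    }

    /** Compares the computed digest with the expected hex string, ignoring case. */
    public static boolean matches(InputStream in, String expectedHex) throws Exception {
        return sha1Hex(in).equalsIgnoreCase(expectedHex.trim());
    }

    public static void main(String[] args) throws Exception {
        if (args.length < 2) {
            System.err.println("usage: java Sha1Check <archive> <expected-sha1>");
            return;
        }
        try (InputStream in = Files.newInputStream(Paths.get(args[0]))) {
            System.out.println(matches(in, args[1]) ? "checksum OK" : "checksum MISMATCH");
        }
    }
}
```

The second argument would be the checksum from the announcement, e82697a6804373367fbba98d47426ab74e036eb1.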