tika-dev mailing list archives

From "Allison, Timothy B." <talli...@mitre.org>
Subject RE: NER Parser tests behind proxy?
Date Tue, 24 Nov 2015 14:07:53 GMT
Yes, you do, but you (or I) can set the proxy for Maven correctly, and without the NER requirement
the build works fine.

***WARNING, what I'm running into might very well just be user error in not telling Maven
to pass the proxy info to Groovy...this is why I didn't open an issue :) I've done some googling,
but haven't found an answer to this.***

In response to Thamme's questions:
>> Which is better?
>> 1. List 'access to opennlp.sourceforge.net' as a requirement  
I have access without a problem via regular means; the problem is that Maven isn't passing
proxy information into Groovy when it tries to make the call to get the model file (I confirmed
this by dumping system props within ModelGetter).  Perhaps we just document that, if you
are behind a proxy, you need to download the four model files manually and put them in the
right subdirectory (ugly solution, but would probably work)?
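
For illustration only, here is a minimal Java sketch of the kind of check described above. It reads the standard JVM networking properties (http.proxyHost and http.proxyPort) and builds a java.net.Proxy from them, falling back to a direct connection when they are unset. The class name ProxySketch and its method are hypothetical and are not taken from ModelGetter.groovy; whether Maven's proxy settings ever reach these properties inside the Groovy plugin is exactly the open question.

```java
import java.net.InetSocketAddress;
import java.net.Proxy;

// Hypothetical helper for illustration; not part of Tika or ModelGetter.groovy.
final class ProxySketch {

    // Builds a Proxy from the standard http.proxyHost / http.proxyPort
    // system properties; returns Proxy.NO_PROXY when no host is set.
    static Proxy proxyFromSystemProps() {
        String host = System.getProperty("http.proxyHost");
        if (host == null || host.trim().isEmpty()) {
            return Proxy.NO_PROXY;
        }
        // The JDK's documented default for http.proxyPort is 80.
        int port = Integer.parseInt(System.getProperty("http.proxyPort", "80"));
        return new Proxy(Proxy.Type.HTTP,
                InetSocketAddress.createUnresolved(host.trim(), port));
    }

    public static void main(String[] args) {
        // Dump what the JVM actually sees, then show the resulting proxy.
        System.out.println("http.proxyHost = " + System.getProperty("http.proxyHost"));
        System.out.println("http.proxyPort = " + System.getProperty("http.proxyPort"));
        System.out.println("proxy          = " + proxyFromSystemProps());
    }
}
```

The resulting Proxy could then be handed to URL.openConnection(Proxy), so that the download would not depend on JVM-wide defaults being set.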


>>2. Package and deploy models as a maven artifact
Are there licensing issues for the current models?  Are the current models ASLv2.0?  Would
we need all four full models?  And, yes, my suggestion was to build a very small model and push
it to source control in the resources directory.

All this said, 1) again, this could be user error and 2) the addition of Stanford NER is fantastic...Thank
you for this addition!


-----Original Message-----
From: Mattmann, Chris A (3980) [mailto:chris.a.mattmann@jpl.nasa.gov] 
Sent: Monday, November 23, 2015 11:12 AM
To: dev@tika.apache.org
Cc: ThammeGowda Narayanaswamy <thammegowda.n@usc.edu>
Subject: Re: NER Parser tests behind proxy?

Hey Tim,

Why shouldn’t we have to worry about connectivity outside of the Maven stuff? I mean clearly,
if I install Tika on a new system today without a Maven repo, I must be connected to the
internet, right?

Cheers,
Chris



-----Original Message-----
From: "Allison, Timothy B." <tallison@mitre.org>
Reply-To: "dev@tika.apache.org" <dev@tika.apache.org>
Date: Monday, November 23, 2015 at 8:03 AM
To: "dev@tika.apache.org" <dev@tika.apache.org>
Cc: ThammeGowda Narayanaswamy <thammegowda.n@usc.edu>
Subject: RE: NER Parser tests behind proxy?

>The problem comes down to: ModelGetter.groovy which is trying to grab:
>${basedir}/src/test/resources/org/apache/tika/parser/ner/opennlp/ner-person.bin
>
>If we could build a small model (and I mean really small) and package 
>it with Tika, we wouldn't have to worry about http connectivity outside 
>of the usual maven stuff.
>
>-----Original Message-----
>From: Mattmann, Chris A (3980) [mailto:chris.a.mattmann@jpl.nasa.gov]
>Sent: Monday, November 23, 2015 10:52 AM
>To: dev@tika.apache.org
>Cc: ThammeGowda Narayanaswamy <thammegowda.n@usc.edu>
>Subject: Re: NER Parser tests behind proxy?
>
>Hey Tim,
>
>I’m not seeing these of course b/c I’m not behind a proxy. Thamme, any 
>ideas?
>
>Cheers,
>Chris
>
>++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>Chris Mattmann, Ph.D.
>Chief Architect
>Instrument Software and Science Data Systems Section (398) NASA Jet 
>Propulsion Laboratory Pasadena, CA 91109 USA
>Office: 168-519, Mailstop: 168-527
>Email: chris.a.mattmann@nasa.gov
>WWW:  http://sunset.usc.edu/~mattmann/
>++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>Adjunct Associate Professor, Computer Science Department University of 
>Southern California, Los Angeles, CA 90089 USA
>++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>
>
>
>
>
>-----Original Message-----
>From: "Allison, Timothy B." <tallison@mitre.org>
>Reply-To: "dev@tika.apache.org" <dev@tika.apache.org>
>Date: Thursday, November 19, 2015 at 5:36 PM
>To: "dev@tika.apache.org" <dev@tika.apache.org>
>Subject: NER Parser tests behind proxy?
>
>>My proxy is configured for git/maven/etc, but how do I configure it 
>>within the test so that I don't get this?
>>
>>GET : http://opennlp.sourceforge.net/models-1.5/en-ner-person.bin ->
>>tika-parsers\src\test\resources\org\apache\tika\parser\ner\opennlp\ner-person.bin
>>[INFO] ------------------------------------------------------------------------
>>[INFO] Reactor Summary:
>>[INFO]
>>[INFO] Apache Tika parent ................................ SUCCESS [3.264s]
>>[INFO] Apache Tika core .................................. SUCCESS [44.470s]
>>[INFO] Apache Tika parsers ............................... FAILURE [1:56.462s]
>>[INFO] Apache Tika XMP ................................... SKIPPED
>>[INFO] Apache Tika serialization ......................... SKIPPED
>>[INFO] Apache Tika batch ................................. SKIPPED
>>[INFO] Apache Tika application ........................... SKIPPED
>>[INFO] Apache Tika OSGi bundle ........................... SKIPPED
>>[INFO] Apache Tika translate ............................. SKIPPED
>>[INFO] Apache Tika server ................................ SKIPPED
>>[INFO] Apache Tika examples .............................. SKIPPED
>>[INFO] Apache Tika Java-7 Components ..................... SKIPPED
>>[INFO] Apache Tika ....................................... SKIPPED
>>[INFO] ------------------------------------------------------------------------
>>[INFO] BUILD FAILURE
>>[INFO] ------------------------------------------------------------------------
>>[INFO] Total time: 2:45.245s
>>[INFO] Finished at: Thu Nov 19 20:29:34 EST 2015
>>[INFO] Final Memory: 52M/482M
>>[INFO] ------------------------------------------------------------------------
>>[ERROR] Failed to execute goal org.codehaus.groovy.maven:gmaven-plugin:1.0:execute (testSetup) on project tika-parsers: java.net.ConnectException: Connection refused: connect -> [Help 1]
>>org.apache.maven.lifecycle.LifecycleExecutionException: Failed to execute goal org.codehaus.groovy.maven:gmaven-plugin:1.0:execute (testSetup) on project tika-parsers: java.net.ConnectException: Connection refused: connect
>>	at org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:217)
>>	at org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:153)
>>	at org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:145)
>>	at org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject(LifecycleModuleBuilder.java:84)
>>	at org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject(LifecycleModuleBuilder.java:59)
>>	at org.apache.maven.lifecycle.internal.LifecycleStarter.singleThreadedBuild(LifecycleStarter.java:183)
>>	at org.apache.maven.lifecycle.internal.LifecycleStarter.execute(LifecycleStarter.java:161)
>>	at org.apache.maven.DefaultMaven.doExecute(DefaultMaven.java:320)
>>	at org.apache.maven.DefaultMaven.execute(DefaultMaven.java:156)
>>	at org.apache.maven.cli.MavenCli.execute(MavenCli.java:537)
>>	at org.apache.maven.cli.MavenCli.doMain(MavenCli.java:196)
>>	at org.apache.maven.cli.MavenCli.main(MavenCli.java:141)
>>	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>>	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>	at java.lang.reflect.Method.invoke(Method.java:497)
>>	at org.codehaus.plexus.classworlds.launcher.Launcher.launchEnhanced(Launcher.java:290)
>>	at org.codehaus.plexus.classworlds.launcher.Launcher.launch(Launcher.java:230)
>>	at org.codehaus.plexus.classworlds.launcher.Launcher.mainWithExitCode(Launcher.java:409)
>>	at org.codehaus.plexus.classworlds.launcher.Launcher.main(Launcher.java:352)
>>	at org.codehaus.classworlds.Launcher.main(Launcher.java:47)
>>	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>>	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>	at java.lang.reflect.Method.invoke(Method.java:497)
>>	at com.intellij.rt.execution.application.AppMain.main(AppMain.java:144)
>>Caused by: org.apache.maven.plugin.MojoExecutionException: java.net.ConnectException: Connection refused: connect
>>	at org.codehaus.groovy.maven.plugin.MojoSupport.execute(MojoSupport.java:85)
>>	at org.apache.maven.plugin.DefaultBuildPluginManager.executeMojo(DefaultBuildPluginManager.java:101)
>>	at org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:209)
>>	... 25 more
>>Caused by: org.codehaus.groovy.maven.feature.ComponentException: java.net.ConnectException: Connection refused: connect
>>	at org.codehaus.groovy.maven.runtime.support.ScriptExecutorSupport.invokeMethod(ScriptExecutorSupport.java:162)
>>	at org.codehaus.groovy.maven.runtime.support.ScriptExecutorSupport.execute(ScriptExecutorSupport.java:126)
>>	at org.codehaus.groovy.maven.runtime.support.ScriptExecutorSupport.execute(ScriptExecutorSupport.java:73)
>>	at org.codehaus.groovy.maven.plugin.execute.ExecuteMojo.process(ExecuteMojo.java:249)
>>	at org.codehaus.groovy.maven.plugin.ComponentMojoSupport.doExecute(ComponentMojoSupport.java:60)
>>	at org.codehaus.groovy.maven.plugin.MojoSupport.execute(MojoSupport.java:69)
>>	... 27 more
>>Caused by: java.net.ConnectException: Connection refused: connect
>>	at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>>	at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
>>	at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>>	at java.lang.reflect.Constructor.newInstance(Constructor.java:422)
>>	at sun.net.www.protocol.http.HttpURLConnection$10.run(HttpURLConnection.java:1890)
>>	at sun.net.www.protocol.http.HttpURLConnection$10.run(HttpURLConnection.java:1885)
>>	at java.security.AccessController.doPrivileged(Native Method)
>>	at sun.net.www.protocol.http.HttpURLConnection.getChainedException(HttpURLConnection.java:1884)
>>	at sun.net.www.protocol.http.HttpURLConnection.getInputStream0(HttpURLConnection.java:1457)
>>	at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1441)
>>	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>>	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>	at java.lang.reflect.Method.invoke(Method.java:497)
>>	at org.codehaus.groovy.runtime.callsite.PojoMetaMethodSite$PojoCachedMethodSiteNoUnwrapNoCoerce.invoke(PojoMetaMethodSite.java:229)
>>	at org.codehaus.groovy.runtime.callsite.PojoMetaMethodSite.call(PojoMetaMethodSite.java:52)
>>	at org.codehaus.groovy.runtime.callsite.CallSiteArray.defaultCall(CallSiteArray.java:43)
>>	at org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:116)
>>	at org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:120)
>>	at ModelGetter.downloadFile(ModelGetter.groovy:64)
>>	at ModelGetter$downloadFile.callCurrent(Unknown Source)
>>	at org.codehaus.groovy.runtime.callsite.CallSiteArray.defaultCallCurrent(CallSiteArray.java:47)
>>	at org.codehaus.groovy.runtime.callsite.AbstractCallSite.callCurrent(AbstractCallSite.java:142)
>>	at org.codehaus.groovy.runtime.callsite.AbstractCallSite.callCurrent(AbstractCallSite.java:154)
>>	at ModelGetter.run(ModelGetter.groovy:91)
>>	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>>	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>	at java.lang.reflect.Method.invoke(Method.java:497)
>>	at org.codehaus.groovy.maven.runtime.support.ScriptExecutorSupport.invokeMethod(ScriptExecutorSupport.java:158)
>>	... 32 more
>>Caused by: java.net.ConnectException: Connection refused: connect
>>	at java.net.DualStackPlainSocketImpl.connect0(Native Method)
>>	at java.net.DualStackPlainSocketImpl.socketConnect(DualStackPlainSocketImpl.java:79)
>>	at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350)
>>	at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
>>	at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
>>	at java.net.PlainSocketImpl.connect(PlainSocketImpl.java:172)
>>	at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
>>	at java.net.Socket.connect(Socket.java:589)
>>	at java.net.Socket.connect(Socket.java:538)
>>	at sun.net.NetworkClient.doConnect(NetworkClient.java:180)
>>	at sun.net.www.http.HttpClient.openServer(HttpClient.java:432)
>>	at sun.net.www.http.HttpClient.openServer(HttpClient.java:527)
>>	at sun.net.www.http.HttpClient.<init>(HttpClient.java:211)
>>	at sun.net.www.http.HttpClient.New(HttpClient.java:308)
>>	at sun.net.www.http.HttpClient.New(HttpClient.java:326)
>>	at sun.net.www.protocol.http.HttpURLConnection.getNewHttpClient(HttpURLConnection.java:1169)
>>	at sun.net.www.protocol.http.HttpURLConnection.plainConnect0(HttpURLConnection.java:1105)
>>	at sun.net.www.protocol.http.HttpURLConnection.plainConnect(HttpURLConnection.java:999)
>>	at sun.net.www.protocol.http.HttpURLConnection.connect(HttpURLConnection.java:933)
>>	at sun.net.www.protocol.http.HttpURLConnection.getInputStream0(HttpURLConnection.java:1513)
>>	at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1441)
>>	at sun.net.www.protocol.http.HttpURLConnection.getHeaderField(HttpURLConnection.java:2943)
>>	at java.net.URLConnection.getHeaderFieldLong(URLConnection.java:629)
>>	at java.net.URLConnection.getContentLengthLong(URLConnection.java:501)
>>	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>>	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>	at java.lang.reflect.Method.invoke(Method.java:497)
>>	at org.codehaus.groovy.runtime.callsite.PojoMetaMethodSite$PojoCachedMethodSiteNoUnwrapNoCoerce.invoke(PojoMetaMethodSite.java:229)
>>	at org.codehaus.groovy.runtime.callsite.PojoMetaMethodSite.call(PojoMetaMethodSite.java:52)
>>	at org.codehaus.groovy.runtime.callsite.CallSiteArray.defaultCall(CallSiteArray.java:43)
>>	at org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:116)
>>	at org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:120)
>>	at ModelGetter.downloadFile(ModelGetter.groovy:61)
>>	... 42 more
>>
>>-----Original Message-----
>>From: Nick Burch [mailto:apache@gagravarr.org]
>>Sent: Thursday, November 19, 2015 7:41 PM
>>To: dev@tika.apache.org
>>Subject: Re: [DISCUSS] Moving to Git
>>
>>On Thu, 19 Nov 2015, Mattmann, Chris A (3980) wrote:
>>> I’ll be happy to update our docs and to write a wiki page on using 
>>> Tika & Git that we can refer folks to. I think I’ve demonstrated 
>>> documenting things on the Tika wiki :)
>>
>>Great stuff! Scribble something sensible down, and I can vote +1 to 
>>the move, plus learn more about Git at the same time :)
>>
>>Nick
>
