lucene-solr-user mailing list archives

From "Allison, Timothy B." <talli...@mitre.org>
Subject RE: Solr 6.4. Can't index MS Visio vsdx files
Date Mon, 06 Feb 2017 16:54:59 GMT
Shouldn't have taken you that much effort.  Sorry.

Yes, I should probably get around to a patch for: https://issues.apache.org/jira/browse/SOLR-9552

Although, frankly, it might be time for Tika 1.15 shortly.

-----Original Message-----
From: Gytis Mikuciunas [mailto:gytmkc@gmail.com] 
Sent: Monday, February 6, 2017 11:15 AM
To: solr-user@lucene.apache.org
Subject: Re: Solr 6.4. Can't index MS Visio vsdx files

Tim, you saved my day ;)

The vsdx files have now been indexed successfully.

Thank you very much!!!

Summary: as a workaround I have in solr-6.4.0\contrib\extraction\lib:

1. ooxml-schemas-1.3.jar instead of poi-ooxml-schemas-3.15.jar
2. curvesapi-1.03.jar
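
For anyone who wants to check which jar in that directory (if any) provides the class named in the NoClassDefFoundError, here is a minimal sketch in Python. It is not part of Solr or Tika; the function name jars_providing is made up for illustration, and it only relies on the fact that a jar is a zip archive in which com.graphbuilder.curve.Point is stored as com/graphbuilder/curve/Point.class.

```python
import zipfile
from pathlib import Path


def jars_providing(lib_dir, class_name):
    """Return the names of jars in lib_dir that contain class_name.

    Hypothetical helper for illustration; not a Solr or Tika API.
    """
    # A class such as com.graphbuilder.curve.Point is stored in a jar
    # as the zip entry com/graphbuilder/curve/Point.class.
    entry = class_name.replace(".", "/") + ".class"
    hits = []
    for jar in sorted(Path(lib_dir).glob("*.jar")):
        try:
            with zipfile.ZipFile(jar) as zf:
                if entry in zf.namelist():
                    hits.append(jar.name)
        except zipfile.BadZipFile:
            # Skip files that are not valid zip archives.
            continue
    return hits


if __name__ == "__main__":
    # Directory taken from the summary above; adjust to your install.
    lib = r"solr-6.4.0\contrib\extraction\lib"
    print(jars_providing(lib, "com.graphbuilder.curve.Point"))
```

An empty list would suggest that no jar in the directory supplies the class, which is the situation the stack trace quoted below points to before curvesapi-1.03.jar was added.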


So, now I'm waiting for this to be included in an official version of Solr/Tika.

Regards,
Gytis

On Mon, Feb 6, 2017 at 4:16 PM, Allison, Timothy B. <tallison@mitre.org>
wrote:

> Argh.  Looks like we need to add curvesapi (BSD 3-clause) to Solr.
>
> For now, add this jar:
> https://mvnrepository.com/artifact/com.github.virtuald/curvesapi/1.03
>
> See also [1]
>
> [1] http://apache-poi.1045710.n5.nabble.com/support-for-reading-Microsoft-Visio-2013-vsdx-format-td5721500.html
>
> -----Original Message-----
> From: Gytis Mikuciunas [mailto:gytmkc@gmail.com]
> Sent: Monday, February 6, 2017 8:19 AM
> To: solr-user@lucene.apache.org
> Subject: Re: Solr 6.4. Can't index MS Visio vsdx files
>
> Sad, but that didn't help.
>
> what I did:
>
> 1. stopped solr: bin\solr stop -p 80
> 2. removed poi-ooxml-schemas-3.15.jar from contrib\extraction\lib
> 3. added ooxml-schemas-1.3.jar to contrib\extraction\lib
> 4. restarted solr: bin\solr start -p 80 -m 4g
> 5. tried again to parse vsdx file:
>
> java -Dauto -Dc=db_new02 -Dport=80 -Dfiletypes=vsd,vsdx -Drecursive=yes -jar example/exampledocs/post.jar "I:\Tools"
>
> SimplePostTool version 5.0.0
> Posting files to [base] url http://localhost:80/solr/db_new02/update...
> Entering auto mode. File endings considered are vsd,vsdx
> Entering recursive mode, max depth=999, delay=0s
> Indexing directory I:\Tools (1 files, depth=0)
> POSTing file span ports.vsdx (application/octet-stream) to [base]/extract
> SimplePostTool: WARNING: Solr returned an error #500 (Server Error) for url: http://localhost:80/solr/db_new02/update/extract?resource.name=I%3A%5CTools%5Cspan+ports.vsdx
> SimplePostTool: WARNING: Response: <html>
> <head>
> <meta http-equiv="Content-Type" content="text/html;charset=utf-8"/>
> <title>Error 500 Server Error</title>
> </head>
> <body><h2>HTTP ERROR 500</h2>
> <p>Problem accessing /solr/db_new02/update/extract. Reason:
> <pre>    Server Error</pre></p><h3>Caused by:</h3><pre>java.lang.NoClassDefFoundError: com/graphbuilder/curve/Point
>         at java.lang.Class.getDeclaredConstructors0(Native Method)
>         at java.lang.Class.privateGetDeclaredConstructors(Unknown Source)
>         at java.lang.Class.getConstructor0(Unknown Source)
>         at java.lang.Class.getDeclaredConstructor(Unknown Source)
>         at org.apache.poi.xdgf.util.ObjectFactory.put(ObjectFactory.java:34)
>         at org.apache.poi.xdgf.usermodel.section.geometry.GeometryRowFactory.&lt;clinit&gt;(GeometryRowFactory.java:39)
>         at org.apache.poi.xdgf.usermodel.section.GeometrySection.&lt;init&gt;(GeometrySection.java:55)
>         at org.apache.poi.xdgf.usermodel.XDGFSheet.&lt;init&gt;(XDGFSheet.java:77)
>         at org.apache.poi.xdgf.usermodel.XDGFShape.&lt;init&gt;(XDGFShape.java:113)
>         at org.apache.poi.xdgf.usermodel.XDGFShape.&lt;init&gt;(XDGFShape.java:107)
>         at org.apache.poi.xdgf.usermodel.XDGFBaseContents.onDocumentRead(XDGFBaseContents.java:82)
>         at org.apache.poi.xdgf.usermodel.XDGFMasterContents.onDocumentRead(XDGFMasterContents.java:66)
>         at org.apache.poi.xdgf.usermodel.XDGFMasters.onDocumentRead(XDGFMasters.java:101)
>         at org.apache.poi.xdgf.usermodel.XmlVisioDocument.onDocumentRead(XmlVisioDocument.java:106)
>         at org.apache.poi.POIXMLDocument.load(POIXMLDocument.java:190)
>         at org.apache.poi.xdgf.usermodel.XmlVisioDocument.&lt;init&gt;(XmlVisioDocument.java:79)
>         at org.apache.poi.xdgf.extractor.XDGFVisioExtractor.&lt;init&gt;(XDGFVisioExtractor.java:41)
>         at org.apache.poi.extractor.ExtractorFactory.createExtractor(ExtractorFactory.java:207)
>         at org.apache.tika.parser.microsoft.ooxml.OOXMLExtractorFactory.parse(OOXMLExtractorFactory.java:86)
>         at org.apache.tika.parser.microsoft.ooxml.OOXMLParser.parse(OOXMLParser.java:87)
>         at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:280)
>         at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:280)
>         at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:120)
>         at org.apache.solr.handler.extraction.ExtractingDocumentLoader.load(ExtractingDocumentLoader.java:228)
>         at org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:68)
>         at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:166)
>         at org.apache.solr.core.SolrCore.execute(SolrCore.java:2306)
>         at org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.java:658)
>         at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:464)
>         at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:345)
>         at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:296)
>         at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1691)
>         at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:582)
>         at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
>         at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:524)
>         at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:226)
>         at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1180)
>         at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:512)
>         at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:185)
>         at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1112)
>         at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
>         at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:213)
>         at org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:119)
>         at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:134)
>         at org.eclipse.jetty.server.Server.handle(Server.java:534)
>         at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:320)
>         at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:251)
>         at org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:273)
>         at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:95)
>         at org.eclipse.jetty.io.SelectChannelEndPoint$2.run(SelectChannelEndPoint.java:93)
>         at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.executeProduceConsume(ExecuteProduceConsume.java:303)
>         at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.produceConsume(ExecuteProduceConsume.java:148)
>         at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.run(ExecuteProduceConsume.java:136)
>         at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:671)
>         at org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:589)
>         at java.lang.Thread.run(Unknown Source)
> Caused by: java.lang.ClassNotFoundException: com.graphbuilder.curve.Point
>         at java.net.URLClassLoader.findClass(Unknown Source)
>         at java.lang.ClassLoader.loadClass(Unknown Source)
>         at java.net.FactoryURLClassLoader.loadClass(Unknown Source)
>         at java.lang.ClassLoader.loadClass(Unknown Source)
>         ... 56 more
> </pre>
> <h3>Caused by:</h3><pre>java.lang.ClassNotFoundException: com.graphbuilder.curve.Point
>         at java.net.URLClassLoader.findClass(Unknown Source)
>         at java.lang.ClassLoader.loadClass(Unknown Source)
>         at java.net.FactoryURLClassLoader.loadClass(Unknown Source)
>         at java.lang.ClassLoader.loadClass(Unknown Source)
>         at java.lang.Class.getDeclaredConstructors0(Native Method)
>         at java.lang.Class.privateGetDeclaredConstructors(Unknown Source)
>         at java.lang.Class.getConstructor0(Unknown Source)
>         at java.lang.Class.getDeclaredConstructor(Unknown Source)
>         at org.apache.poi.xdgf.util.ObjectFactory.put(ObjectFactory.java:34)
>         at org.apache.poi.xdgf.usermodel.section.geometry.GeometryRowFactory.&lt;clinit&gt;(GeometryRowFactory.java:39)
>         at org.apache.poi.xdgf.usermodel.section.GeometrySection.&lt;init&gt;(GeometrySection.java:55)
>         at org.apache.poi.xdgf.usermodel.XDGFSheet.&lt;init&gt;(XDGFSheet.java:77)
>         at org.apache.poi.xdgf.usermodel.XDGFShape.&lt;init&gt;(XDGFShape.java:113)
>         at org.apache.poi.xdgf.usermodel.XDGFShape.&lt;init&gt;(XDGFShape.java:107)
>         at org.apache.poi.xdgf.usermodel.XDGFBaseContents.onDocumentRead(XDGFBaseContents.java:82)
>         at org.apache.poi.xdgf.usermodel.XDGFMasterContents.onDocumentRead(XDGFMasterContents.java:66)
>         at org.apache.poi.xdgf.usermodel.XDGFMasters.onDocumentRead(XDGFMasters.java:101)
>         at org.apache.poi.xdgf.usermodel.XmlVisioDocument.onDocumentRead(XmlVisioDocument.java:106)
>         at org.apache.poi.POIXMLDocument.load(POIXMLDocument.java:190)
>         at org.apache.poi.xdgf.usermodel.XmlVisioDocument.&lt;init&gt;(XmlVisioDocument.java:79)
>         at org.apache.poi.xdgf.extractor.XDGFVisioExtractor.&lt;init&gt;(XDGFVisioExtractor.java:41)
>         at org.apache.poi.extractor.ExtractorFactory.createExtractor(ExtractorFactory.java:207)
>         at org.apache.tika.parser.microsoft.ooxml.OOXMLExtractorFactory.parse(OOXMLExtractorFactory.java:86)
>         at org.apache.tika.parser.microsoft.ooxml.OOXMLParser.parse(OOXMLParser.java:87)
>         at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:280)
>         at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:280)
>         at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:120)
>         at org.apache.solr.handler.extraction.ExtractingDocumentLoader.load(ExtractingDocumentLoader.java:228)
>         at org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:68)
>         at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:166)
>         at org.apache.solr.core.SolrCore.execute(SolrCore.java:2306)
>         at org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.java:658)
>         at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:464)
>         at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:345)
>         at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:296)
>         at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1691)
>         at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:582)
>         at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
>         at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:524)
>         at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:226)
>         at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1180)
>         at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:512)
>         at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:185)
>         at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1112)
>         at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
>         at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:213)
>         at org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:119)
>         at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:134)
>         at org.eclipse.jetty.server.Server.handle(Server.java:534)
>         at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:320)
>         at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:251)
>         at org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:273)
>         at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:95)
>         at org.eclipse.jetty.io.SelectChannelEndPoint$2.run(SelectChannelEndPoint.java:93)
>         at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.executeProduceConsume(ExecuteProduceConsume.java:303)
>         at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.produceConsume(ExecuteProduceConsume.java:148)
>         at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.run(ExecuteProduceConsume.java:136)
>         at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:671)
>         at org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:589)
>         at java.lang.Thread.run(Unknown Source)
> </pre>
>
> </body>
> </html>
>
>
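
The missing class in an error response like the one quoted above can also be pulled out mechanically instead of by reading the whole trace. The following is only a rough sketch in Python: the function name missing_classes is hypothetical, and it assumes the response text may still carry HTML escapes and the "> " mail quote prefix, as it does in this thread.

```python
import html
import re


def missing_classes(response_text):
    """Return the distinct class names reported as missing, in order.

    Hypothetical helper for illustration; not a Solr or Tika API.
    """
    # Undo HTML escaping such as &lt;init&gt; and drop the "> " quote
    # prefix that mail clients add to each quoted line.
    text = html.unescape(response_text)
    text = re.sub(r"(?m)^> ?", "", text)
    pattern = (
        r"java\.lang\.(?:NoClassDefFoundError|ClassNotFoundException):"
        r"\s*([\w./$]+)"
    )
    found = []
    for name in re.findall(pattern, text):
        # NoClassDefFoundError reports com/graphbuilder/curve/Point,
        # ClassNotFoundException reports com.graphbuilder.curve.Point.
        name = name.replace("/", ".")
        if name not in found:
            found.append(name)
    return found
```

The resulting class name is what you would then look for in the jars under contrib\extraction\lib.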