lucene-dev mailing list archives

From Mikhail Khludnev <mkhlud...@griddynamics.com>
Subject Re: [jira] [Commented] (SOLR-2961) DIH with threads and TikaEntityProcessor JDBC ISsue
Date Mon, 12 Dec 2011 18:50:52 GMT
hi,

Please find the jars and patches attached. Could you check them and let me
know whether they work for you?
They should fix the core multithreading issue (though only as a quick fix),
and the ClassCastException as well.
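
For context on the ClassCastException in the quoted trace: it is raised by a
downcast to EntityProcessorWrapper inside FieldStreamDataSource.init, which in
the threaded code path receives a bare TikaEntityProcessor. A minimal Java
sketch of that failure mode follows. All class and method names in it are
hypothetical stand-ins, not the Solr classes, and the instanceof guard is only
one possible way to avoid the exception, not necessarily what the attached
patch does.

```java
public class CastSketch {
    // Hypothetical stand-ins for an entity processor hierarchy (not Solr code).
    static class Processor {}

    // A wrapper that decorates another processor.
    static class ProcessorWrapper extends Processor {
        final Processor delegate;
        ProcessorWrapper(Processor delegate) { this.delegate = delegate; }
    }

    // A concrete processor that is NOT a wrapper.
    static class TikaLikeProcessor extends Processor {}

    // Unchecked downcast: throws ClassCastException for a bare processor.
    static ProcessorWrapper uncheckedUnwrap(Processor p) {
        return (ProcessorWrapper) p;
    }

    // Guarded variant: returns null instead of throwing when p is not a wrapper.
    static ProcessorWrapper safeUnwrap(Processor p) {
        return (p instanceof ProcessorWrapper) ? (ProcessorWrapper) p : null;
    }

    // Reports whether the unchecked downcast fails for the given processor.
    static boolean castFails(Processor p) {
        try {
            uncheckedUnwrap(p);
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        Processor bare = new TikaLikeProcessor();
        Processor wrapped = new ProcessorWrapper(bare);
        System.out.println(castFails(wrapped));       // false: a wrapper casts fine
        System.out.println(castFails(bare));          // true: bare processor cannot be cast
        System.out.println(safeUnwrap(bare) == null); // true: guard avoids the exception
    }
}
```

The sketch only shows why the same cast succeeds when the caller passes a
wrapper and fails when it passes the processor itself.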

Regards

On Sun, Dec 11, 2011 at 1:30 AM, David T. Webb <david.webb@brightmove.com> wrote:

>  Mikhail,
>
> I am not building from source, so I am not setup to apply the patch.  Is
> there any way you can provide the patched JAR file for me to use?  That
> would be a real time saver for me.  Thanks for the info.
>
> I had seen your Jira when I was searching for the solution, but it looked
> like it had been committed to the 3.4 branch already.  Thanks for the
> clarification.  Do you need me to +1 any particular Jira to help our cause?
>
>
>  Sincerely,
> David Webb, President
> BrightMove, Inc. (www.brightmove.com)
> 320 High Tide Drive
> Suite 101B
> St Augustine Beach, FL 32080
> (904) 861-2396 x6050
>
>
> ------------------------------
> *From:* Mikhail Khludnev [mailto:mkhludnev@griddynamics.com]
> *Sent:* Sat 12/10/2011 2:59 PM
> *To:* dev@lucene.apache.org; David T. Webb; solr-user@lucene.apache.org
> *Subject:* Re: [jira] [Commented] (SOLR-2961) DIH with threads and
> TikaEntityProcessor JDBC ISsue
>
>
>
> On Sat, Dec 10, 2011 at 11:58 PM, Mikhail Khludnev <
> mkhludnev@griddynamics.com> wrote:
>
>> Hello David,
>>
>> I know about the DIH thread problems. Some time ago I made a quick-fix patch
>> for 3.4, which passes the tests. If you have some time, please try it.
>>
>> http://mail-archives.apache.org/mod_mbox/lucene-solr-user/201110.mbox/%3CCANGii8cOrWXsSvP9EYcRFX_mQBoVdatzRW%2BF0Cq2c%3D6sx8czZw%40mail.gmail.com%3E
>> I'm working on fixing it in trunk.
>>
>> But I've never seen that ClassCastException; it may be a separate bug.
>>
>> Regards
>>
>>
>> On Sat, Dec 10, 2011 at 10:35 PM, David Webb (Commented) (JIRA) <
>> jira@apache.org> wrote:
>>
>>>
>>>    [
>>> https://issues.apache.org/jira/browse/SOLR-2961?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13166926#comment-13166926]
>>>
>>> David Webb commented on SOLR-2961:
>>> ----------------------------------
>>>
>>> Weird note: when threads="2", processing continues even though the
>>> stacktraces are output to the logs.  When threads="6", as soon as the
>>> error occurs, the DIH process immediately stops and performs a rollback.
>>>
>>> This is preventing me from using DIH to load and maintain my production
>>> index.  Any help is greatly appreciated since I am now at the 11th hour. :)
>>>
>>> Solr and all components have been stellar up to this point. Great
>>> project!
>>>
>>> > DIH with threads and TikaEntityProcessor JDBC ISsue
>>> > ---------------------------------------------------
>>> >
>>> >                 Key: SOLR-2961
>>> >                 URL: https://issues.apache.org/jira/browse/SOLR-2961
>>> >             Project: Solr
>>> >          Issue Type: Bug
>>> >          Components: contrib - DataImportHandler
>>> >    Affects Versions: 3.4, 3.5
>>> >         Environment: Windows Server 2008, Apache Tomcat 6, Oracle 11g,
>>> ojdbc 11.2.0.1
>>> >            Reporter: David Webb
>>> >              Labels: dih, tika
>>> >         Attachments: data-config.xml
>>> >
>>> >
>>> > I have a DIH configuration that works great when I don't specify
>>> threads="X" in the root entity.  As soon as I give a value for threads, I
>>> get the following error messages in the stacktrace.  Please advise.
>>> > SEVERE: JdbcDataSource was not closed prior to finalize(), indicates a
>>> bug -- POSSIBLE RESOURCE LEAK!!!
>>> > Dec 10, 2011 1:18:33 PM
>>> org.apache.solr.handler.dataimport.JdbcDataSource closeConnection
>>> > SEVERE: Ignoring Error when closing connection
>>> > java.sql.SQLRecoverableException: IO Error: Socket closed
>>> >       at
>>> oracle.jdbc.driver.T4CConnection.logoff(T4CConnection.java:511)
>>> >       at
>>> oracle.jdbc.driver.PhysicalConnection.close(PhysicalConnection.java:3931)
>>> >       at
>>> org.apache.solr.handler.dataimport.JdbcDataSource.closeConnection(JdbcDataSource.java:401)
>>> >       at
>>> org.apache.solr.handler.dataimport.JdbcDataSource.close(JdbcDataSource.java:392)
>>> >       at
>>> org.apache.solr.handler.dataimport.JdbcDataSource.finalize(JdbcDataSource.java:380)
>>> >       at java.lang.ref.Finalizer.invokeFinalizeMethod(Native Method)
>>> >       at java.lang.ref.Finalizer.runFinalizer(Unknown Source)
>>> >       at java.lang.ref.Finalizer.access$100(Unknown Source)
>>> >       at java.lang.ref.Finalizer$FinalizerThread.run(Unknown Source)
>>> > Caused by: java.net.SocketException: Socket closed
>>> >       at java.net.SocketOutputStream.socketWrite(Unknown Source)
>>> >       at java.net.SocketOutputStream.write(Unknown Source)
>>> >       at oracle.net.ns.DataPacket.send(DataPacket.java:199)
>>> >       at oracle.net.ns.NetOutputStream.flush(NetOutputStream.java:211)
>>> >       at
>>> oracle.net.ns.NetInputStream.getNextPacket(NetInputStream.java:227)
>>> >       at oracle.net.ns.NetInputStream.read(NetInputStream.java:175)
>>> >       at oracle.net.ns.NetInputStream.read(NetInputStream.java:100)
>>> >       at oracle.net.ns.NetInputStream.read(NetInputStream.java:85)
>>> >       at
>>> oracle.jdbc.driver.T4CSocketInputStreamWrapper.readNextPacket(T4CSocketInputStreamWrapper.java:123)
>>> >       at
>>> oracle.jdbc.driver.T4CSocketInputStreamWrapper.read(T4CSocketInputStreamWrapper.java:79)
>>> >       at
>>> oracle.jdbc.driver.T4CMAREngine.unmarshalUB1(T4CMAREngine.java:1122)
>>> >       at
>>> oracle.jdbc.driver.T4CMAREngine.unmarshalSB1(T4CMAREngine.java:1099)
>>> >       at oracle.jdbc.driver.T4CTTIfun.receive(T4CTTIfun.java:288)
>>> >       at oracle.jdbc.driver.T4CTTIfun.doRPC(T4CTTIfun.java:191)
>>> >       at
>>> oracle.jdbc.driver.T4C7Ocommoncall.doOLOGOFF(T4C7Ocommoncall.java:61)
>>> >       at
>>> oracle.jdbc.driver.T4CConnection.logoff(T4CConnection.java:498)
>>> >       ... 8 more
>>> > Dec 10, 2011 1:18:34 PM
>>> org.apache.solr.handler.dataimport.ThreadedEntityProcessorWrapper nextRow
>>> > SEVERE: Exception in entity : null
>>> > org.apache.solr.handler.dataimport.DataImportHandlerException: Failed
>>> to initialize DataSource: f2
>>> >       at
>>> org.apache.solr.handler.dataimport.DataImportHandlerException.wrapAndThrow(DataImportHandlerException.java:72)
>>> >       at
>>> org.apache.solr.handler.dataimport.DataImporter.getDataSourceInstance(DataImporter.java:333)
>>> >       at
>>> org.apache.solr.handler.dataimport.ContextImpl.getDataSource(ContextImpl.java:99)
>>> >       at
>>> org.apache.solr.handler.dataimport.ThreadedContext.getDataSource(ThreadedContext.java:66)
>>> >       at
>>> org.apache.solr.handler.dataimport.TikaEntityProcessor.nextRow(TikaEntityProcessor.java:101)
>>> >       at
>>> org.apache.solr.handler.dataimport.ThreadedEntityProcessorWrapper.nextRow(ThreadedEntityProcessorWrapper.java:84)
>>> >       at
>>> org.apache.solr.handler.dataimport.DocBuilder$EntityRunner.runAThread(DocBuilder.java:446)
>>> >       at
>>> org.apache.solr.handler.dataimport.DocBuilder$EntityRunner.run(DocBuilder.java:399)
>>> >       at
>>> org.apache.solr.handler.dataimport.DocBuilder$EntityRunner.runAThread(DocBuilder.java:466)
>>> >       at
>>> org.apache.solr.handler.dataimport.DocBuilder$EntityRunner.run(DocBuilder.java:399)
>>> >       at
>>> org.apache.solr.handler.dataimport.DocBuilder$EntityRunner.runAThread(DocBuilder.java:466)
>>> >       at
>>> org.apache.solr.handler.dataimport.DocBuilder$EntityRunner.access$000(DocBuilder.java:353)
>>> >       at
>>> org.apache.solr.handler.dataimport.DocBuilder$EntityRunner$1.run(DocBuilder.java:406)
>>> >       at
>>> java.util.concurrent.ThreadPoolExecutor$Worker.runTask(Unknown Source)
>>> >       at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown
>>> Source)
>>> >       at java.lang.Thread.run(Unknown Source)
>>> > Caused by: java.lang.ClassCastException:
>>> org.apache.solr.handler.dataimport.TikaEntityProcessor cannot be cast to
>>> org.apache.solr.handler.dataimport.EntityProcessorWrapper
>>> >       at
>>> org.apache.solr.handler.dataimport.FieldStreamDataSource.init(FieldStreamDataSource.java:58)
>>> >       at
>>> org.apache.solr.handler.dataimport.DataImporter.getDataSourceInstance(DataImporter.java:331)
>>> >       ... 14 more
>>> > Dec 10, 2011 1:18:34 PM
>>> org.apache.solr.handler.dataimport.ThreadedEntityProcessorWrapper nextRow
>>> > SEVERE: Exception in entity : null
>>> > org.apache.solr.handler.dataimport.DataImportHandlerException: Failed
>>> to initialize DataSource: f2
>>> >       at
>>> org.apache.solr.handler.dataimport.DataImportHandlerException.wrapAndThrow(DataImportHandlerException.java:72)
>>> >       at
>>> org.apache.solr.handler.dataimport.DataImporter.getDataSourceInstance(DataImporter.java:333)
>>> >       at
>>> org.apache.solr.handler.dataimport.ContextImpl.getDataSource(ContextImpl.java:99)
>>> >       at
>>> org.apache.solr.handler.dataimport.ThreadedContext.getDataSource(ThreadedContext.java:66)
>>> >       at
>>> org.apache.solr.handler.dataimport.TikaEntityProcessor.nextRow(TikaEntityProcessor.java:101)
>>> >       at
>>> org.apache.solr.handler.dataimport.ThreadedEntityProcessorWrapper.nextRow(ThreadedEntityProcessorWrapper.java:84)
>>> >       at
>>> org.apache.solr.handler.dataimport.DocBuilder$EntityRunner.runAThread(DocBuilder.java:446)
>>> >       at
>>> org.apache.solr.handler.dataimport.DocBuilder$EntityRunner.run(DocBuilder.java:399)
>>> >       at
>>> org.apache.solr.handler.dataimport.DocBuilder$EntityRunner.runAThread(DocBuilder.java:466)
>>> >       at
>>> org.apache.solr.handler.dataimport.DocBuilder$EntityRunner.run(DocBuilder.java:399)
>>> >       at
>>> org.apache.solr.handler.dataimport.DocBuilder$EntityRunner.runAThread(DocBuilder.java:466)
>>> >       at
>>> org.apache.solr.handler.dataimport.DocBuilder$EntityRunner.access$000(DocBuilder.java:353)
>>> >       at
>>> org.apache.solr.handler.dataimport.DocBuilder$EntityRunner$1.run(DocBuilder.java:406)
>>> >       at
>>> java.util.concurrent.ThreadPoolExecutor$Worker.runTask(Unknown Source)
>>> >       at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown
>>> Source)
>>> >       at java.lang.Thread.run(Unknown Source)
>>> > Caused by: java.lang.ClassCastException:
>>> org.apache.solr.handler.dataimport.TikaEntityProcessor cannot be cast to
>>> org.apache.solr.handler.dataimport.EntityProcessorWrapper
>>> >       at
>>> org.apache.solr.handler.dataimport.FieldStreamDataSource.init(FieldStreamDataSource.java:58)
>>> >       at
>>> org.apache.solr.handler.dataimport.DataImporter.getDataSourceInstance(DataImporter.java:331)
>>> >       ... 14 more
>>> > Dec 10, 2011 1:18:34 PM
>>> org.apache.solr.handler.dataimport.JdbcDataSource finalize
>>> > SEVERE: JdbcDataSource was not closed prior to finalize(), indicates a
>>> bug -- POSSIBLE RESOURCE LEAK!!!
>>> > Dec 10, 2011 1:18:34 PM
>>> org.apache.solr.handler.dataimport.ThreadedEntityProcessorWrapper nextRow
>>> > SEVERE: Exception in entity : null
>>>
>>> --
>>> This message is automatically generated by JIRA.
>>> If you think it was sent incorrectly, please contact your JIRA
>>> administrators:
>>> https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
>>> For more information on JIRA, see:
>>> http://www.atlassian.com/software/jira
>>>
>>>
>>>
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
>>> For additional commands, e-mail: dev-help@lucene.apache.org
>>>
>>>
>>
>>
>> --
>> Sincerely yours
>> Mikhail Khludnev
>> Developer
>> Grid Dynamics
>> tel. 1-415-738-8644
>> Skype: mkhludnev
>> <http://www.griddynamics.com/>
>>  <mkhludnev@griddynamics.com>
>>
>



-- 
Sincerely yours
Mikhail Khludnev
Developer
Grid Dynamics
tel. 1-415-738-8644
Skype: mkhludnev
<http://www.griddynamics.com>
 <mkhludnev@griddynamics.com>
