uima-dev mailing list archives

From Tommaso Teofili <tommaso.teof...@gmail.com>
Subject Re: [jira] [Commented] (UIMA-2373) Possible bug in FixedFlowController
Date Fri, 17 Feb 2012 22:29:12 GMT
Hi Marshall,

2012/2/17 Marshall Schor <msa@schor.com>

> Our docs say that AE's are run in a single thread model (see
> http://uima.apache.org/d/uimaj-2.4.0/tutorials_and_users_guides.html#ugr.tug.aae.contract_for_annotator_methods
> ).  If multiple threads are wanted, the framework supports this by
> making multiple instances of the AE's implementation class.  This limits
> "thread-safety" issues to only "static" or class-level fields.
>
> The reason for this was an observation that the people writing annotators,
> although skilled in their particular discipline and able to write code that
> extracted information from unstructured data, did not typically have the
> skills needed to write correct multi-threaded implementations in Java.  So
> the framework "helped" here, by ensuring that any parallelism the framework
> supported created a separate instance of the annotator class for each
> thread.
>
> I believe, however, that it is currently possible to use the framework in
> ways in which the application writer creates multiple threads and calls the
> same annotator instance on multiple threads at the same time.  Perhaps a
> proper approach here would be to have the framework detect this, and signal
> some kind of error.


I agree with that last point: detecting such situations would be
very helpful, in my opinion.
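
To make the idea concrete, here is a minimal sketch of what such detection could look like. It is only an illustration: `SingleThreadGuard` is a hypothetical helper, not part of the UIMA API, and it assumes the framework (or the application) would wrap each call into a given annotator instance with it. The guard is deliberately non-reentrant.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.atomic.AtomicReference;

/**
 * Hypothetical helper (not a UIMA class): detects two threads being
 * inside the same guarded instance at the same time.
 */
public class SingleThreadGuard {
    // Thread currently inside the guarded call, or null if none.
    private final AtomicReference<Thread> owner = new AtomicReference<>();

    /** Claims the guard for the current thread; fails if it is already claimed. */
    public void enter() {
        if (!owner.compareAndSet(null, Thread.currentThread())) {
            throw new IllegalStateException(
                "Concurrent use detected: this instance is already processing a call");
        }
    }

    /** Releases the guard, but only if the current thread is the one holding it. */
    public void exit() {
        owner.compareAndSet(Thread.currentThread(), null);
    }

    /** Runs the body under the guard and always releases it afterwards. */
    public <T> T call(Callable<T> body) throws Exception {
        enter();
        try {
            return body.call();
        } finally {
            exit();
        }
    }
}
```

With one guard per annotator instance, a second thread calling in while the first is still processing would get an immediate, descriptive error rather than a failure deep inside the flow.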
Tommaso



>
>
> -Marshall
>
>
> On 2/17/2012 4:09 AM, Tommaso Teofili (Commented) (JIRA) wrote:
>
>>     [ https://issues.apache.org/jira/browse/UIMA-2373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13210151#comment-13210151 ]
>>
>> Tommaso Teofili commented on UIMA-2373:
>> ---------------------------------------
>>
>> bq. Possibly a concurrency issue?
>>
>> Yes, I think so.
>> That came out when an AE is used from different clients which execute in
>> parallel, so I wonder whether that usage is wrong, or whether we should
>> allow it and therefore make a fix for it.
>>
>>> Possible bug in FixedFlowController
>>> -----------------------------------
>>>
>>>                 Key: UIMA-2373
>>>                 URL: https://issues.apache.org/jira/browse/UIMA-2373
>>>             Project: UIMA
>>>          Issue Type: Bug
>>>    Affects Versions: 2.4.0SDK
>>>            Reporter: Tommaso Teofili
>>>
>>> I am developing a series of Lucene tokenizers which can use UIMA for
>>> creating tokens via extracted annotations.
>>> While doing a stress test with lots of different strings I experienced
>>> the following:
>>> {noformat}
>>> [junit] Testsuite: org.apache.lucene.analysis.uima.UIMATypeAwareAnalyzerTest
>>>     [junit] Tests run: 2, Failures: 0, Errors: 1, Time elapsed: 92,061 sec
>>>     [junit]
>>>     [junit] ------------- Standard Error -----------------
>>>     [junit] The following exceptions were thrown by threads:
>>>     [junit] *** Thread: Thread-9 ***
>>>     [junit] java.lang.RuntimeException: java.io.IOException: org.apache.uima.analysis_engine.AnalysisEngineProcessException
>>>     [junit]    at org.apache.lucene.analysis.BaseTokenStreamTestCase$AnalysisThread.run(BaseTokenStreamTestCase.java:289)
>>>     [junit] Caused by: java.io.IOException: org.apache.uima.analysis_engine.AnalysisEngineProcessException
>>>     [junit]    at org.apache.lucene.analysis.uima.UIMATypeAwareAnnotationsTokenizer.incrementToken(UIMATypeAwareAnnotationsTokenizer.java:87)
>>>     [junit]    at org.apache.lucene.analysis.BaseTokenStreamTestCase.assertTokenStreamContents(BaseTokenStreamTestCase.java:121)
>>>     [junit]    at org.apache.lucene.analysis.BaseTokenStreamTestCase.checkRandomData(BaseTokenStreamTestCase.java:371)
>>>     [junit]    at org.apache.lucene.analysis.BaseTokenStreamTestCase.checkRandomData(BaseTokenStreamTestCase.java:295)
>>>     [junit]    at org.apache.lucene.analysis.BaseTokenStreamTestCase$AnalysisThread.run(BaseTokenStreamTestCase.java:287)
>>>     [junit] Caused by: org.apache.uima.analysis_engine.AnalysisEngineProcessException
>>>     [junit]    at org.apache.uima.analysis_engine.asb.impl.ASB_impl$AggregateCasIterator.processUntilNextOutputCas(ASB_impl.java:701)
>>>     [junit]    at org.apache.uima.analysis_engine.asb.impl.ASB_impl$AggregateCasIterator.<init>(ASB_impl.java:409)
>>>     [junit]    at org.apache.uima.analysis_engine.asb.impl.ASB_impl.process(ASB_impl.java:342)
>>>     [junit]    at org.apache.uima.analysis_engine.impl.AggregateAnalysisEngine_impl.processAndOutputNewCASes(AggregateAnalysisEngine_impl.java:267)
>>>     [junit]    at org.apache.uima.analysis_engine.impl.AnalysisEngineImplBase.process(AnalysisEngineImplBase.java:267)
>>>     [junit]    at org.apache.lucene.analysis.uima.BaseUIMATokenizer.analyzeInput(BaseUIMATokenizer.java:57)
>>>     [junit]    at org.apache.lucene.analysis.uima.UIMATypeAwareAnnotationsTokenizer.analyzeText(UIMATypeAwareAnnotationsTokenizer.java:73)
>>>     [junit]    at org.apache.lucene.analysis.uima.UIMATypeAwareAnnotationsTokenizer.incrementToken(UIMATypeAwareAnnotationsTokenizer.java:85)
>>>     [junit]    ... 4 more
>>>     [junit] Caused by: java.lang.IndexOutOfBoundsException: Index: 1, Size: 2
>>>     [junit]    at java.util.ArrayList.RangeCheck(ArrayList.java:547)
>>>     [junit]    at java.util.ArrayList.get(ArrayList.java:322)
>>>     [junit]    at org.apache.uima.flow.impl.FixedFlowController$FixedFlowObject.next(FixedFlowController.java:216)
>>>     [junit]    at org.apache.uima.analysis_engine.asb.impl.FlowContainer.next(FlowContainer.java:98)
>>>     [junit]    at org.apache.uima.analysis_engine.asb.impl.ASB_impl$AggregateCasIterator.processUntilNextOutputCas(ASB_impl.java:667)
>>>     [junit]    ... 11 more
>>> {noformat}
>>> I'm debugging it to see if I can pin down the exact bug (and a fix) :)
>>>
>> --
>> This message is automatically generated by JIRA.
>> If you think it was sent incorrectly, please contact your JIRA
>> administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
>> For more information on JIRA, see: http://www.atlassian.com/software/jira
>>
>>
>>
>>
