uima-dev mailing list archives

From "Marshall Schor (JIRA)" <...@uima.apache.org>
Subject [jira] [Commented] (UIMA-1970) Reload updated resources while an analysis pipeline is running
Date Wed, 07 Sep 2016 13:58:20 GMT

    [ https://issues.apache.org/jira/browse/UIMA-1970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15470717#comment-15470717 ]

Marshall Schor commented on UIMA-1970:

These are interesting ideas.  I think Jörn mentioned UIMA-AS just as an example,
so I'm not sure this belongs there.

The general rule for whether a feature / capability gets added to the UIMA framework,
or is left for users to grapple with, has been a judgement on whether the feature /
capability would be generally useful to a wider audience, and would therefore merit a more
careful / robust (insert more adjectives here) implementation than might be done individually.

The use case Jörn mentions, of "resources" changing more frequently than annotators, seems
common enough.  To implement such a thing seems to require:
* a "trigger"
* a way to update

The trigger could be some (user) code run every so often; the way to update could be some
user-code called when the trigger fires.  These could be additional APIs the framework would
call in a resource.
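
A minimal sketch of what such a trigger / update pair could look like, written in plain Java. The names `ReloadableResource`, `needsReload`, `reload`, and `ReloadPoller` are hypothetical and are not part of the UIMA API; the periodic scheduling is just one assumed way of running the trigger "every so often".

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical interface (NOT part of UIMA): the two hooks a resource would expose.
interface ReloadableResource {
    boolean needsReload();           // the "trigger": user code run every so often
    void reload() throws Exception;  // the "way to update": user code run when the trigger fires
}

// Hypothetical framework-side helper that calls the two hooks periodically.
final class ReloadPoller implements AutoCloseable {
    private final ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor();

    ReloadPoller(ReloadableResource resource, long period, TimeUnit unit) {
        // Fixed delay: the next check starts only after the previous one has finished.
        scheduler.scheduleWithFixedDelay(() -> checkOnce(resource), period, period, unit);
    }

    // Runs one trigger check; returns true if a reload was performed.
    static boolean checkOnce(ReloadableResource resource) {
        try {
            if (resource.needsReload()) {
                resource.reload();
                return true;
            }
        } catch (Exception e) {
            // Report and continue, so one failed reload does not cancel later checks.
            System.err.println("Resource reload failed: " + e);
        }
        return false;
    }

    @Override
    public void close() {
        scheduler.shutdownNow();
    }
}
```

A resource would implement the two methods, and the framework (or the user) would create one `ReloadPoller` per resource and close it when the pipeline shuts down.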

Very common cases (if there are any) could be handled by framework code:  e.g. getting some
signal from a file system when a file was updated.
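
For the file-update case, one possible sketch uses the JDK's `java.nio.file.WatchService` as the signal. The class `FileUpdateSignal` and its method `awaitUpdate` are hypothetical names for illustration, not existing framework code; how quickly events arrive depends on the platform's `WatchService` implementation.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.nio.file.StandardWatchEventKinds;
import java.nio.file.WatchEvent;
import java.nio.file.WatchKey;
import java.nio.file.WatchService;
import java.util.concurrent.TimeUnit;

// Hypothetical helper (NOT part of UIMA): signals when one specific file is created or modified.
final class FileUpdateSignal implements AutoCloseable {
    private final WatchService watcher;
    private final Path fileName;

    FileUpdateSignal(Path file) throws IOException {
        Path abs = file.toAbsolutePath();
        this.fileName = abs.getFileName();
        this.watcher = abs.getFileSystem().newWatchService();
        // WatchService watches directories, so register the file's parent directory.
        abs.getParent().register(watcher,
                StandardWatchEventKinds.ENTRY_CREATE,
                StandardWatchEventKinds.ENTRY_MODIFY);
    }

    // Blocks until the watched file changes (returns true) or the timeout elapses (returns false).
    boolean awaitUpdate(long timeout, TimeUnit unit) throws InterruptedException {
        long deadline = System.nanoTime() + unit.toNanos(timeout);
        while (true) {
            long remaining = deadline - System.nanoTime();
            if (remaining <= 0) {
                return false;
            }
            WatchKey key = watcher.poll(remaining, TimeUnit.NANOSECONDS);
            if (key == null) {
                return false;
            }
            boolean hit = false;
            for (WatchEvent<?> event : key.pollEvents()) {
                Object context = event.context(); // null for OVERFLOW events
                if (context instanceof Path && fileName.equals(context)) {
                    hit = true;
                }
            }
            key.reset(); // re-arm the key so later events are delivered
            if (hit) {
                return true;
            }
        }
    }

    @Override
    public void close() throws IOException {
        watcher.close();
    }
}
```

A background thread could loop on `awaitUpdate` and invoke the resource's update code each time it returns true.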

Resources are typically used from potentially multiple threads at once, so any reload
mechanism would need to take this into account.
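
One common way to handle the threading concern, sketched here under the assumption that the resource's data can be rebuilt as a whole, is to publish each new version as an immutable snapshot through an `AtomicReference`. The class `SwappableDictionary` is a hypothetical example, not a UIMA class.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical dictionary resource (NOT part of UIMA) that can be reloaded while in use.
final class SwappableDictionary {
    // Always points at a complete, unmodifiable version of the dictionary.
    private final AtomicReference<Set<String>> current;

    SwappableDictionary(Set<String> initial) {
        current = new AtomicReference<>(freeze(initial));
    }

    // Called concurrently by annotator threads; never sees a half-updated set.
    boolean contains(String term) {
        return current.get().contains(term);
    }

    // Returns the version in effect right now; it stays unchanged by later reloads.
    Set<String> snapshot() {
        return current.get();
    }

    // Called by the reload mechanism: build the new version first, then swap it in atomically.
    void reload(Set<String> updated) {
        current.set(freeze(updated));
    }

    private static Set<String> freeze(Set<String> source) {
        // Defensive copy so later changes to the caller's set do not leak in.
        return Collections.unmodifiableSet(new HashSet<>(source));
    }
}
```

An annotator that needs a consistent view for the whole of one CAS can call `snapshot()` once at the start of processing and use that set throughout, so a reload in the middle of a document does not change its results halfway.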

> Reload updated resources while an analysis pipeline is running
> --------------------------------------------------------------
>                 Key: UIMA-1970
>                 URL: https://issues.apache.org/jira/browse/UIMA-1970
>             Project: UIMA
>          Issue Type: New Feature
>          Components: Async Scaleout
>            Reporter: Joern Kottmann
>              Labels: Resources
> In many use cases, resources used by an Analysis Engine are updated more frequently than the AE implementation.
> Examples of such resources are huge dictionaries which are extended with new names on a daily basis,
> or statistical models which are re-trained frequently to be adjusted to data from the past minutes.
> For a resource update, an analysis pipeline has to be stopped and started manually to pick up the updated resource.
> It would be nice if the framework could automate this procedure. In order to do that, it must have a capability to detect
> updated resources and perform the re-initialization of affected Analysis Engines.
> If the pipelines run multiple instances of these Analysis Engines, it would be nice if a rolling
> update could be performed in a way that the processing of CASes never comes to a halt.
> That should be handled in different issues for the CPE and UIMA-AS.

This message was sent by Atlassian JIRA
