uima-dev mailing list archives

From "Marshall Schor (JIRA)" <...@uima.apache.org>
Subject [jira] [Created] (UIMA-3399) More consistent handling of multiple add-to-index behavior for same Feature Structure
Date Fri, 01 Nov 2013 14:09:17 GMT
Marshall Schor created UIMA-3399:

             Summary: More consistent handling of multiple add-to-index behavior for same Feature Structure
                 Key: UIMA-3399
                 URL: https://issues.apache.org/jira/browse/UIMA-3399
             Project: UIMA
          Issue Type: Brainstorming
    Affects Versions: 2.4.2SDK
            Reporter: Marshall Schor
            Assignee: Marshall Schor
            Priority: Minor

UIMA has a somewhat unusual indexing architecture.  You can define indexes (sorted, bag, set),
and then add a feature structure (FS) to, or remove it from, all of the defined indexes.

The design intention (I think) was to support the concept of an FS being either indexed or not.
However, the current design allows some anomalies: the same code can behave differently when
run "locally" than when run as a remote service.  This is because serialization encodes only
whether an FS is in an index or not, and not how many times it was added.

The problem arises in the edge case where the same FS is added to the indexes multiple times.
In the local (non-remote) case, bag and sorted indexes would hold the very same FS multiple
times.  This has two consequences:

-  Iterating would return the same (==) FS multiple times.
-  Removing a multiply-added FS from the indexes would remove only one occurrence; the FS would
still be in the index.

For the same code running remotely, serialization would have "collapsed" the multiple additions
into one, so the code would behave differently.
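
To make the anomaly concrete, here is a minimal sketch in plain Java. It does not use the UIMA API; ToyBagIndex and its roundTrip() method are hypothetical stand-ins for a bag index and for the "indexed or not" view that serialization keeps.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Set;

public class BagIndexAnomaly {

    /** Toy stand-in for a bag index: keeps every add, including repeats of the same object. */
    public static final class ToyBagIndex {
        private final List<Object> entries = new ArrayList<>();

        public void add(Object fs) {
            entries.add(fs);
        }

        /** Removes one occurrence of exactly this object (identity, not equals). */
        public void remove(Object fs) {
            for (int i = 0; i < entries.size(); i++) {
                if (entries.get(i) == fs) {
                    entries.remove(i); // i is an int, so this removes by position
                    return;
                }
            }
        }

        /** Number of times this exact object is present. */
        public int count(Object fs) {
            int n = 0;
            for (Object e : entries) {
                if (e == fs) {
                    n++;
                }
            }
            return n;
        }

        /** Assumption: models serialization by keeping only "indexed or not" per object. */
        public ToyBagIndex roundTrip() {
            Set<Object> seen = Collections.newSetFromMap(new IdentityHashMap<>());
            ToyBagIndex copy = new ToyBagIndex();
            for (Object e : entries) {
                if (seen.add(e)) {
                    copy.add(e);
                }
            }
            return copy;
        }
    }

    public static void main(String[] args) {
        Object fs = new Object();

        ToyBagIndex local = new ToyBagIndex();
        local.add(fs);
        local.add(fs); // same FS added twice

        ToyBagIndex remote = local.roundTrip(); // duplicates collapse here

        local.remove(fs);
        remote.remove(fs);

        System.out.println("local still indexed:  " + (local.count(fs) > 0));
        System.out.println("remote still indexed: " + (remote.count(fs) > 0));
    }
}
```

In this toy model, after a single remove the FS is still present in the local index but gone from the round-tripped one, which is the inconsistency described above.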

A proposed improvement:  Change the behavior of "add-to-index" so that a subsequent add-to-indexes
of the same FS would be either a no-op, or a delete / re-add (to cover the case where some feature
values of the FS might have changed, leading to the need to re-index the FS).
To cover users who might be exploiting the old behavior, we could have a framework context
flag to reinstate the older behavior.
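
The options in the proposal can be sketched as follows, again in plain Java rather than the UIMA API. The Policy enum, ToyFs, and ToyIndex are hypothetical names chosen for illustration; ALLOW_DUPLICATES plays the role of the flag that reinstates the older behavior.

```java
import java.util.ArrayList;
import java.util.List;

public class AddToIndexPolicySketch {

    /** Hypothetical names for the behaviors discussed in the proposal. */
    public enum Policy { ALLOW_DUPLICATES, NO_OP_IF_PRESENT, REMOVE_THEN_ADD }

    /** Toy feature structure with one mutable feature used as the index key. */
    public static final class ToyFs {
        public int begin;

        public ToyFs(int begin) {
            this.begin = begin;
        }
    }

    /** Toy index that records the key value seen at the time of indexing. */
    public static final class ToyIndex {
        private final Policy policy;
        private final List<ToyFs> entries = new ArrayList<>();
        private final List<Integer> keys = new ArrayList<>();

        public ToyIndex(Policy policy) {
            this.policy = policy;
        }

        public void add(ToyFs fs) {
            int at = indexOf(fs);
            if (at >= 0 && policy == Policy.NO_OP_IF_PRESENT) {
                return; // already indexed: leave the existing entry untouched
            }
            if (at >= 0 && policy == Policy.REMOVE_THEN_ADD) {
                entries.remove(at); // drop the stale entry, then re-add below
                keys.remove(at);    // at is an int, so this removes by position
            }
            entries.add(fs);
            keys.add(fs.begin);
        }

        /** Number of times this exact object is present. */
        public int count(ToyFs fs) {
            int n = 0;
            for (ToyFs e : entries) {
                if (e == fs) {
                    n++;
                }
            }
            return n;
        }

        /** Key recorded for the first occurrence of fs; fs must be present. */
        public int indexedKey(ToyFs fs) {
            return keys.get(indexOf(fs));
        }

        private int indexOf(ToyFs fs) {
            for (int i = 0; i < entries.size(); i++) {
                if (entries.get(i) == fs) {
                    return i;
                }
            }
            return -1;
        }
    }

    public static void main(String[] args) {
        for (Policy p : Policy.values()) {
            ToyFs fs = new ToyFs(5);
            ToyIndex index = new ToyIndex(p);
            index.add(fs);
            fs.begin = 9; // a feature value changes after indexing
            index.add(fs);
            System.out.println(p + ": count=" + index.count(fs)
                    + ", indexedKey=" + index.indexedKey(fs));
        }
    }
}
```

In this sketch both new policies keep a single entry per FS; they differ only in whether the recorded key reflects the feature value at the first or the latest add. That difference is the trade-off between the no-op and the delete / re-add variants.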

This would better align the behavior of code run locally with that of code run remotely.

What do people think about this idea?

This message was sent by Atlassian JIRA
