uima-dev mailing list archives

From "Marshall Schor (JIRA)" <...@uima.apache.org>
Subject [jira] [Updated] (UIMA-4135) support for modifying indexed FSs
Date Mon, 08 Dec 2014 14:50:13 GMT

     [ https://issues.apache.org/jira/browse/UIMA-4135?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Marshall Schor updated UIMA-4135:
---------------------------------
    Description: 
Both users and the UIMA framework (during deserialization of CASes in XCAS, XMI, or various
binary formats) may want to modify a feature of an FS that is used as a key in some index
specification.  If the FS is in the indices at that time, any index that uses the feature as
a key may become corrupted unless the FS is first removed from the indices.  Once it has been
removed, the feature can be updated and the FS added back to the indices.
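
The corruption described above can be reproduced outside UIMA with any sorted container whose ordering depends on a mutable field. The following is only a minimal sketch under that assumption: `KeyedItem` and `StaleKeyDemo` are hypothetical names, and `java.util.TreeSet` stands in for a sorted index; it is not how UIMA indices are implemented.

```java
import java.util.Comparator;
import java.util.TreeSet;

// Hypothetical stand-in for an FS with one int feature used as an index key.
final class KeyedItem {
    int begin;
    KeyedItem(int begin) { this.begin = begin; }
}

class StaleKeyDemo {
    // A TreeSet ordered by the key plays the role of a sorted index (an assumption).
    static TreeSet<KeyedItem> newIndex() {
        return new TreeSet<>(Comparator.comparingInt((KeyedItem item) -> item.begin));
    }

    // Changes the key while the item is still in the set, then looks the item up.
    static boolean foundAfterInPlaceUpdate() {
        TreeSet<KeyedItem> index = newIndex();
        KeyedItem a = new KeyedItem(10);
        index.add(a);
        index.add(new KeyedItem(20));
        index.add(new KeyedItem(30));
        a.begin = 40;               // the item's position in the tree is now stale
        return index.contains(a);   // the lookup follows the new key and misses it
    }

    // Remove first, then update, then add back: the set stays consistent.
    static boolean foundAfterRemoveUpdateAdd() {
        TreeSet<KeyedItem> index = newIndex();
        KeyedItem a = new KeyedItem(10);
        index.add(a);
        index.add(new KeyedItem(20));
        index.add(new KeyedItem(30));
        index.remove(a);
        a.begin = 40;
        index.add(a);
        return index.contains(a) && index.last() == a;
    }

    public static void main(String[] args) {
        System.out.println("in-place update, still found: " + foundAfterInPlaceUpdate());
        System.out.println("remove/update/add, still found: " + foundAfterRemoveUpdateAdd());
    }
}
```

In the first method the item is still physically in the set but can no longer be found by lookup, which is the kind of inconsistency the remove/update/add sequence avoids.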

If allow_multiple_add_to_indices is enabled, a particular FS may be added to the indices multiple
times; the remove operation described above would then need to remove all of these copies, and
the add operation would need to add the FS back the same number of times.

The number of times an FS was in the indices needs to be kept by each View.
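
The counting requirement can be sketched as follows. This is an illustration only: `MultiAddIndexSketch` and its method names are hypothetical, a plain list stands in for one view's index, and copies are matched by object identity (an assumption).

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hypothetical model of one view's index when multiple adds of the same FS are allowed.
class MultiAddIndexSketch<T> {
    private final List<T> entries = new ArrayList<>();

    void addToIndexes(T fs) {
        entries.add(fs);
    }

    // Removes every copy of fs (identity comparison) and returns how many were removed.
    int removeAllCopies(T fs) {
        int removed = 0;
        for (Iterator<T> it = entries.iterator(); it.hasNext();) {
            if (it.next() == fs) {
                it.remove();
                removed++;
            }
        }
        return removed;
    }

    // Adds fs back exactly as many times as it was removed.
    void addBack(T fs, int count) {
        for (int i = 0; i < count; i++) {
            entries.add(fs);
        }
    }

    // Number of copies of fs currently in the index.
    int countOf(T fs) {
        int n = 0;
        for (T entry : entries) {
            if (entry == fs) {
                n++;
            }
        }
        return n;
    }
}
```

A caller would keep the value returned by `removeAllCopies` for each view, perform the update, and then pass the same value to `addBack`.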

Several optimizations are possible for this operation.  Bag indices do not need to be
disturbed, as they have no keys.  A Set index holds at most one instance of a particular
FS.  FSs which are subtypes of AnnotationBase are indexed in at most one view (the view
of their sofa).  The remove-all operation for Sorted indices can be made efficient because
all the identical elements are stored adjacently, so they can be removed in bulk.
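
The bulk-removal idea for Sorted indices can be illustrated with a sorted list. This is a sketch under stated assumptions: `SortedRunRemoval` is a hypothetical name, equal integers stand in for repeated copies of one FS, and an `ArrayList` stands in for the index storage.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class SortedRunRemoval {
    // Removes the whole run of entries equal to key from a sorted list in one
    // bulk operation and returns how many entries were removed.
    static int removeRun(List<Integer> sorted, int key) {
        int pos = Collections.binarySearch(sorted, key);
        if (pos < 0) {
            return 0;                       // key is not present
        }
        int from = pos;
        int to = pos + 1;
        // Equal entries are adjacent, so widen the range in both directions.
        while (from > 0 && sorted.get(from - 1) == key) {
            from--;
        }
        while (to < sorted.size() && sorted.get(to) == key) {
            to++;
        }
        sorted.subList(from, to).clear();   // one bulk removal instead of one per copy
        return to - from;
    }

    public static void main(String[] args) {
        List<Integer> keys = new ArrayList<>(Arrays.asList(1, 3, 3, 3, 7));
        int removed = removeRun(keys, 3);
        System.out.println(removed + " removed, remaining " + keys);
    }
}
```

The returned count is also the number of copies that would have to be added back after the update.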

Updating one FS may involve multiple key values.
Design a way to encapsulate the update operation that is efficient for both users and the
UIMA framework, supporting both a try-finally approach and encapsulation via a Runnable.

For example, with try-finally:
{code:java}
FSsTobeAddedBack addback = cas.protectIndices();
try {
    // ... user code ...
} finally {
    addback.close();  // add back the FSs that were removed while protected
}
{code}
or, with a Runnable (written in Java 8 lambda style):
{code:java}
cas.withProtectedIndices(() -> {
    // ... user code ...
});
{code}
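
To make the two proposed calling styles concrete, here is one way such an encapsulation could behave. It is a self-contained sketch, not the UIMA implementation: `ProtectedIndexSketch`, `Item`, `AddBack`, and `setKey` are hypothetical, a `TreeSet` stands in for a single sorted index, and only the method names `protectIndices` and `withProtectedIndices` are taken from the proposal above.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.TreeSet;

class ProtectedIndexSketch {
    // Hypothetical stand-in for an FS with one key feature.
    static final class Item {
        private int key;
        Item(int key) { this.key = key; }
        int key() { return key; }
    }

    private final TreeSet<Item> index = new TreeSet<>(Comparator.comparingInt(Item::key));
    private List<Item> pending;   // non-null only while a protected block is open

    // Handle returned by protectIndices(); close() adds the removed items back.
    final class AddBack implements AutoCloseable {
        @Override
        public void close() {
            index.addAll(pending);
            pending = null;
        }
    }

    void add(Item item) { index.add(item); }

    boolean contains(Item item) { return index.contains(item); }

    // try-finally style: the caller closes the returned handle.
    AddBack protectIndices() {
        pending = new ArrayList<>();
        return new AddBack();
    }

    // Runnable style: the handle is closed even if the body throws.
    void withProtectedIndices(Runnable body) {
        try (AddBack addback = protectIndices()) {
            body.run();
        }
    }

    // Key update: an indexed item is removed first and remembered for add-back.
    void setKey(Item item, int newKey) {
        if (index.contains(item)) {
            if (pending == null) {
                throw new IllegalStateException("indexed item modified outside a protected block");
            }
            index.remove(item);
            pending.add(item);
        }
        item.key = newKey;
    }
}
```

Because the item is removed on its first key update inside the block, further updates to the same item within that block touch the index no more, which fits the note that one update may involve multiple key values.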

  was:
Both users and the UIMA framework (during deserialization of CASes in XCAS, XMI, or various
binary formats) may want to modify a feature of an FS that is used as a key in some index
specification.  If the FS is in the indices at that time, any index that uses the feature as
a key may become corrupted unless the FS is first removed from the indices.  Once it has been
removed, the feature can be updated and the FS added back to the indices.

If allow_multiple_add_to_indices is enabled, a particular FS may be added to the indices multiple
times; the remove operation described above would then need to remove all of these copies, and
the add operation would need to add the FS back the same number of times.

The number of times an FS was in the indices needs to be kept by each View.

Several optimizations are possible for this operation.  Bag indices do not need to be
disturbed, as they have no keys.  A Set index holds at most one instance of a particular
FS.  FSs which are subtypes of AnnotationBase are indexed in at most one view (the view
of their sofa).  The remove-all operation for Sorted indices can be made efficient because
all the identical elements are stored adjacently, so they can be removed in bulk.

Updating one FS may involve multiple key values.
Design a way to encapsulate the update operation that is efficient for both users and the
UIMA framework.


> support for modifying indexed FSs
> ---------------------------------
>
>                 Key: UIMA-4135
>                 URL: https://issues.apache.org/jira/browse/UIMA-4135
>             Project: UIMA
>          Issue Type: Improvement
>          Components: Core Java Framework
>            Reporter: Marshall Schor
>            Assignee: Marshall Schor
>             Fix For: 2.7.0SDK



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
