uima-dev mailing list archives

From Marshall Schor <...@schor.com>
Subject Re: [jira] [Created] (UIMA-4135) support for modifying indexed FSs
Date Tue, 02 Dec 2014 16:14:43 GMT
A subsequent discussion with Burn L. produced the following two good ideas:

1) The UIMA framework could automatically do the safe thing on each feature
modification that required it.  Although this might seem inefficient, it is
likely that in most cases, only one feature (used as a key in some index spec)
is being modified at any one time.  For those cases where this isn't true, the
alternative of an index protection block encapsulating multiple updates could be
used; but it's likely that would rarely be needed.

The automatic approach would, in effect, do a remove, modify, add-back cycle for
each feature modification, in all indices where the FS was in the index, if the
feature was used as a key.

This would be a boon to users - as their code would now work without the danger
of accidentally corrupting indices.
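
A minimal sketch of the remove, modify, add-back cycle described in (1). It is
not UIMA code: the Fs and SortedIndex classes and the setBegin() method are
hypothetical stand-ins, and a java.util.TreeSet plays the role of a sorted index
whose key is the "begin" feature.

```java
import java.util.Comparator;
import java.util.TreeSet;

class AutoReindexSketch {

    /** Hypothetical stand-in for a feature structure; 'begin' is the index key. */
    static final class Fs {
        final int id;
        int begin;
        Fs(int id, int begin) { this.id = id; this.begin = begin; }
    }

    /** Hypothetical stand-in for one sorted index keyed on 'begin' (ties broken by id). */
    static final class SortedIndex {
        final TreeSet<Fs> items = new TreeSet<>(
            Comparator.comparingInt((Fs f) -> f.begin).thenComparingInt(f -> f.id));

        void add(Fs fs) { items.add(fs); }

        boolean contains(Fs fs) { return items.contains(fs); }

        /** Safe update: remove (if indexed), modify the key, then add back. */
        void setBegin(Fs fs, int newBegin) {
            // The remove must happen while the old key value is still in place,
            // otherwise the lookup inside the sorted structure can miss the FS.
            boolean wasIndexed = items.remove(fs);
            fs.begin = newBegin;
            if (wasIndexed) {
                items.add(fs);
            }
        }
    }

    public static void main(String[] args) {
        SortedIndex index = new SortedIndex();
        Fs a = new Fs(1, 10);
        Fs b = new Fs(2, 20);
        Fs c = new Fs(3, 30);
        index.add(a);
        index.add(b);
        index.add(c);

        // 'a' moves from the front of the index to the back.
        index.setBegin(a, 40);

        System.out.println("first id: " + index.items.first().id);
        System.out.println("last id:  " + index.items.last().id);
        System.out.println("a still indexed: " + index.contains(a));
    }
}
```

In this sketch an FS that is not in the index is simply modified in place, which
matches the idea that the extra work is only needed where the feature is used as
a key and the FS is actually indexed.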

2) Because this would turn a feature update into (potentially) a remove - update
- add operation, users writing feature updates inside an iterator would be
exposed to suddenly getting "illegal index modification while iterating" exceptions.

This has long been an issue, I think, causing users to write loops that extract
FSs into array lists and then iterate over those, while doing UIMA index adds/
removes.

How about we add a method to our iterator creation suite, perhaps named
safeIterator(), which creates a snapshot of the index it's iterating over at the
start, and then allows the user code to do arbitrary index adds/removes?  It
seems this occurs frequently enough to warrant UIMA built-in support, and some
optimizations may be available. It seems it could be especially helpful if (1)
were implemented, because the remove/add could occur unbeknownst to the user. 
For example, the component writer may not have had a feature in any index, but
when his component was combined with others, an index could have been added that
used the feature.
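
A minimal sketch of the snapshot idea in (2), using plain java.util collections
rather than UIMA indices. The static safeIterator() helper here is hypothetical
(only the name comes from the proposal above), and a TreeSet stands in for an
index. It copies the contents up front, so the loop body can remove and add
freely, whereas the direct iterator fails fast.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.ConcurrentModificationException;
import java.util.Iterator;
import java.util.List;
import java.util.TreeSet;

class SnapshotIteratorSketch {

    /** Hypothetical helper: iterate over a copy of the index taken at call time. */
    static <T> Iterator<T> safeIterator(Iterable<T> index) {
        List<T> snapshot = new ArrayList<>();
        for (T item : index) {
            snapshot.add(item);
        }
        return snapshot.iterator();
    }

    /** Returns true if modifying the index during direct iteration fails fast. */
    static boolean directIterationFails(TreeSet<Integer> index) {
        try {
            for (Integer value : index) {
                index.remove(value);
                index.add(value + 10);
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        TreeSet<Integer> index = new TreeSet<>(Arrays.asList(1, 2, 3));

        // With the snapshot, remove + add-back inside the loop is harmless.
        Iterator<Integer> it = safeIterator(index);
        while (it.hasNext()) {
            Integer value = it.next();
            index.remove(value);
            index.add(value + 10);
        }
        System.out.println("after snapshot loop: " + index);

        // Without the snapshot, the same loop body trips the fail-fast check.
        TreeSet<Integer> other = new TreeSet<>(Arrays.asList(1, 2, 3));
        System.out.println("direct iteration failed: " + directIterationFails(other));
    }
}
```

The copy costs time and memory proportional to the index size, which is the
same cost users already pay when they extract FSs into array lists by hand; a
built-in version might be able to avoid some of it.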

WDYT?

-Marshall

On 12/2/2014 10:26 AM, Marshall Schor wrote:
> Richard, your good feedback set me thinking harder about this.  I think I agree
> that the try / finally is in some sense, optional, and serves (only) to execute
> the finally block in the presence of an exception.
>
> So, if we were to envisage a design without that, it might look as simple as this:
>
>   // put this at the start of a sequence where some index modifications might
>   // cause index corruption 
>   cas.beginIndexProtection();
>  
>   // User code, which does some modifications to existing FSs
>   // However, UIMA iterators which check for fast-fail may fail in this section
>  
>   // indicate the end of the index protected sequence
>   cas.endIndexProtection();
>
> Of course, you could optionally use a try / finally to execute the last line
> even in the presence of an exception.
>
> Some notes: 
>
> 1) The protection might, as needed, remove from the indices some FSs being
> modified, and then add them back at the end, which could cause iterators already
> in existence that move to other than first or last to fail.
>
> 2) The protection might have some alternative style to facilitate the
> try-with-resources form (available since Java 7).  It might look like this:
>
>    try (IndexProtection ip = cas.beginIndexProtection()) {
>       // user code
>    }
>
> where the IndexProtection class has a "close()" method which does the
> cas.endIndexProtection() call.
>
> 3) I think the design ought to support nested begin/end protection blocks, to
> facilitate subroutine modularity. For example, a separately developed subroutine
> might be called by user code; this subroutine might, in turn, do its own
> IndexProtection.
>
> 4) There is a danger with this design, in that a user could "forget" to put the
> endIndexProtection in.
>
> -----------------------
>
> A safer design would be along the lines that Richard suggested - of using
> "functional" (in Java 8 sense - having only one method) inner classes that
> encapsulate the user code. 
>
> With respect to how to handle checked exceptions: in Java 8, you can declare a
> "custom" functional interface that includes a throws clause, and use that to
> allow the lambda to throw checked exceptions.  But you would probably end up
> (since the user code might throw almost anything, from UIMAException to
> IOException) saying it throws the top superclass of checked exceptions:
> Exception.  That would require the caller to catch Exception (or include it in a
> throws clause).
>
> The alternative would be to not have checked exceptions, and to require the user
> code to encapsulate these as runtime exceptions.  For this particular use-case,
> the setting of feature structure slots doesn't throw checked exceptions I think.
>
> So, for that design, it would look like Richard suggested.
>
>    // slight change of name, to indicate the function being supplied in the
>    // encapsulation
>    // Java 8
>
>    cas.protectIndices( () -> {
>       // user code , no checked exceptions
>    });
>
>    // Java 7
>
>    cas.protectIndices( new Runnable() { public void run() {
>         // user code, no checked exceptions
>     }});
>
> So, I'm now leaning toward this style (with no checked exceptions) as suggested
> by Richard, as the most reliable way: it prevents the user from "forgetting" to
> invoke endIndexProtection(), and is not too cumbersome to write even in Java 7.
>
> Other opinions?
>
> -Marshall
>
>
> On 12/1/2014 6:30 PM, Marshall Schor wrote:
>> The anonymous inner class has a nice property that with Java 8 you can use lambdas.
>>
>> A problem, though, I think is how to nicely handle thrown checked exceptions. 
>> With lambdas, I think you can't have checked exceptions.  With anonymous inner
>> classes, you can.  But of course the syntax is more difficult to understand.
>>
>> I'm not sure about the abuse part for try / finally.  (I'm not using try /catch
>> :-) ).  The try / finally is for the purpose of putting a block scope around
>> some code, and then executing some code at the end of a block, even if an
>> exception is thrown. 
>>
>> I need the signal of where the end of the block is, and to execute code there,
>> in order to add-back any FSs that might have been removed (if needed) in the
>> body of the code while doing the feature updates.
>>
>> It seems that the try / finally (or Java 7's try-with-resources) has a clearer
>> syntax for specifying this than anything else I've thought of (but maybe there's
>> still a better way :-) ).
>>
>> -Marshall
>>
>> On 12/1/2014 4:09 PM, Richard Eckart de Castilho wrote:
>>> On 01.12.2014, at 19:24, Marshall Schor <msa@schor.com> wrote:
>>>
>>>> One approach would use the try/ finally form:
>>>>
>>>>  controlVar = cas.startUimaIndexProtectedBlock();
>>>>  try {
>>>>    // some code which modifies a FS (or maybe multiple FSs)
>>>>  } finally {
>>>>    controlVar.close();  // causes any "removes" to be now re-added to indices
>>>>  }
>>>>
>>>> A form like the above could use in Java 7 the simpler try-with-resources form:
>>>>  try (controlVar = cas.startUimaIndexProtectedBlock()) {
>>>>    // some code which modifies a FS (or maybe multiple FSs)
>>>>  }
>>> For me, this smells a bit like abusing try/catch, although I admit that
>>> it also has some elegance.
>>>
>>> Why not use an anonymous inner class like this:
>>>
>>> cas.transaction(new Transaction<CAS>() {
>>>   public void perform(CAS cas) {
>>>     // make modifications
>>>   }
>>> });
>>>
>>> Afaik this works also in Java versions prior to 7. It's the kind of thing
>>> one did before lambda arrived.
>>>
>>> Cheers,
>>>
>>> -- Richard
>>>
>>
>
>

