uima-dev mailing list archives

From "Marshall Schor (JIRA)" <...@uima.apache.org>
Subject [jira] [Commented] (UIMA-5115) uv3 select() api for iterators and streams over CAS contents
Date Mon, 17 Oct 2016 14:51:58 GMT

    [ https://issues.apache.org/jira/browse/UIMA-5115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15582461#comment-15582461 ]

Marshall Schor commented on UIMA-5115:

Richard made many useful comments on the first cut of the select documentation (pdf) using
the Adobe commenting tool.  I'm bringing some of those into the Jira (others are good suggestions
that I'll incorporate in the next revision).

# Defaults:
#* change default for AnnotationIndex processing to non-overlapped (unambiguous); I'm not
sure about this.  I agree that in most use cases, the situation will be that there are no
overlapping annotations (imagining Sentence with non-overlapping Tokens).  But if a pipeline
did produce some overlapping Tokens, this default would "silently" skip those.  I think this
action should not be so "silent", to lessen the chance of mistakes in assumptions made by
downstream users of upstream annotators.
#* endWithinBounds - I agree with the comment, and in fact, the default (not clearly expressed)
was changed in the code to be as suggested.  I'm thinking of a rename like "includeEndBeyondBounds";
I suspect it will get very little (if any) use so the long name won't be significant.
#* skipEquals - this is poorly documented.  The implementation **never** includes the "bound",
because both the Subiterator and uimaFIT implementations never included the "bound"; it was
not the intent of this to sometimes include the bound.  So, it needs to be renamed.
#** These two implementations differed in what they meant by the "bound", however.  In uimaFIT,
 the Feature Structure to be skipped was the one which was exactly == (had the same "id")
as the bound Feature Structure.  In Subiterator, the ones that were skipped were the ones
which compared as "equal" using the annotation index's comparator function (which used type
priority).  What this boolean switch was trying to do was to allow specifying which of these
two meanings of "equal" was to be used in doing the skipping.  Note that this is a detail that
only applies when there are potentially multiple Annotations which compare equal.
# General approach to handling ignored or not-applicable settings: I am slightly favoring
some kind of notification, if they are indicative of a likely error or misunderstanding by
the Annotator writer; this has to be balanced with making this framework "annoying" to the
user.  Kinds of notification include throwing exceptions, or (decreasing frequency) logging
of warnings.
# re: renaming Processing Actions: I never liked the term much...  I'm ok with "terminal actions"
or "result forms", but my choice would be the combo: "terminal forms".
# re: renaming the select framework to the CAS Query framework - I think this ties too closely
to the CAS as the data source, given that other collections can be the source.  We could call
it the Feature Structure Query framework, but that seems too verbose, compared to the "select"
framework, so I'd prefer to keep "select".
# re: ordering and sorted-ordering.  I'll make a pass to clarify the subcases.  The general
approach is that sort ordering for Annotation Indexes is usually implied (but can be (partially)
undone using the unordered() builder, if desired for efficiency).
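
To make two of the points above concrete (the "silent" skipping under a non-overlapped default, and the two meanings of "equal" behind skipEquals), here is a minimal sketch in plain Java.  It does not use the UIMA or uimaFIT APIs: the Span class, the INDEX_ORDER comparator (begin ascending, then end descending, with type priority left out) and the three helper methods are all hypothetical stand-ins, and the real implementations may differ in detail.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

final class SelectSketch {

    /** Hypothetical stand-in for an annotation: a type name plus begin/end offsets. */
    static final class Span {
        final String type;
        final int begin;
        final int end;

        Span(String type, int begin, int end) {
            this.type = type;
            this.begin = begin;
            this.end = end;
        }
    }

    /** Assumed ordering: begin ascending, then end descending (type priority omitted). */
    static final Comparator<Span> INDEX_ORDER = (a, b) -> {
        int c = Integer.compare(a.begin, b.begin);
        return c != 0 ? c : Integer.compare(b.end, a.end);
    };

    /** Skips only the bound object itself (same identity), as described for uimaFIT. */
    static List<Span> skipSameObject(List<Span> candidates, Span bound) {
        List<Span> out = new ArrayList<>();
        for (Span s : candidates) {
            if (s != bound) {
                out.add(s);
            }
        }
        return out;
    }

    /** Skips everything that compares equal to the bound, as described for Subiterator. */
    static List<Span> skipComparatorEqual(List<Span> candidates, Span bound) {
        List<Span> out = new ArrayList<>();
        for (Span s : candidates) {
            if (INDEX_ORDER.compare(s, bound) != 0) {
                out.add(s);
            }
        }
        return out;
    }

    /** Non-overlapped pass over sorted input: a span starting before the last kept end is dropped. */
    static List<Span> nonOverlapping(List<Span> sorted) {
        List<Span> out = new ArrayList<>();
        int lastEnd = Integer.MIN_VALUE;
        for (Span s : sorted) {
            if (s.begin >= lastEnd) {
                out.add(s);
                lastEnd = s.end;
            }
        }
        return out;
    }

    public static void main(String[] args) {
        Span bound = new Span("Sentence", 0, 10);
        Span twin = new Span("Sentence", 0, 10);   // distinct object, compares equal to bound
        Span inner = new Span("Token", 2, 5);
        List<Span> all = Arrays.asList(bound, twin, inner);

        System.out.println(skipSameObject(all, bound).size());      // twin and inner remain
        System.out.println(skipComparatorEqual(all, bound).size()); // only inner remains

        List<Span> tokens = Arrays.asList(
            new Span("Token", 0, 4), new Span("Token", 3, 6), new Span("Token", 6, 9));
        System.out.println(nonOverlapping(tokens).size());          // the 3..6 token is dropped
    }
}
```

In this sketch the two skip variants only disagree when a second annotation compares equal to the bound, and the non-overlapped pass drops the 3..6 Token with no notification, which is the "silent" behavior the Defaults item is concerned about.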

> uv3 select() api for iterators and streams over CAS contents
> ------------------------------------------------------------
>                 Key: UIMA-5115
>                 URL: https://issues.apache.org/jira/browse/UIMA-5115
>             Project: UIMA
>          Issue Type: New Feature
>          Components: Core Java Framework
>            Reporter: Marshall Schor
>            Priority: Minor
>             Fix For: 3.0.0SDKexp
> Design and implement a select() API based on uimaFIT's select, integrated well with Java 8
> concepts.  Initial discussions in UIMA-1524.  Wiki with diagram: https://cwiki.apache.org/confluence/display/UIMA/UV3+Iterator+support

This message was sent by Atlassian JIRA