uima-dev mailing list archives

From "Marshall Schor (JIRA)" <...@uima.apache.org>
Subject [jira] [Commented] (UIMA-1524) JFSIndexRepository should be enhanced with new generic methods
Date Fri, 16 Sep 2016 16:34:20 GMT

    [ https://issues.apache.org/jira/browse/UIMA-1524?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15496754#comment-15496754 ]

Marshall Schor commented on UIMA-1524:
--------------------------------------

I updated the wiki with some notes.  Some thoughts:

1) The word "all" can be used to mean all types in one view, or (for some type perhaps) all
FSs from all views.
The .allViews() builder modifies things to use all views (with no index, implies unordered).
To get "all" FSs without regard to type, you leave out the type specification.
  - Note: this doesn't necessarily get all FSs without regard to type; it depends on the index
specification
    -- If no index specification, then it gets all FSs which are subtypes of TOP (i.e., all)
    -- If index spec, it gets all FSs in that index (uses the type that's included in the
index spec as the top-most type).

2) There are multiple methods to get FSs relative to a bounding begin and end, and maybe also
using type priorities.
   - covered, between, within: some of the proposed names for this, some taking 1 or 2 FSs,
or 2 ints (begin / end)
   - I know uimaFIT excludes type priorities.  I think this is more often what users want,
so it's probably good to be the "default".
     -- the builder typePriorities(), for cases where the bounds are supplied by FSs, can change
the criteria for bounding to use begin, end, and type priorities.

3) The above also applies to selection filters that bound a given begin and end, or a given
FS
  - covering, containing, taking 1 FS or 2 ints begin/end

4) following or preceding: the official "limit" method for streams throws an exception on
negative args, so I don't want to behave differently...
  - startAt, at, seek, following/preceding (I tend to like "at", and following/preceding where
that form has an additional arg used as the limit value).
  - the arg for where to start is used to specify a location in the index; if the index happens
to contain that arg as an FS, then it's part of the result, unless we want to special-case
this like uimaFIT seems to do.
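
For reference, the stream behavior mentioned in point 4 can be checked with the JDK alone. The helper name limitRejects is made up for this sketch; Stream.limit(long) itself is documented to throw IllegalArgumentException when its argument is negative.

```java
import java.util.stream.Stream;

final class LimitSketch {
    /** Returns true if Stream.limit rejects maxSize with IllegalArgumentException. */
    static boolean limitRejects(long maxSize) {
        try {
            // limit() validates its argument before any element is consumed.
            Stream.of("a", "b", "c").limit(maxSize).count();
            return false;
        } catch (IllegalArgumentException e) {
            return true;
        }
    }
}
```

So a select-style "limit" that accepted negative values to mean "go backwards" would differ from what stream users already expect.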

5) Since Uima iterators support forward/backwards, the startAt could efficiently be augmented
with a +- offset. Variations:
  - startAt(begin, end) - start at position of left-most FS >= that begin / end
  - startAt(fs) - start at position of left-most FS >= that fs (ignoring type priorities
unless specified)
  - The same 2 with an extra int as last arg: this is the offset 
  - NOTE: the forms with begin end only work with AnnotationIndex; others work with any ordered
index

6) following / preceding: combining an "at" spec with a "limit" spec, and implying a "reverse"
spec for preceding. Not sure if uimaFIT does this, or if this is a good idea.
  - jcas.select().following(3, fs);  // select the 3 following FSs >= fs, ignoring typePriorities
  - jcas.select().following(3, fs, 2);  // select the 3 following FSs >= fs, offset by +2...
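
One way to read point 6 is "following = at + limit" and "preceding = at + limit + reverse". The sketch below models that over a sorted list of ints standing in for an ordered index. All names are hypothetical; rejecting a negative limit (to match Stream.limit), having preceding start strictly before the key, and clamping the start position are assumptions.

```java
import java.util.ArrayList;
import java.util.List;

final class FollowingSketch {
    /** Position of the left-most element >= key, or sorted.size() if there is none. */
    static int lowerBound(List<Integer> sorted, int key) {
        int lo = 0;
        int hi = sorted.size();
        while (lo < hi) {
            int mid = (lo + hi) >>> 1;
            if (sorted.get(mid) < key) lo = mid + 1; else hi = mid;
        }
        return lo;
    }

    /** Up to limit elements starting at the left-most element >= key, shifted by offset. */
    static List<Integer> following(List<Integer> sorted, int limit, int key, int offset) {
        if (limit < 0) throw new IllegalArgumentException("limit must be >= 0");
        int start = Math.max(0, Math.min(lowerBound(sorted, key) + offset, sorted.size()));
        int end = (int) Math.min((long) sorted.size(), (long) start + limit);
        return new ArrayList<>(sorted.subList(start, end));
    }

    /** Up to limit elements strictly before key, nearest first (reverse order). */
    static List<Integer> preceding(List<Integer> sorted, int limit, int key) {
        if (limit < 0) throw new IllegalArgumentException("limit must be >= 0");
        List<Integer> out = new ArrayList<>();
        for (int i = lowerBound(sorted, key) - 1; i >= 0 && out.size() < limit; i--) {
            out.add(sorted.get(i));
        }
        return out;
    }
}
```

Over [1, 3, 5, 7, 9], following(2, 4, 0) gives [5, 7] and preceding(2, 5) gives [3, 1].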

7) Just noting that reverse is independent of offset
    - e.g. you can have a negative offset, and traverse in a positive direction

8) "single" seems kind of awkward.  I'm thinking of just "get()" or get(arg) where the arg
is the same as used in startAt
   -  jcas.select(Token.type).get(15); 

> JFSIndexRepository should be enhanced with new generic methods
> --------------------------------------------------------------
>
>                 Key: UIMA-1524
>                 URL: https://issues.apache.org/jira/browse/UIMA-1524
>             Project: UIMA
>          Issue Type: Improvement
>          Components: Core Java Framework
>    Affects Versions: 2.3
>            Reporter: Joern Kottmann
>
> Existing methods should be overloaded with an additional Class argument to specify the
exact return type. This change makes downcasting of returned objects unnecessary.
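
As an illustration of the request in the issue description, the sketch below contrasts an untyped accessor with an overload that takes a Class argument. TypedLookupSketch and its methods are hypothetical and do not reproduce JFSIndexRepository; filtering the stored objects by the given class is an assumption made to keep the example self-contained.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

final class TypedLookupSketch {
    private final List<Object> items = new ArrayList<>();

    void add(Object o) {
        items.add(o);
    }

    /** Untyped form: callers have to downcast each returned element themselves. */
    Iterator<Object> getAll() {
        return items.iterator();
    }

    /** Overload with a Class argument: the return type is exact, so no cast at the call site. */
    <T> Iterator<T> getAll(Class<T> type) {
        List<T> out = new ArrayList<>();
        for (Object o : items) {
            if (type.isInstance(o)) {
                out.add(type.cast(o)); // checked cast done once, inside the helper
            }
        }
        return out.iterator();
    }
}
```

A caller can then write `Iterator<String> it = repo.getAll(String.class);` instead of casting each element returned by the untyped form.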



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
