uima-dev mailing list archives

From "Marshall Schor (JIRA)" <...@uima.apache.org>
Subject [jira] [Commented] (UIMA-1524) JFSIndexRepository should be enhanced with new generic methods
Date Mon, 19 Sep 2016 15:23:20 GMT

    [ https://issues.apache.org/jira/browse/UIMA-1524?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15503756#comment-15503756 ]

Marshall Schor commented on UIMA-1524:
--------------------------------------

Thanks, Richard, for your thoughtful dialog on these topics.  It's nice to find someone else
interested in the arcana of principled API design :-).  

1) re: Fluctuating between positional-style and keyword style: the keyword style is the most
"factored" style: it lets you build what you want with a "minimal" set of keys.  So it's useful
to list these as separate things (as done in the diagram), to see the totality of what one
might reasonably "build" out of a minimal set of primitives, and what those primitives might
be.

But the keyword style is verbose, so it seems good design to support positional for frequently-used
idioms, for more concise/compact code.

2) Yes, your understanding of the Gliffy diagram is correct re: the object without a "terminal"
is something supporting special methods for UIMA and also the stream interface.  My thought
is that when you use a stream method, that will result in the creation of a Stream object,
based on everything you've specified, up to that point, and from then on, it will act like
a stream.

This "automatic" conversion to stream is a "convenience"; we could dispense with it, and require
the user to explicitly write ....x().y().stream().a().b().   where x, y are the uima specific
methods, and a, b are stream methods.  It seems better (more concise/compact) to leave out
the stream(), though.  Maybe it could be optional, and the documentation (for learning) could
start with it explicitly present, and only later drop it; this could demystify the API slightly.
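
To make point 2 concrete, here is a minimal sketch of the "automatic" conversion, assuming a hypothetical builder class (StreamableSelect is not an existing UIMA class; backwards() merely stands in for the UIMA-specific x()/y() methods, and a plain List stands in for an index). The stream-style methods simply delegate to an explicit stream(), so both spellings give the same result:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.function.Function;
import java.util.function.Predicate;
import java.util.stream.Stream;

/** Hypothetical builder used only for illustration; not part of UIMA. */
final class StreamableSelect<T> {
    private final List<T> items;      // stand-in for the contents of an index
    private final boolean backwards;  // stand-in for a UIMA-specific setting

    StreamableSelect(List<T> items) {
        this(items, false);
    }

    private StreamableSelect(List<T> items, boolean backwards) {
        this.items = items;
        this.backwards = backwards;
    }

    /** A builder-specific method (plays the role of x() or y()). */
    StreamableSelect<T> backwards() {
        return new StreamableSelect<>(items, true);
    }

    /** Explicit conversion: builds a Stream from everything specified so far. */
    Stream<T> stream() {
        List<T> copy = new ArrayList<>(items);
        if (backwards) {
            Collections.reverse(copy);
        }
        return copy.stream();
    }

    /** Convenience: the first stream method implies stream(). */
    Stream<T> filter(Predicate<? super T> predicate) {
        return stream().filter(predicate);
    }

    /** Convenience: same implicit conversion for map. */
    <R> Stream<R> map(Function<? super T, ? extends R> mapper) {
        return stream().map(mapper);
    }
}
```

With this shape, sel.backwards().stream().filter(p) and sel.backwards().filter(p) are interchangeable, so documentation could start with the explicit form and drop stream() later.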


3) It seems right that having more than one "type" spec is more likely an error than a useful
feature, so I agree it would be more helpful to throw an exception if that happened.

4) Because "type" is used frequently, I think it's a good candidate to allow it as a positional
argument.

5) When moving keyword arguments into positional ones, should we drop the keyword form?  I'm
a bit on the fence.  Sometimes I think users prefer an API style which lists arguments with
their keys (as in the form type(aTypeSpec)) uniformly, even if there are positional forms.
 I don't think there's a cost to keeping this option, so I'm on the side of keeping it.

**Decision requirement:** dropping the *type* call - I'd say to keep it; throwing an exception
if called twice: I'd say yes (because it's more likely an error than a wanted feature).
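
A minimal sketch of what points 3 to 5 could look like together, assuming a hypothetical builder (SelectSketch, its select/type/asList methods, and the use of a plain List as the source are all invented for illustration, not an existing UIMA API). The positional form is just shorthand for the keyword form, and a second type spec throws:

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical builder used only for illustration; not part of UIMA. */
final class SelectSketch<T> {
    private final List<?> source;   // stand-in for a CAS / index
    private final Class<T> type;
    private final boolean typeSet;  // true once a type spec has been given

    private SelectSketch(List<?> source, Class<T> type, boolean typeSet) {
        this.source = source;
        this.type = type;
        this.typeSet = typeSet;
    }

    /** Keyword-style entry point: nothing specified yet. */
    static SelectSketch<Object> select(List<?> source) {
        return new SelectSketch<>(source, Object.class, false);
    }

    /** Positional shorthand for select(source).type(type). */
    static <U> SelectSketch<U> select(List<?> source, Class<U> type) {
        return select(source).type(type);
    }

    /** Keyword form; a second type spec is treated as an error. */
    <U> SelectSketch<U> type(Class<U> newType) {
        if (typeSet) {
            throw new IllegalStateException("type specified more than once");
        }
        return new SelectSketch<>(source, newType, true);
    }

    /** Returns the elements of the source that are instances of the selected type. */
    List<T> asList() {
        List<T> out = new ArrayList<>();
        for (Object o : source) {
            if (type.isInstance(o)) {
                out.add(type.cast(o));
            }
        }
        return out;
    }
}
```

Here select(items, String.class) and select(items).type(String.class) build the same selection, while select(items, String.class).type(Integer.class) fails fast with an IllegalStateException.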

6) multiple location requirements.  I agree it would be nice, but it might be difficult to
(efficiently) implement.   People might write forms like within(span1).within(span2), where
the spans had some overlap.  It seems we'd have to go through and figure out the semantics
for lots of combinations of method (keyword) chains...  So maybe we could not do this in version
1, and later figure out some often requested instances of this as a version 2 improvement.

These are perhaps "tricky".  In the example of .within(sentence).following(predicateVerb),
even though this sounds sensible, I think it's somewhat misleading.  The following(predicateVerb)
completely specifies the position, and the .within(sentence) just serves as a boolean switch:
is the predicateVerb location within(sentence)?  I think this is a confusing way to
represent this versus something like:
{code}
if (sentence.contains(predicateVerb))
{code} 
or something similar that does boolean operations on spans (with or without typePriorities).
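
As a rough illustration of such boolean operations on spans, here is a minimal sketch that ignores typePriorities and treats a span as a plain begin/end pair (SpanSketch and its methods are hypothetical names, not an existing UIMA class):

```java
/** Hypothetical begin/end span used only for illustration; not part of UIMA. */
final class SpanSketch {
    final int begin;  // inclusive start offset
    final int end;    // exclusive end offset

    SpanSketch(int begin, int end) {
        if (begin > end) {
            throw new IllegalArgumentException("begin must not exceed end");
        }
        this.begin = begin;
        this.end = end;
    }

    /** True if the other span lies entirely inside this one. */
    boolean contains(SpanSketch other) {
        return begin <= other.begin && other.end <= end;
    }

    /** True if the two spans share at least one offset. */
    boolean overlaps(SpanSketch other) {
        return begin < other.end && other.begin < end;
    }
}
```

A caller would then write the boolean test up front, for example if (sentence.contains(predicateVerb)) { ... }, and use a single location spec in the select chain.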

**Decision requirement:** allowing multiple location specs - I would severely restrict this
in version 1 to just those easily supported by the current indexes/iterators, which in my
mind would be just one of the covered/covering/moveTo(fs) kind, and one displacement kind;
the semantics would be that the displacement operation follows the other.

7) I like "displacement" better than "offset", though I'd like a shorter word even more :-).  Some
possibilities:  shift, shifted, shiftedBy, move, moved, movedBy  (the "ed" forms are more
declarative, but sound a bit clumsy to me).  Of these, my choice is "shifted", e.g.   cas.select(Token.type).within(fs).shifted(3),
but it's not a strong preference.

8) Semantics of displacement: I had not thought of the complexity of having it operate with
respect to some other index/type; I thought of it as moving the point where you start by iterating
moveToNext or moveToPrevious.  Our current mechanisms support this trivially, I think.  Also,
in the form "cas.select(Token.type).following(predicateVerb)", the "type" of predicateVerb
(surprised) need not even be a subtype of Token.  

**Decision requirement:** displacement meaning: I'd prefer making it just an iterated moveToNext
or moveToPrevious.
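
A minimal sketch of that meaning, assuming a stand-in cursor over a plain list (CursorSketch is hypothetical; its moveToNext, moveToPrevious, isValid and get are only named after the corresponding FSIterator methods, and shifted is the name proposed in point 7). The start position is whatever the other location spec produced; shifted(n) then just steps n times:

```java
import java.util.List;
import java.util.NoSuchElementException;

/** Hypothetical cursor used only for illustration; not part of UIMA. */
final class CursorSketch<T> {
    private final List<T> items;  // stand-in for an ordered index
    private int pos;              // current position; out of range means invalid

    CursorSketch(List<T> items, int start) {
        this.items = items;
        this.pos = start;
    }

    boolean isValid() {
        return pos >= 0 && pos < items.size();
    }

    /** Steps forward; an invalid cursor stays invalid. */
    void moveToNext() {
        if (isValid()) {
            pos++;
        }
    }

    /** Steps backward; an invalid cursor stays invalid. */
    void moveToPrevious() {
        if (isValid()) {
            pos--;
        }
    }

    T get() {
        if (!isValid()) {
            throw new NoSuchElementException("cursor is not valid");
        }
        return items.get(pos);
    }

    /** Displacement: n > 0 iterates moveToNext, n < 0 iterates moveToPrevious. */
    CursorSketch<T> shifted(int n) {
        for (int i = 0; i < Math.abs(n); i++) {
            if (n > 0) {
                moveToNext();
            } else {
                moveToPrevious();
            }
        }
        return this;
    }
}
```

Under this reading the displacement never consults another index or type; it only moves the starting point, and running off either end leaves the cursor invalid.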

9) Re: supporting enhanced for loop: I agree it would be good to support this.  I think the
implementation could do this, so your example would work. It would switch to the "stream"
alternative only if some stream operation was specified.
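
One way this could work, sketched under the assumption that the builder object itself implements Iterable (IterableSelect is a hypothetical name, and a plain List stands in for the index): the enhanced for loop uses the iterator directly, and a Stream is only created when a stream operation is called.

```java
import java.util.Iterator;
import java.util.List;
import java.util.function.Predicate;
import java.util.stream.Stream;
import java.util.stream.StreamSupport;

/** Hypothetical builder used only for illustration; not part of UIMA. */
final class IterableSelect<T> implements Iterable<T> {
    private final List<T> items;  // stand-in for the selected index contents

    IterableSelect(List<T> items) {
        this.items = items;
    }

    /** Used by the enhanced for loop; no Stream is created on this path. */
    @Override
    public Iterator<T> iterator() {
        return items.iterator();
    }

    /** Switches to the stream alternative on request. */
    Stream<T> stream() {
        return StreamSupport.stream(spliterator(), false);
    }

    /** A stream operation called directly on the builder implies stream(). */
    Stream<T> filter(Predicate<? super T> predicate) {
        return stream().filter(predicate);
    }
}
```

Usage would then read for (String s : sel) { ... } for plain iteration, and sel.filter(...) when stream processing is wanted.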

> JFSIndexRepository should be enhanced with new generic methods
> --------------------------------------------------------------
>
>                 Key: UIMA-1524
>                 URL: https://issues.apache.org/jira/browse/UIMA-1524
>             Project: UIMA
>          Issue Type: Improvement
>          Components: Core Java Framework
>    Affects Versions: 2.3
>            Reporter: Joern Kottmann
>
> Existing methods should be overloaded with an additional Class argument to specify the
> exact return type. This change makes downcasting of returned objects unnecessary.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
