uima-dev mailing list archives

From Marshall Schor <...@schor.com>
Subject Re: question about uimaFIT impl of select following / preceding
Date Sat, 01 Oct 2016 14:00:50 GMT
ok, I'll default it that way.

I see that this mode (strict, or endWithinBounds) was also used for the
"between" and "covered" impl, so I'll make that the implied default when these
are specified.

-Marshall


On 9/30/2016 5:48 PM, Richard Eckart de Castilho wrote:
> The idea was to obtain annotations that precede the "pos" annotation
> in terms of offsets. So if you superimpose the "pos" annotation
> and another candidate annotation "c" on the text, then "c" and "pos"
> should not overlap. 
>
> I think the method so far has been mostly (probably exclusively) used
> on annotations that do not overlap (e.g. Token). I think selectPreceding
> in uimaFIT has a bug here - as you point out - in that, while iterating
> backwards, the end positions of the annotations are not checked before
> including them in the result set.
>
> Cheers,
>
> -- Richard
>
>> On 30.09.2016, at 21:38, Marshall Schor <msa@schor.com> wrote:
>>
>> The implementation appears to take a positioning element (let's call it "pos"),
>> and then start at an annotation where
>>
>>  * for following: the begin is > pos.end
>>  * for preceding: the end is < pos.begin
>>
>> The Annotation index guarantees that "begin" positions are in ascending order as
>> you go thru the index.
>>
>> But there's no such guarantee for "end" positions.  They can jump back and
>> forth.  So the logic for preceding: starting with some annotation whose end is <
>> pos.begin in no way guarantees that all the found annotations will have this
>> property.
>>
>> What was the design intent for preceding?
>>
>> If it was to get FSs that precede the pos in the index, then this logic is not
>> doing that - it's initially skipping some number of preceding items (where the
>> preceding FSs have ends >= pos.begin), and then providing results where the end
>> could easily bounce around the pos.begin position.
>>
>> -Marshall
>>
>

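The behaviour discussed in the quoted messages can be illustrated with a small, self-contained Java sketch. It does not use the UIMA or uimaFIT APIs: `Span`, `naivePreceding`, and `checkedPreceding` are hypothetical names invented for this example, and the list is sorted by begin ascending with ties broken by end descending as a stand-in for the annotation index order. The sketch also assumes that offsets are half-open, so a candidate whose end equals pos.begin counts as non-overlapping (the original message states the condition as end < pos.begin). The naive variant starts at the first candidate, going backwards, that ends before "pos" and then accepts everything it meets; the checked variant tests the end offset of every candidate.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

public class PrecedingSketch {

    /** Hypothetical stand-in for an annotation: begin/end character offsets. */
    public static final class Span {
        public final int begin;
        public final int end;
        public Span(int begin, int end) { this.begin = begin; this.end = end; }
        @Override public String toString() { return "[" + begin + "," + end + ")"; }
    }

    /** Index-like order (an assumption): begin ascending, ties by end descending. */
    static final Comparator<Span> INDEX_ORDER = (a, b) ->
            a.begin != b.begin ? Integer.compare(a.begin, b.begin)
                               : Integer.compare(b.end, a.end);

    /**
     * Mimics the behaviour described in the thread: walk backwards, start at the
     * first span that ends at or before pos.begin, then accept every earlier span
     * without checking its end offset.
     */
    public static List<Span> naivePreceding(List<Span> spans, Span pos, int count) {
        List<Span> index = new ArrayList<>(spans);
        index.sort(INDEX_ORDER);
        List<Span> result = new ArrayList<>();
        boolean started = false;
        for (int i = index.size() - 1; i >= 0 && result.size() < count; i--) {
            Span s = index.get(i);
            if (!started && s.end <= pos.begin) {
                started = true;
            }
            if (started) {
                result.add(0, s); // prepend so the result stays in index order
            }
        }
        return result;
    }

    /** Checks the end offset of every candidate, so no result overlaps pos. */
    public static List<Span> checkedPreceding(List<Span> spans, Span pos, int count) {
        List<Span> index = new ArrayList<>(spans);
        index.sort(INDEX_ORDER);
        List<Span> result = new ArrayList<>();
        for (int i = index.size() - 1; i >= 0 && result.size() < count; i--) {
            Span s = index.get(i);
            if (s.end <= pos.begin) {
                result.add(0, s);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Span pos = new Span(10, 12);
        // [0,20) begins before pos but also covers it, so it does not precede pos.
        List<Span> spans = Arrays.asList(
                new Span(5, 7), new Span(0, 20), new Span(2, 4), pos);
        System.out.println("naive:   " + naivePreceding(spans, pos, 3));
        System.out.println("checked: " + checkedPreceding(spans, pos, 3));
    }
}
```

With non-overlapping annotations such as tokens the two variants return the same spans, which fits the observation above that the problem went unnoticed. They differ only when a long annotation that begins early also covers "pos".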
