uima-dev mailing list archives

From Richard Eckart de Castilho <...@apache.org>
Subject How to properly update offsets of an annotation?
Date Wed, 13 Aug 2014 09:33:42 GMT
Hi all,

I am facing a very odd situation with the following type of (pseudo-)code:

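def previousToken = null;
def toDelete = [];
for (def token : select(jcas, Token)) {
  if (previousToken && isName(previousToken, token)) {
    // Merge: extend the current token back to the start of the previous one
    token.setBegin(previousToken.getBegin());
    // The previous token is now covered, so mark it for removal
    toDelete.add(previousToken);
  }
  previousToken = token;
}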

for (def token : toDelete) {
  token.removeFromIndexes();
}

Depending on the text in the CAS, some of the tokens in
toDelete are not removed by the second loop and remain
in the CAS.

I tried a different approach in which I also record the
tokens whose begin offset was updated (toReindex) and
then re-index them:

for (def token : toReindex) {
  token.removeFromIndexes();
  token.addToIndexes();
}

That seems to flip the situation: a token that was
previously removed correctly now remains, and a token
that previously remained is now removed.
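
The following sketch reproduces a similar symptom outside UIMA.
It is an illustration under stated assumptions, not a confirmed
diagnosis: Span is a hypothetical stand-in for Token, and a
java.util.TreeSet ordered by begin (ascending) and then end
(descending) stands in for the annotation index. UIMA's real
index implementation is different. The sketch changes the sort
key of an element while it is still stored and then tries to
remove its predecessor, and compares that with removing the
element before the change and adding it back afterwards.

```java
import java.util.Comparator;
import java.util.TreeSet;

class OffsetIndexDemo {
    // Hypothetical stand-in for a Token annotation: just a pair of offsets.
    static final class Span {
        int begin;
        final int end;
        Span(int begin, int end) { this.begin = begin; this.end = end; }
    }

    // Assumed ordering: begin ascending, then end descending.
    static final Comparator<Span> ORDER =
        Comparator.comparingInt((Span s) -> s.begin)
                  .thenComparing(Comparator.comparingInt((Span s) -> s.end).reversed());

    static TreeSet<Span> newIndex(Span... spans) {
        TreeSet<Span> index = new TreeSet<>(ORDER);
        for (Span s : spans) {
            index.add(s);
        }
        return index;
    }

    // Changes the sort key while the element is still stored,
    // then tries to remove the predecessor. Returns whether it was removed.
    static boolean mutateInPlace() {
        Span first = new Span(0, 5);
        Span second = new Span(6, 10);
        Span third = new Span(11, 15);
        TreeSet<Span> index = newIndex(first, second, third);
        second.begin = first.begin;   // the set is not told about this change
        return index.remove(first);   // the lookup compares against the changed key
    }

    // Removes the element first, changes the key, adds it back,
    // then removes the predecessor. Returns whether it was removed.
    static boolean removeMutateReadd() {
        Span first = new Span(0, 5);
        Span second = new Span(6, 10);
        Span third = new Span(11, 15);
        TreeSet<Span> index = newIndex(first, second, third);
        index.remove(second);
        second.begin = first.begin;
        index.add(second);
        return index.remove(first);
    }

    public static void main(String[] args) {
        System.out.println("mutate in place, removed: " + mutateInPlace());
        System.out.println("remove, mutate, re-add, removed: " + removeMutateReadd());
    }
}
```

With these three spans, the first variant should report that the
element was not removed and the second that it was. Whether the
in-place variant fails depends on where the changed element sits
in the tree, which would fit a symptom that only shows up for
some texts.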

I would like to avoid having to create a new token annotation
with new offsets and then delete both the old annotations.

If need be, I can probably set up a minimal test case, but
before that, maybe somebody could give me a clue...

Cheers!

-- Richard
