uima-dev mailing list archives

From Peter Klügl (JIRA) <...@uima.apache.org>
Subject [jira] [Commented] (UIMA-3927) Problem with optional quantifiers and starting rule element annotation
Date Wed, 02 Jul 2014 08:14:24 GMT

    [ https://issues.apache.org/jira/browse/UIMA-3927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14049725#comment-14049725 ]

Peter Klügl commented on UIMA-3927:

Ok, there are two problems: a bug in the reluctant question-mark quantifier (??), and the way
reluctant quantifiers work in general. Reluctant quantifiers do not match if they don't have to,
and whether they have to is controlled by the next rule element. In your example there is no
such next rule element, which results in no match. Actually, the second rule element should not
match at all either. Is there a specific reason why you are using the reluctant version of the
quantifier? Something like
Token?{REGEXP(Token.posTag.value, "At")} // Article
Token?{REGEXP(Token.posTag.value, "Aj")} // Adjective
@Token{REGEXP(Token.posTag.value, "No")->MARK(Chunk, 1,3)}; // Noun
should work for you since you actually want to match the optional annotations.
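
As a loose analogy only, the difference between the reluctant (??) and the greedy (?) optional quantifier can be sketched with plain Python regular expressions. This is not Ruta and says nothing about how Ruta is implemented; the reversed tag string "No Aj At" is an illustrative assumption that mimics matching outward from the noun, so that the optional elements come last and have no following element that could force them to match.

```python
import re

# Analogy only: tags are listed starting at the noun, so the optional
# elements come last and nothing after them forces a match.
tags = "No Aj At"  # noun, adjective, article (reversed order, assumed for illustration)

# Reluctant optional groups: prefer zero occurrences when nothing requires more.
reluctant = re.match(r"No( Aj)??( At)??", tags)

# Greedy optional groups: prefer one occurrence whenever it is available.
greedy = re.match(r"No( Aj)?( At)?", tags)

print(repr(reluctant.group(0)))  # only the noun is covered
print(repr(greedy.group(0)))     # noun plus both optional elements
```

In this sketch the reluctant pattern covers only "No", while the greedy pattern covers all three tags, which corresponds to the behavior wanted for the chunk rule.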

> Problem with optional quantifiers and starting rule element annotation
> ----------------------------------------------------------------------
>                 Key: UIMA-3927
>                 URL: https://issues.apache.org/jira/browse/UIMA-3927
>             Project: UIMA
>          Issue Type: Bug
>          Components: ruta
>    Affects Versions: 2.2.0ruta
>            Reporter: Prokopis Prokopidis
>            Assignee: Peter Klügl
> Hi,
> As the Ruta documentation mentions, "writing rules that contain a first rule element
> with an optional quantifier is discouraged and will result in ignoring the optional attribute
> of the quantifier." A solution for overcoming this is to declare a rule element as a starting
> rule element by adding “@” directly in front of it. Thus, I am using Ruta rules like
> {code}
> Token??{REGEXP(Token.posTag.value, "At")} // Article
> Token??{REGEXP(Token.posTag.value, "Aj")} // Adjective
> @Token{REGEXP(Token.posTag.value, "No")->MARK(Chunk, 1,3)}; // Noun
> {code}
> to mark nouns and optional pre-modifiers before them as chunks.
> However, the rule seems to match only Adj Noun sequences and does not match input like:
> {code}
> anArt|At anAdj|Aj aNoun|No
> {code}
> Thanks for looking into this.

This message was sent by Atlassian JIRA
