uima-dev mailing list archives

From Peter Klügl (JIRA) <...@uima.apache.org>
Subject [jira] [Commented] (UIMA-5723) MARKTABLE fails to assign feature for single word entry in first CSV column
Date Fri, 16 Feb 2018 13:20:00 GMT

    [ https://issues.apache.org/jira/browse/UIMA-5723?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16367288#comment-16367288 ]

Peter Klügl commented on UIMA-5723:
-----------------------------------

I was not able to reproduce the problem. I used the CSV file itself as input and both annotations
had feature values. On which input did you observe the problem? Did you change the visibility
settings?

The problem normally occurs if the visibility or the filtered chars are modified in some way
during the lookup. Assigning the features requires an additional lookup step in which not all
of that functionality is available. That is suboptimal, to say the least.
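
As an illustration only, a script in which the visibility is changed before the lookup could
look like the sketch below. It reuses the table and type names from the issue description. The
RETAINTYPE(SPACE) line and the inline DECLARE are assumptions added for this example and are
not known to be part of the reporter's script.

{code:java}
// Hypothetical sketch, not the reporter's actual script.
WORDTABLE LawNameTable = 'nl_law_names.csv';
// Declared inline only to keep the sketch self-contained;
// in the issue the type comes from a type system description.
DECLARE Annotation WetNaam (STRING WetIdentifier);
// Assumed visibility change: make SPACE tokens visible to the following rules.
Document{-> RETAINTYPE(SPACE)};
// Lookup as in the issue: column 1 is matched, column 2 fills the feature.
Document{-> MARKTABLE(WetNaam, 1, LawNameTable, "WetIdentifier" = 2)};
{code}

If the feature is filled when the RETAINTYPE line is removed but stays empty with it, the
visibility setting is the likely cause.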

I personally do not use WORDLISTs and WORDTABLEs anymore, but I have not yet contributed the
alternative to UIMA Ruta. I really should catch up on that.

 

> MARKTABLE fails to assign feature for single word entry in first CSV column
> ---------------------------------------------------------------------------
>
>                 Key: UIMA-5723
>                 URL: https://issues.apache.org/jira/browse/UIMA-5723
>             Project: UIMA
>          Issue Type: Bug
>          Components: Ruta
>    Affects Versions: 2.6.1ruta
>            Reporter: Andreas Thiel
>            Assignee: Peter Klügl
>            Priority: Major
>
> When using Ruta's MARKTABLE action with a CSV file {{nl_law_names.csv}} like this
> {code:xml}
> WAZ;WAZELF
> Wet arbeidsongeschiktheidsverzekering zelfstandigen;WAZELF
> {code}
> and corresponding Ruta script containing these lines
> {code:java}
> WORDTABLE LawNameTable = 'nl_law_names.csv';
> Document{->MARKTABLE(WetNaam, 1, LawNameTable, "WetIdentifier" = 2)};
> {code}
> it seems that the text {{WAZ}} is detected, but the {{WetIdentifier}} feature of the
> resulting annotation is not filled by the string following the semicolon. Instead, it remains
> empty.
> (Note: the _WetNaam_ annotation is defined elsewhere via a type system description)
> In contrast, the fully written name {{Wet arbeidsongeschiktheidsverzekering zelfstandigen}}
> is detected and processed as expected, with feature WetIdentifier = WAZELF after annotating.
> Could it be that problems arise when only a single word (i.e. no spaces or uppercase
> letters following lowercase chars) is present in the first column in the CSV file? Or is it
> a matter of configuration?
> We also experimented with the optional arguments of MARKTABLE regarding uppercase/lowercase
> distinction, but to no avail.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
