uima-dev mailing list archives

From Peter Klügl (JIRA) <...@uima.apache.org>
Subject [jira] [Commented] (UIMA-5723) MARKTABLE fails to assign feature for single word entry in first CSV column
Date Wed, 21 Feb 2018 08:51:00 GMT

    [ https://issues.apache.org/jira/browse/UIMA-5723?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16371103#comment-16371103 ]

Peter Klügl commented on UIMA-5723:
-----------------------------------

Yes, tables and primitive feature assignments are supported, but dynamic reloading
of the dictionaries is not supported yet.

I have no idea yet why the type priorities could cause this. The lookup operates on RutaBasic
and there is no sequential pattern... Do you have a minimal example to reproduce it?

> MARKTABLE fails to assign feature for single word entry in first CSV column
> ---------------------------------------------------------------------------
>
>                 Key: UIMA-5723
>                 URL: https://issues.apache.org/jira/browse/UIMA-5723
>             Project: UIMA
>          Issue Type: Bug
>          Components: Ruta
>    Affects Versions: 2.6.1ruta
>            Reporter: Andreas Thiel
>            Assignee: Peter Klügl
>            Priority: Major
>
> When using Ruta's MARKTABLE action with a CSV file {{nl_law_names.csv}} like this
> {code:xml}
> WAZ;WAZELF
> Wet arbeidsongeschiktheidsverzekering zelfstandigen;WAZELF
> {code}
> and corresponding Ruta script containing these lines
> {code:java}
> WORDTABLE LawNameTable = 'nl_law_names.csv';
> Document{->MARKTABLE(WetNaam, 1, LawNameTable, "WetIdentifier" = 2)};
> {code}
> it seems that the text {{WAZ}} is detected, but the {{WetIdentifier}} feature of the
> resulting annotation is not filled by the string following the semicolon. Instead, it remains
> empty.
> (Note: _WetNaam_ annotation is defined elsewhere via type system description)
> In contrast, the fully written name {{Wet arbeidsongeschiktheidsverzekering zelfstandigen}}
> is detected and processed as expected, with feature WetIdentifier = WAZELF after annotating.
> Could it be that problems arise when only a single word (i.e. no spaces or uppercase
> letters following lowercase chars) is present in the first column in the CSV file? Or is it
> a matter of configuration?
> We experimented also with the optional arguments of MARKTABLE regarding uppercase/lowercase
> distinction, but to no avail.
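
To make the expected outcome in the report concrete, here is a small Python sketch of the lookup the reporter describes: column 1 of the CSV is matched in the text and column 2 supplies the WetIdentifier value. It is only an illustration of the expected result, not Ruta's implementation. The helper names (load_table, expected_annotations) and the sample sentence are hypothetical, and the naive substring match stands in for Ruta's token-based lookup on RutaBasic.

```python
import csv
import io

# Contents of nl_law_names.csv as quoted in the issue.
CSV_TEXT = (
    "WAZ;WAZELF\n"
    "Wet arbeidsongeschiktheidsverzekering zelfstandigen;WAZELF\n"
)


def load_table(csv_text, key_column=1, value_column=2):
    """Map each key_column entry to its value_column entry (1-based, like MARKTABLE)."""
    reader = csv.reader(io.StringIO(csv_text), delimiter=";")
    return {row[key_column - 1]: row[value_column - 1] for row in reader if row}


def expected_annotations(text, table):
    """Return (covered text, WetIdentifier) pairs one would expect for the text.

    Hypothetical helper: plain case-sensitive substring search, used here
    only to show which feature value each match should carry.
    """
    return [(name, ident) for name, ident in table.items() if name in text]


table = load_table(CSV_TEXT)
# Hypothetical sample sentence containing the single-word entry.
print(expected_annotations("De WAZ geldt hier.", table))
```

Under these assumptions the single-word entry WAZ and the fully written name both resolve to WAZELF, which is the behaviour the report expects from MARKTABLE.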



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
