uima-dev mailing list archives

From "Andreas Thiel (JIRA)" <...@uima.apache.org>
Subject [jira] [Created] (UIMA-5723) MARKTABLE fails to assign feature for single word entry in first CSV column
Date Fri, 09 Feb 2018 13:13:00 GMT
Andreas Thiel created UIMA-5723:

             Summary: MARKTABLE fails to assign feature for single word entry in first CSV column
                 Key: UIMA-5723
                 URL: https://issues.apache.org/jira/browse/UIMA-5723
             Project: UIMA
          Issue Type: Bug
          Components: Ruta
    Affects Versions: 2.6.1ruta
            Reporter: Andreas Thiel

When using Ruta's MARKTABLE action with a CSV file {{nl_law_names.csv}} like this
Wet arbeidsongeschiktheidsverzekering zelfstandigen;WAZELF
and corresponding Ruta script containing these lines
WORDTABLE LawNameTable = 'nl_law_names.csv';
Document{->MARKTABLE(WetNaam, 1, LawNameTable, "WetIdentifier" = 2)};
it seems that the text {{WAZ}} is detected, but the {{WetIdentifier}} feature of the resulting
annotation is not filled by the string following the semicolon. Instead, it remains empty.

(Note: _WetNaam_ annotation is defined elsewhere via type system description)
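
For reproduction without a separate type system description, a minimal self-contained sketch could declare the type directly in the script. The {{DECLARE}} line is an assumption for illustration only; it presumes that {{WetIdentifier}} is a STRING feature and is not taken from the actual type system:

{code}
// Assumption: WetNaam is declared inline here instead of via the external type system description
DECLARE Annotation WetNaam(STRING WetIdentifier);
WORDTABLE LawNameTable = 'nl_law_names.csv';
// Mark entries of column 1 and copy column 2 into the WetIdentifier feature
Document{-> MARKTABLE(WetNaam, 1, LawNameTable, "WetIdentifier" = 2)};
{code}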

In contrast, the fully written name {{Wet arbeidsongeschiktheidsverzekering zelfstandigen}}
is detected and processed as expected, with feature WetIdentifier = WAZELF after annotating.

Could it be that problems arise when only a single word (i.e. no spaces or uppercase letters
following lowercase chars) is present in the first column in the CSV file? Or is it a matter
of configuration?

We also experimented with the optional arguments of MARKTABLE regarding uppercase/lowercase
distinction, but to no avail.

This message was sent by Atlassian JIRA