uima-dev mailing list archives

From "Oleg Fedoriaka (JIRA)" <...@uima.apache.org>
Subject [jira] [Created] (UIMA-4453) MARKTABLE action works improperly
Date Wed, 10 Jun 2015 07:47:00 GMT
Oleg Fedoriaka created UIMA-4453:
------------------------------------

             Summary: MARKTABLE action works improperly
                 Key: UIMA-4453
                 URL: https://issues.apache.org/jira/browse/UIMA-4453
             Project: UIMA
          Issue Type: Bug
          Components: ruta
    Affects Versions: 2.3.0ruta
         Environment: OS X 10.9.1, Java v8u45, Eclipse Luna
                      Windows 7, Java v8u45, Eclipse Luna
            Reporter: Oleg Fedoriaka
             Fix For: 2.2.0ruta


The newly released UIMA Ruta Runtime & Workbench 2.3 for Eclipse no longer handles the
MARKTABLE action correctly.

The action no longer annotates all words from a CSV file. I noticed that the problem occurs
only for Cyrillic entries that contain spaces; Latin entries work fine. Please use the sample
below to reproduce the problem.

# script/main.ruta
WORDTABLE Dict = 'dict.csv';
DECLARE Annotation Test (STRING meaning);
Document {-> MARKTABLE(Test,1,Dict, "meaning" = 2)};

# resources/dict.csv
від;from
с какой стати;why
с которой;fromWhich
сюда;here
по какому;which
сюди;here
как нибудь;somehow
сколько;howMuch

# input/test.txt
від с какой стати с которой сюда по какому сюди как нибудь сколько

After executing the main.ruta script, not everything in test.txt is annotated. It is worth
mentioning that a Cyrillic letter such as 'с' at the beginning of a string somehow affects the
processing behavior. Moreover, removing the lines that contain spaces makes the issue
disappear.
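
For reference, the expected result can be sketched outside of Ruta. The Python snippet below is
only an illustration under stated assumptions: it is not part of the Ruta script or the Ruta
API, the helper name expected_annotations is hypothetical, and it assumes whole-word matching
with arbitrary whitespace between the words of an entry. It lists the spans that MARKTABLE
would be expected to annotate for the sample dictionary and input.

```python
# Illustrative sketch, independent of UIMA Ruta: lists the dictionary entries
# that occur in the input text, i.e. the spans MARKTABLE is expected to annotate.
import re

DICT_CSV = """від;from
с какой стати;why
с которой;fromWhich
сюда;here
по какому;which
сюди;here
как нибудь;somehow
сколько;howMuch"""

TEXT = "від с какой стати с которой сюда по какому сюди как нибудь сколько"


def expected_annotations(dict_csv, text):
    """Return sorted (begin, end, covered_text, meaning) tuples for every dictionary hit."""
    hits = []
    for row in dict_csv.splitlines():
        entry, meaning = row.split(";")
        # Assumption: match whole words only and allow any whitespace
        # between the words of a multi-word entry.
        words = [re.escape(word) for word in entry.split()]
        pattern = r"(?<!\w)" + r"\s+".join(words) + r"(?!\w)"
        for match in re.finditer(pattern, text):
            hits.append((match.start(), match.end(), match.group(), meaning))
    return sorted(hits)


for begin, end, covered, meaning in expected_annotations(DICT_CSV, TEXT):
    print(begin, end, covered, meaning)
```

Under these assumptions every one of the eight dictionary entries is found once in the input,
including the multi-word entries that begin with 'с', so eight Test annotations would be
expected.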



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
