ctakes-dev mailing list archives

From Roberto Costumero Moreno <roberto.costum...@upm.es>
Subject Re: cTAKES Translation
Date Tue, 19 Nov 2013 16:32:49 GMT
Hi Pei,

Thank you very much for your answer.

I am looking for good corpora, and my group is considering building a new one, to train the
ML-based models. I will also look into the hard-coded rules in order to change them.

AFAIK, the UMLS has a subset of its terms translated into Spanish, and these correspond to
the ones in the Spanish version of SNOMED CT.
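
As a rough sketch of how that Spanish subset could be pulled out of a UMLS release: the
MRCONSO.RRF file is pipe-delimited and carries a language column (LAT), so Spanish rows can
be selected with LAT = SPA and optionally narrowed to the Spanish SNOMED CT source (SAB =
SCTSPA). The function name spanish_terms, the helper _row, and the sample rows are
hypothetical and hand-made for illustration; they are not copied from a UMLS release, and
this is not how cTAKES itself loads its dictionary.

```python
# Column positions in MRCONSO.RRF (UMLS Rich Release Format, pipe-delimited).
CUI, LAT, SAB, STR = 0, 1, 11, 14


def spanish_terms(lines, source=None):
    """Yield (CUI, source vocabulary, term) for rows whose language is Spanish.

    `lines` is any iterable of MRCONSO.RRF rows; `source` optionally restricts
    the result to one source vocabulary, e.g. "SCTSPA".
    """
    for line in lines:
        fields = line.rstrip("\n").split("|")
        if len(fields) <= STR:
            continue  # skip malformed or truncated rows
        if fields[LAT] != "SPA":
            continue
        if source is not None and fields[SAB] != source:
            continue
        yield fields[CUI], fields[SAB], fields[STR]


def _row(cui, lat, sab, term):
    """Build an illustrative 18-column row (hypothetical helper for the demo)."""
    fields = [""] * 18
    fields[CUI], fields[LAT], fields[SAB], fields[STR] = cui, lat, sab, term
    return "|".join(fields) + "|"


# Hand-made sample rows, for illustration only.
sample = [
    _row("C0011849", "ENG", "SNOMEDCT_US", "Diabetes mellitus"),
    _row("C0011849", "SPA", "SCTSPA", "diabetes mellitus"),
    _row("C0011849", "SPA", "MSHSPA", "Diabetes Mellitus"),
]

for cui, sab, term in spanish_terms(sample, source="SCTSPA"):
    print(cui, sab, term)
```

With a real release, the same function could be fed an open file handle on MRCONSO.RRF
instead of the sample list.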

I will share my questions and my progress here as I work on getting cTAKES running in
Spanish and, hopefully, other languages.


Roberto Costumero Moreno
Laboratorio de Minería de Datos y Simulación (MIDAS)
Centro de Tecnología Biomédica
Universidad Politecnica de Madrid
Tlf: +34 91 336 4664

On 15/11/2013, at 14:49, Chen, Pei <Pei.Chen@childrens.harvard.edu> wrote:

> Hi Roberto,
> Welcome!  
> In theory, in order to have cTAKES work in a different language, we would just need to:
> -Retrain the existing ML-based models for the language and code should just work as is
> -Update any hard-coded rules
> -Use the Spanish dictionary for concepts (I believe UMLS already has a Spanish translation
> for some of their thesauruses).
> I think it would be awesome to have cTAKES work with multiple languages including Spanish!
> Actually, a lot of folks have been asking about cTAKES models in different languages.
> The challenging thing with the supervised machine learning methods is that we'll have
> to rely on local domain experts to create the gold standard for training.
> There is a group that may be contributing retrained models for cTAKES to work in French.
> Others can feel free to chime in...
> --Pei
>> -----Original Message-----
>> From: Roberto Costumero Moreno [mailto:roberto.costumero@upm.es]
>> Sent: Thursday, November 14, 2013 5:43 AM
>> To: dev@ctakes.apache.org
>> Subject: cTAKES Translation
>> Hello everyone,
>> My name is Roberto Costumero and I am doing my Ph.D. studies at the Technical
>> University of Madrid in Spain. I am new to this list, so I am introducing myself
>> and posting some questions I have.
>> We are currently involved in a project with several hospitals, and we are working
>> closely with them to understand their needs in order to build an application that
>> lets them use the knowledge in their clinical notes and imaging, among other things.
>> We have been looking at different projects to see which one fits our needs and,
>> of course, which one we will share our research with. Among the different projects
>> we have seen in the field of clinical text analysis, we think that cTAKES is the
>> best one out there, and it is very well structured and organized. The main problem
>> we are facing is that every clinical text-based NLP project is developed for
>> English, and we will be working with Spanish texts.
>> We have already done some work testing different algorithms, adapting them to
>> Spanish to detect negation and context dependency, but we would like to use a
>> well-tested, complete framework. That is why we thought of cTAKES, and I have a
>> few questions for you.
>> - Does anyone know if someone is already working on translating cTAKES
>> modules to work with other languages (Spanish in particular)?
>> - Do you think it would be very difficult to do it because of any architectural
>> design I am not currently aware of?
>> - Do you think working together on extending cTAKES to Spanish would be a good
>> line of development for the cTAKES project?
>> Thank you very much in advance for your help.
>> Sincerely,
>> --
>> Roberto Costumero Moreno
>> Laboratorio de Minería de Datos y Simulación (MIDAS)
>> Centro de Tecnología Biomédica
>> Universidad Politecnica de Madrid
>> roberto.costumero@upm.es
>> Tlf: +34 91 336 4664
