uima-dev mailing list archives

From Peter Klügl <pklu...@uni-wuerzburg.de>
Subject Re: UIMA for travel emails
Date Tue, 18 Jun 2013 14:27:28 GMT
On 18.06.2013 14:05, Rukku wrote:
> We are new to the UIMA framework.
>
> We are studying UIMA to see if we can use it to parse and extract information
> from travel-related emails (confirmation, cancellation). The information can be
> passenger names, itinerary, flight details, etc., which we want to output as XML.
>
> We tried using UIMA and ended up using just the Regex components, which we
> thought we could have achieved with plain Java libraries.
>
> Any help in giving us some direction will be greatly appreciated.

A solution for this task depends (in my opinion) mainly on the 
properties of the input and on whether labeled data is available. It is 
not really a question of architecture.

Some (incomplete) thoughts about UIMA-based approaches:
- You could train a CRF or something similar with ClearTK [1] if you 
have enough labeled data.
- For simple NER, there are some models provided by DKPro [2].
- If you want to define some rules or patterns, then there is UIMA Ruta 
(Rule-based Text Annotation) [3].
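
To make the rule-based option concrete, here is a minimal sketch of 
pattern-based extraction with XML output. It uses plain Python regular 
expressions, not Ruta syntax, and the "Passenger:" and "Flight:" labels, 
the element names and the sample text are all assumptions for 
illustration; real confirmation emails vary widely between providers.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical patterns: they assume labeled lines such as
# "Passenger: Jane Doe" and "Flight: LH 123", which real emails may not have.
PASSENGER = re.compile(r"^Passenger:\s*(.+)$", re.MULTILINE)
FLIGHT = re.compile(r"^Flight:\s*([A-Z]{2})\s?(\d{1,4})\b", re.MULTILINE)


def extract(text):
    """Return an XML element holding the passengers and flights found in text."""
    root = ET.Element("booking")
    for name in PASSENGER.findall(text):
        ET.SubElement(root, "passenger").text = name.strip()
    for carrier, number in FLIGHT.findall(text):
        ET.SubElement(root, "flight", carrier=carrier, number=number)
    return root


# Made-up sample input, not taken from a real confirmation email.
sample = "Passenger: Jane Doe\nFlight: LH 123 from FRA to JFK\n"
print(ET.tostring(extract(sample), encoding="unicode"))
```

Patterns like these only hold up while the input format is stable, which 
is why the properties of the input matter more than the framework.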

Best,

Peter

[1] https://code.google.com/p/cleartk/
[2] https://docs.google.com/spreadsheet/pub?key=0ApGcdapz0xSYdGh2azY2ODMtZDRNczUySEZJUFpXM2c&single=true&gid=0&output=html
[3] http://uima.apache.org/ruta.html


> Regards,

