ctakes-dev mailing list archives

From "Chen, Pei" <Pei.C...@childrens.harvard.edu>
Subject RE: Recommendation for ctakes default (UMLS) dictionaries
Date Wed, 10 Sep 2014 17:55:15 GMT
That would be pretty cool.
Currently, all of the pre-built cTAKES dictionaries are in Maven Central; we can add more
as there are more contributions:

Agreed that it would be nice if there were an apt-get or similar install that downloads and
unpacks for each use case...

> -----Original Message-----
> From: andy mcmurry [mailto:mcmurry.andy@gmail.com]
> Sent: Tuesday, September 09, 2014 5:33 PM
> To: ctakes-dev@incubator.apache.org
> Subject: Recommendation for ctakes default (UMLS) dictionaries
> Greetings ctakes-dev:
> UMLS license restrictions have been getting more lax over the years --
> much of the UMLS can be downloaded directly from the NCBI official FTP
> site.
> In fact, the NIH (and implicitly the NLM) *have already made the standard
> terms public for some medical specialities*.
> For example: Here is the UMLS subset specific to Medical Genetics
> (MedGen) and Genetic Testing (GTR) complete with SNOMED-CT concept
> CUI(s) and names, etc.:
> [  ftp://ftp.ncbi.nlm.nih.gov/pub/medgen/README.html  ]
> My team has developed a JVM based wrapper for MetaMap 2013AB which I
> intend to open source soon (Clojure).  It includes REST support for invoking
> MetaMap with any or all of the command line arguments.
> We do not integrate with UIMA, we are basically a wrapper around the
> binary installation of MetaMap. The emphasis is on publication text, not
> clinical text; still, some services are common (such as LVG).
> Strangely, the NLM still requires UMLS licenses to download MetaMap
> execution binaries. The MetaMap binary install is better, but customizing
> dictionaries (DataFileBuilder) is not as easy as with cTAKES and YTEX
> [ https://cwiki.apache.org/confluence/display/CTAKES/YTEX+Installation ]
> *Hence, there is a real opportunity here to enable Apache cTAKES to have
> a stronger default dictionary.*
> Imagine if we could
>     $ apt-get install apache-ctakes
> and instantly have a working package for SOME problem domain.
> In my case (Medical Genetics) the UMLS definitions are already available and
> the UMLS license problem becomes a non-issue, at least for many first-time
> users.
> Your thoughts?
> AndyMC