ctakes-dev mailing list archives

From "Savova, Guergana" <Guergana.Sav...@childrens.harvard.edu>
Subject RE: Phenotype-specific entities
Date Wed, 15 Feb 2017 18:45:35 GMT
Hi Erin,
Yes, creating your own customized dictionary is the way to go. You can prune by semantic types
of interest and then remove branches that are not relevant to your specific phenotype. I am
not aware of a cTAKES tool that builds such a highly customized dictionary.
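
A minimal sketch of the pruning idea above, in Python. It assumes the dictionary has already been exported as rows of (CUI, TUI, term); the Entry type, the prune function, and the CUI values are hypothetical placeholders for illustration, not real UMLS identifiers and not part of cTAKES. The TUIs are UMLS semantic types (T047 Disease or Syndrome, T184 Sign or Symptom, T121 Pharmacologic Substance).

```python
from typing import Iterable, List, NamedTuple, Set


class Entry(NamedTuple):
    cui: str   # concept identifier (placeholder values below, not real UMLS CUIs)
    tui: str   # UMLS semantic type identifier
    term: str  # surface string to match in text


def prune(entries: Iterable[Entry], keep_tuis: Set[str], drop_cuis: Set[str]) -> List[Entry]:
    """Keep entries whose semantic type is wanted and whose concept is not excluded."""
    return [e for e in entries if e.tui in keep_tuis and e.cui not in drop_cuis]


# Toy rows for illustration only.
entries = [
    Entry("C_ASTHMA", "T047", "asthma"),
    Entry("C_DIABETES", "T047", "diabetes mellitus"),
    Entry("C_WHEEZE", "T184", "wheezing"),
    Entry("C_ASPIRIN", "T121", "aspirin"),
]

# Step 1: keep only diseases and symptoms. Step 2: drop a branch unrelated to the phenotype.
custom = prune(entries, keep_tuis={"T047", "T184"}, drop_cuis={"C_DIABETES"})
print([e.term for e in custom])
```

In practice drop_cuis would hold every concept in the unwanted branch, not just a single entry.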

You can also start with a few terms that you know are relevant to your phenotype and then
find their synonyms in the UMLS. You can then walk a specific ontology and take siblings
and parents if you think they are relevant.
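
The ontology walk described above can be sketched as follows, assuming the is-a relations have already been loaded into a child-to-parents mapping. The toy hierarchy and the function names are hypothetical and for illustration only; a real run would read the relations from the chosen ontology's files.

```python
from collections import defaultdict
from typing import Dict, Set, Tuple


def build_children(parents: Dict[str, Set[str]]) -> Dict[str, Set[str]]:
    """Invert a child -> parents mapping into parent -> children."""
    children: Dict[str, Set[str]] = defaultdict(set)
    for child, ps in parents.items():
        for p in ps:
            children[p].add(child)
    return children


def parents_and_siblings(concept: str, parents: Dict[str, Set[str]]) -> Tuple[Set[str], Set[str]]:
    """Return the direct parents of a concept and the other children of those parents."""
    children = build_children(parents)
    ps = set(parents.get(concept, ()))
    sibs: Set[str] = set()
    for p in ps:
        sibs |= children[p]
    sibs.discard(concept)
    return ps, sibs


# Toy is-a hierarchy (illustrative, not taken from any real ontology release).
toy_parents = {
    "asthma": {"lung disease"},
    "copd": {"lung disease"},
    "lung disease": {"disease"},
    "diabetes": {"disease"},
}

ps, sibs = parents_and_siblings("asthma", toy_parents)
print(ps, sibs)
```

The candidates returned this way still need manual review, since not every sibling or parent is relevant to the phenotype.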

Then, there is the whole field of using word embeddings to find synonyms/related terms from
unlabeled data if you want to become really fancy :-) At this point, cTAKES does not implement
any deep learning algorithms; in the future, we are planning to release a bridge to Keras.
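
As a rough illustration of the embedding idea, the sketch below ranks candidate terms by cosine similarity to a seed term. The three-dimensional vectors are made up for the example; real vectors would come from a model trained on your own unlabeled notes. The function names are hypothetical and nothing here is part of cTAKES.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors; 0.0 if either has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0


def most_similar(seed: str, vectors: Dict[str, Sequence[float]], top_n: int = 2) -> List[Tuple[str, float]]:
    """Return the top_n terms closest to the seed term, most similar first."""
    scores = [(t, cosine(vectors[seed], v)) for t, v in vectors.items() if t != seed]
    return sorted(scores, key=lambda x: x[1], reverse=True)[:top_n]


# Made-up toy vectors, for illustration only.
toy_vectors = {
    "asthma": [0.9, 0.1, 0.0],
    "wheezing": [0.8, 0.2, 0.1],
    "bronchospasm": [0.85, 0.15, 0.05],
    "fracture": [0.0, 0.1, 0.9],
}

neighbours = most_similar("asthma", toy_vectors)
print([term for term, _ in neighbours])
```

Nearest neighbours in embedding space are related terms rather than guaranteed synonyms, so they are best treated as candidates for the custom dictionary.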


I hope this makes sense.

--
Guergana Savova, PhD, FACMI
Associate Professor
PI Natural Language Processing Lab
Boston Children's Hospital and Harvard Medical School
300 Longwood Avenue
Mailstop: BCH3092
Enders 144.1
Boston, MA 02115
Tel: (617) 919-2972
Fax: (617) 730-0817
Guergana.Savova@childrens.harvard.edu
Harvard Scholar: http://scholar.harvard.edu/guergana_k_savova/biocv
ctakes.apache.org
thyme.healthnlp.org
cancer.healthnlp.org
share.healthnlp.org


-----Original Message-----
From: Erin Nicole Gustafson [mailto:erin.gustafson@northwestern.edu] 
Sent: Wednesday, February 15, 2017 1:38 PM
To: dev@ctakes.apache.org
Subject: Phenotype-specific entities

Hi all,

I would like to be able to only identify entities that are relevant for some specific phenotype.
One step towards achieving this would be to build a custom dictionary with a limited set of
semantic types. However, this is not quite specific enough to only identify mentions related
to one disease while ignoring those related to some other disease, for example.

Does cTAKES currently have a way to do this sort of filtering? Or, has anyone developed their
own tools that they'd be willing to share?

Thanks,
Erin
