ctakes-dev mailing list archives

From "Dligach, Dmitriy" <Dmitriy.Dlig...@childrens.harvard.edu>
Subject head word identification
Date Mon, 02 Mar 2015 16:29:17 GMT

Is anybody aware of a reliable way of identifying the head word of a UMLS entity? In the general
domain, people often use the Collins head rules, but I’m not sure whether they would be applicable
to clinical entities.

Until recently I was under the impression that taking the last word of an entity would work
pretty well, but now that I have looked at the data more closely, I am not so sure. For example,
it fails in these cases: “breast, left”, “ductal carcinoma in situ”, “carcinoma, consistent
with breast primary”.
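
To make the failure concrete, here is a minimal Python sketch of the last-word baseline next to one possible surface-level alternative. The alternative (keep the text before the first comma, cut at the first preposition, then take the last remaining word) is only an illustrative assumption, not a validated method and not something cTAKES provides; the names `PREPOSITIONS`, `tokenize`, `last_word_head`, and `heuristic_head` are hypothetical.

```python
import re

# Hypothetical stop list: prepositions that often introduce a post-modifier.
PREPOSITIONS = {"in", "of", "with", "on", "at", "to", "for", "from", "by"}


def tokenize(text):
    """Split text into alphanumeric tokens, keeping internal hyphens."""
    return re.findall(r"[A-Za-z0-9]+(?:-[A-Za-z0-9]+)*", text)


def last_word_head(entity):
    """Baseline from the message: the last word of the entity is the head."""
    tokens = tokenize(entity)
    return tokens[-1] if tokens else None


def heuristic_head(entity):
    """Assumed alternative: drop everything after the first comma, cut at the
    first non-initial preposition, then take the last remaining word."""
    first_segment = entity.split(",", 1)[0]
    tokens = tokenize(first_segment)
    for i, tok in enumerate(tokens):
        if i > 0 and tok.lower() in PREPOSITIONS:
            tokens = tokens[:i]
            break
    return tokens[-1] if tokens else None


examples = [
    "breast, left",
    "ductal carcinoma in situ",
    "carcinoma, consistent with breast primary",
]

for entity in examples:
    print(f"{entity!r}: last word = {last_word_head(entity)}, "
          f"heuristic = {heuristic_head(entity)}")
```

On the three examples above, the baseline picks "left", "situ", and "primary", while the sketch picks "breast", "carcinoma", and "carcinoma". It handles only these surface patterns, so it would need to be checked against annotated data before being relied on.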


Dmitriy (Dima) Dligach, Ph.D.
Boston Children's Hospital and Harvard Medical School
(617) 651-0397
