ctakes-dev mailing list archives

From "Finan, Sean" <Sean.Fi...@childrens.harvard.edu>
Subject RE: specificity in selecting EntityMentions when using AggregatePlaintextUMLSProcessor
Date Wed, 04 Sep 2013 13:49:42 GMT
This may sound strange, but SNOMED does not contain the term "CIN I". It contains the terms
"CIN I - Cervical intraepithelial neoplasia 1" and "CIN I - mild dyskaryosis".

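To illustrate the point, here is a minimal sketch of an exact-match term lookup. It assumes a toy dictionary built only from the SNOMED CT strings and codes quoted in this thread; the names SNOMED_TERMS and exact_lookup are hypothetical, and this is not the actual cTAKES dictionary lookup code. It only shows why a span that is a prefix of a dictionary term would not match on its own.

```python
# Toy stand-in for a term dictionary. The entries mirror SNOMED CT strings
# and codes quoted in this thread; this is NOT how cTAKES stores its terms.
SNOMED_TERMS = {
    "cin i - cervical intraepithelial neoplasia 1": "285836003",
    "cin 2 - cervical intraepithelial neoplasia 2": "285838002",
    "cin iii": "20365006",
}


def exact_lookup(span, terms=SNOMED_TERMS):
    """Return a code only when the whole span equals a dictionary term."""
    return terms.get(span.strip().lower())


if __name__ == "__main__":
    for span in ("CIN III", "CIN I", "CIN I - Cervical intraepithelial neoplasia 1"):
        print(span, "->", exact_lookup(span))
```

Under these assumptions, "CIN III" resolves because it is a complete entry, while "CIN I" returns None because it appears only as part of longer entries.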
-----Original Message-----
From: Pei Chen [mailto:chenpei@apache.org] 
Sent: Tuesday, September 03, 2013 10:13 PM
To: dev@ctakes.apache.org
Subject: Re: specificity in selecting EntityMentions when using AggregatePlaintextUMLSProcessor

You're right, it should have gotten "CIN I". That's a strange one; it probably needs to be
debugged and looked into further...

On Tue, Sep 3, 2013 at 10:05 PM, Miller, Timothy <Timothy.Miller@childrens.harvard.edu> wrote:
> Ah. So it will get
> CIN 2 (in SNOMED)
> CIN 3 (in SNOMED)
> but the rest are not in SNOMED?
> I wonder why it doesn't get CIN I? It looks like that exists in SNOMED 
> (though I don't fully understand what all the symbols mean in the umls 
> browser).
>> CIN I - Cervical intraepithelial neoplasia 1 
>> [A3002690/SNOMEDCT/SY/285836003]
> On 09/03/2013 09:55 PM, Pei Chen wrote:
>> It has the correct parse (POS, chunks, and lookup window), but some of
>> the terms do not exist in SNOMED. CIN 2 - Cervical intraepithelial
>> neoplasia 2 [A3002688/SNOMEDCT/SY/285838002] exists, but not CIN II.
>> CIN III [A3333965/SNOMEDCT/SY/20365006] also exists, which is why it was
>> able to perform the lookup successfully.
>> Note that CIN II synonyms do exist in other UMLS thesauri such as
>> MEDCIN and CCPSS, though. However, the bundled cTAKES dictionaries only
>> contain MeSH, SNOMEDCT, RxNORM, NCI, and ICD9, IIRC.
>> --Pei
>> On Tue, Sep 3, 2013 at 9:44 PM, Miller, Timothy 
>> <Timothy.Miller@childrens.harvard.edu> wrote:
>>> That is a good question, Ted!
>>> I tried it with a simple context: "The patient has a CIN III." I'm 
>>> not sure if that is a correct context but I was able to duplicate 
>>> your findings. (Finds a CUI for CIN III but not if you change it to 
>>> CIN II)
>>> My first thought was that it is the chunker. But the chunker seems 
>>> to get it right, as CIN II and CIN III are both called NPs, and 
>>> similarly the LookupWindowAnnotator handles them both identically. 
>>> So that suggests it is a problem with the actual lookup of the 
>>> tokens in the LookupWindow.
>>> That's all I can do for now but maybe someone else who knows more 
>>> about its behavior offhand will have an idea.
>>> Tim
>>> On 09/03/2013 08:24 PM, Assur, Ted wrote:
>>>> I'm trying to understand what would prevent the AggregatePlaintextUMLSProcessor
>>>> AE from correctly parsing specific problems that are defined in the UMLS version
>>>> used by cTAKES.
>>>> For example,
>>>> CIN (Cervical Intraepithelial Neoplasia) in its general usage is parsed out
>>>> as UMLS CUI C0206708.
>>>> CIN comes in 3 grades, 1, 2 and 3. Sometimes this is reported with Roman
>>>> Numerals, I, II, and III.
>>>> cTAKES correctly identifies "CIN 3" and "CIN III" with UMLS CUI C0851140:
>>>> "Carcinoma in situ of uterine cervix."
>>>> However, I cannot get it to recognize CIN 1, CIN I, CIN 2, or CIN II as their
>>>> correct concepts, "Cervical intraepithelial neoplasia grade 1" and "Cervical
>>>> intraepithelial neoplasia grade 2" respectively.
>>>> Is there a way to tune the detection of UMLS concepts?
>>>> --------------------------------------------
>>>> Ted Assur
>>>> IT Solutions Architect for Cancer Research
>>>> Providence Health & Services
>>>> ted.assur@providence.org
>>>> 503-215-6476
>>>> Crede, ut intelligas.
>>>> Intellego, ut credam.
