ctakes-dev mailing list archives

From Richard Eckart de Castilho <...@apache.org>
Subject Re: Common Type System across systems?
Date Wed, 02 Oct 2013 16:07:36 GMT
Thanks for the reference, I'll have a look at it.

I don't plan to invent the ultimate type system :P Of course that would be
doomed to fail. I also don't plan to venture into the design of the special
medical types that cTAKES needs in addition.

I plan to make suggestions for the basic analysis levels (e.g. sentence,
token) and possibly work up from there into some of the lower linguistic 
analysis levels, as well as to suggest general design patterns. There are
also some ideas on how to handle adoption so that code changes stay minimal.

I think there is some realistic potential. But let's see how far this can be
pushed… if anywhere at all :) Maybe I'm too optimistic :P

-- Richard

On 02.10.2013, at 17:53, "Wu, Stephen T., Ph.D." <Wu.Stephen@mayo.edu> wrote:

> Richard, it'd be great if you are able to put dedicated effort into it,
> i.e., take the lead for (1) below!
> Unfortunately, in our experience, you still need a lot of people and their
> time to be involved in (2), which often requires funding, and as mentioned
> in (2a) if it is not binding then people will be unlikely to adopt.  Maybe
> I'm overly pessimistic?
> One specific portion of the cTAKES type system is that we make separate
> types for the clinical semantic groups.  The referential semantics portion
> of the type system was the main focus of our efforts (see reference below)
> due to its importance in the medical domain.  This is quite different than
> semantic structures, e.g., Discourse Representation Theory.  Richard, I'm
> interested in how you'd view the differences as someone who wasn't
> involved in their creation.
> I think we made plenty of mistakes that make life difficult for people at
> a practical level, since we were designing it not necessarily even tied to
> UIMA.  But hopefully with your additional work it will be really good!
> Anyways good luck! =P
> stephen
> * Wu, Stephen T, Vinod C Kaggal, Dmitriy Dligach, James J Masanz, Pei
> Chen, Lee Becker, Wendy W Chapman, Guergana K Savova, Hongfang Liu,
> Christopher G Chute. A common type system for clinical natural language
> processing <http://www.jbiomedsem.com/content/4/1/1>. J Biomed Sem. 4:1.
> 2013.
> On 10/1/13 2:53 PM, "Karthik Sarma" <ksarma@ksarma.com> wrote:
>> This seems like a *very* challenging and involved problem to me...
>> On Tuesday, October 1, 2013, Pei Chen wrote:
>>> Agreed.
>>> Yes, I think this is a slight augmentation and extension of the original
>>> vision of the clinical common type system, by having it work with other
>>> UIMA-based NLP systems.  Having worked on item (3) for cTAKES, I actually
>>> think the tough part will be getting consensus and agreement on a system
>>> between all parties, and less the required code changes.  Hence, I just
>>> wanted to ping the community to gauge interest and see if this actually
>>> makes sense [It would be nice to plug in different POS taggers, for
>>> example, without having to remap types].
>>> If we have a willing volunteer (Richard :)?) to perform some of the
>>> prelim analysis Q1 2014 with our existing type system, perhaps we can
>>> actually make this happen.
>>> 4a) I think the SHARP4 development group has essentially moved to the
>>> cTAKES ASF community which is probably even better since it already has a
>>> meritocratic/governance mechanism to handle changes.
>>> On Tue, Oct 1, 2013 at 10:39 AM, Wu, Stephen T., Ph.D.
>>> <Wu.Stephen@mayo.edu> wrote:
>>>> Pei et al,
>>>> That was the vision for the SHARP "common type system", except it was
>>>> meant to include medical-related projects rather than general projects.
>>>> Steve's process below is probably the most realistic way to do things, and
>>>> it's basically how we did the current cTAKES type system.  Unfortunately,
>>>> the "someone" doing #1 was me, and I didn't realize that it would be quite
>>>> difficult.  I guess I know more about how to do it now but #1 and #2 were
>>>> surprisingly harder than I expected.  I'm adding a #4:
>>>> (1) Have someone inspect the various type systems closely and make a
>>>> proposal
>>>>  A. Know each of the type systems on their own.  Essential to visualize
>>>> them appropriately, but it is still difficult to understand the
>>>> implications of type changes just by looking. (By the way, we never came
>>>> up with a really great automatic visualization tool, closest was a Protégé
>>>> plugin. Excellent visualization would go a long way, especially if edits
>>>> were possible.)
>>>>  B. Categorize portions of type systems to compare and take them a step
>>>> at a time.
>>>>  C. Clearly limit which type systems you are going to consider for your
>>>> comparison and reconciliation.
>>>>  D. Pick a starting point.  I found it nearly impossible to create from
>>>> scratch when you're staring at 4-5 other type systems.  We started from
>>>> the old cTAKES type system but that did cause some bias!
>>>>  E. Develop real criteria (or at least opinions) for choosing between the
>>>> many options.
>>>> (2) Agree on the proposal.
>>>>  A. Multiple projects should make a binding agreement to implement.  This
>>>> means, most likely, that somebody needs to have assurance of funding.
>>>> In our case, we only made it binding for cTAKES, so it is only used by
>>>> cTAKES (as far as I know).
>>>>  B. With different projects' vested interests on the line, have some real
>>>> discussions of what your project is going to give up with the proposed
>>>> stuff.
>>>> (3) Spend the time to re-write all the code to use the new type system.
>>>>  * As Steve said, this is time-consuming, especially if things get broken
>>>> and models need to be retrained, etc.
>>>> (4) Ensure maintenance and modifiability across projects.
>>>>  A. The original SHARP common type system vision handed off the
>>>> maintenance to the Software Development Group, but that never really
>>>> happened. I hope the Apache community can serve as this to some degree,
>>>> but so far it has still depended on unreliable people like myself.
>>>>  B. A means of having everyone automatically draw from the same source
>>>> code would be preferable.
>>>>  C. If, in the future, you need to consider another UIMA project whose
>>>> type system should be reconciled... Well, that's happening right now.  I
>>>> guess you can worry about it when you get there if you have a community
>>>> that's willing to deal with it.
>>>> Those are just some thoughts.  It's not impossible, but neither is it
>>>> simple.
>>>> stephen
>>>> On 9/30/13 8:17 PM, "Steven Bethard" <steven.bethard@gmail.com> wrote:
>>>>> We (ClearTK) talked with Richard (DKPro) about doing this for ClearTK
>>>>> and DKPro. Basically, both groups were all for it, but the main issue
>>>>> was time. Basically you need to:
>>>>> (1) Have someone inspect the various type systems closely and make a
>>>>> proposal
>>>>> (2) Agree on the proposal.
>>>>> (3) Spend the time to re-write all the code to use the new type system.
>>>>> Step (3) is especially time consuming, but in fact, we never managed
>>>>> to get the free time for step (1).
>>>>> That all said, ClearTK would love to share a common type system with
>>>>> other projects.
>>>>> Steve
>>>>> On Mon, Sep 30, 2013 at 7:38 PM, Pei Chen <chenpei@apache.org>
>>>>>> Richard, I, and few others had an interesting bar conversation...
>>>>>> In the spirit of interoperability, What if we had a baseline common type
>>>>>> system that could be reused across UIMA compatible NLP systems?
>>>>>> Imagine for a moment that OpenNLP, Clea
>> --
>> Karthik Sarma
>> UCLA Medical Scientist Training Program Class of 20??
>> Member, UCLA Medical Imaging & Informatics Lab
>> Member, CA Delegation to the House of Delegates of the American Medical
>> Association
>> ksarma@ksarma.com
>> gchat: ksarma@gmail.com
>> linkedin: www.linkedin.com/in/ksarma
