ctakes-dev mailing list archives

From Manuel Lamy <mmvp...@gmail.com>
Subject Re: Problem using CPE and XMI Writer CAS Consumer [EXTERNAL]
Date Thu, 25 Jan 2018 19:28:00 GMT
Hello Sean,

First of all, thanks a lot for the quick and detailed answer. Awesome support
from you.

I'll give you a structured answer, to be as objective and concise as
possible. I think it's important to tell you what I'm trying to achieve so
that you can help me.

*My Project*

I'm currently working on a project with cTAKES in partnership with a
Portuguese hospital.

My goal is to create reports from the narrative parts of this hospital's
EMRs, listing the symptoms, diseases and clinical procedures found in each
EMR.

What I have in mind is to create a pipeline system that first translates
the texts from Portuguese to English, and then creates these reports based
on the translated texts.

I'm not even sure yet that I can build this kind of pipeline system with
cTAKES. I need the translation step because cTAKES does not seem to work
with Portuguese at all (even though that option is shown in the language
list in CVD, which is confusing). So I will translate the texts first; I
guess that's my best bet.

Just a note: I think there should be more support and documentation on how
to use cTAKES with languages other than English. In my research I couldn't
find anything relevant on this topic, not even a reference stating clearly
that cTAKES only works with English and not with other languages.

*Version of cTAKES*

Naturally, I'm running the development version of cTAKES, in IntelliJ. I'm
using the latest trunk, which corresponds to version 4.0.1-SNAPSHOT.

So, I guess so far so good, just as you said, I'm using trunk.

I did everything as per the guide "Developer Install Guide", concerning the
Intellij instructions. The guide I used can be found here:

*Behavior of cTAKES when running pipelines*

Well, I did what you told me. I ran the Default Clinical Pipeline and the
Piper File Submitter as per the wiki pages. I have both the User and
Development versions on my machine.

Now, I tried to run those pipelines in the User and Development versions. I
ran the respective bat files:

   - For the Default Clinical Pipeline I ran 'bin/runClinicalPipeline -i
   inputDirectory --xmiOut outputDirectory --user umlsUsername
   --pass umlsPassword'
   - For the Piper File Submitter, I ran 'bin/runPiperSubmitter'

Well, the results of running these two bat files were quite different for
the User and Development versions.

*User Version*

*Default Clinical Pipeline*

In this version, I went to the bin directory and just ran the line
'bin/runClinicalPipeline -i inputDirectory --xmiOut outputDirectory
--user umlsUsername --pass umlsPassword' with my parameters.

It worked well and created the XMI output files where they were supposed to
be. I could open them in CVD by first opening a TypeSystem.xml file and then
the generated XMI files I wanted.

*Piper File Submitter*

Well, since this is the User version, I don't have runPiperSubmitter.bat
available. Is this normal? It seems understandable, and I guess normal, from
what I understand of this quote: "If you are running from a development
environment (checked out trunk from SVN) they can also be run using the
Piper File Submitter GUI." But you tell me.

Well, I can say the User Version did what I wanted in this step, but I
thought it would be nice to replicate it in the Development version, since I
guess I'll have to use it in the future to implement everything I want for
the project described at the beginning of this e-mail. And the problems
arose in the Development version...

*Development Version*

Well, in this version, I tried to replicate what I did in the User version,
thinking to myself it would output the same result. I was wrong.

*Default Clinical Pipeline and Piper File Submitter*

When I try to run the bat files inside the bin directory of the Dev Version,
I get the results shown in the image attached to this e-mail.

Yes, it could not find or load PiperFileRunner and PiperRunnerGui. Is this
supposed to happen in the Development Version? Am I doing something wrong
here? I just followed the guides you have available. My whole Development
Version installation was done per the guide.

*My objective with this e-mail*

Well, first of all, my objective is to share my experiences with cTAKES, in
order to share with the community what I'm going through. This way I can
contribute to the community and probably help others who are going through
the same as me.

Secondly, I would like to know your opinion about the feasibility of what
I'm trying to do here. My goal is to build a pipeline system like:

   - EMRs in Portuguese already in txt files in a directory -> Translation
   to English -> Process all of the texts with Clinical Pipeline -> Output XMI
   in order to open them in CVD
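
The translation step in front of cTAKES could be sketched roughly like this.
It is only a minimal Java sketch under assumptions: the class name
TranslateThenAnnotate, the method translateDirectory and the translator
function are hypothetical placeholders, no real translation service is
called, and nothing here uses the cTAKES API. It only prepares a directory of
translated .txt files that the Clinical Pipeline could then read.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.function.UnaryOperator;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper, not part of cTAKES: prepares translated input files.
class TranslateThenAnnotate {

    /**
     * Reads every regular .txt file directly inside inputDir, applies the
     * translator to its text, and writes the result under the same file name
     * in translatedDir. Returns the paths that were written.
     */
    static List<Path> translateDirectory(Path inputDir, Path translatedDir,
                                         UnaryOperator<String> translator) {
        List<Path> written = new ArrayList<>();
        try (Stream<Path> files = Files.list(inputDir)) {
            Files.createDirectories(translatedDir);
            List<Path> notes = files
                    .filter(Files::isRegularFile)
                    .filter(p -> p.getFileName().toString().endsWith(".txt"))
                    .sorted()
                    .collect(Collectors.toList());
            for (Path note : notes) {
                String portuguese = new String(Files.readAllBytes(note), StandardCharsets.UTF_8);
                // Assumption: the translator is supplied by the caller,
                // e.g. a wrapper around whatever translation tool is chosen.
                String english = translator.apply(portuguese);
                Path target = translatedDir.resolve(note.getFileName());
                Files.write(target, english.getBytes(StandardCharsets.UTF_8));
                written.add(target);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return written;
    }

    public static void main(String[] args) {
        if (args.length != 2) {
            System.out.println("usage: TranslateThenAnnotate <inputDir> <translatedDir>");
            return;
        }
        // Placeholder only: copies the text unchanged instead of translating it.
        UnaryOperator<String> placeholder = UnaryOperator.identity();
        List<Path> written = translateDirectory(Paths.get(args[0]), Paths.get(args[1]), placeholder);
        System.out.println("Wrote " + written.size() + " file(s) to " + args[1]);
    }
}
```

After this step, cTAKES would still be run separately, for example with the
runClinicalPipeline command quoted earlier, passing the translated directory
as -i and the desired XMI directory as --xmiOut.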

This is what I'm aiming for with cTAKES. So I have the following questions:

   1. Is this feasible? Or am I aiming for something that I can't rely on
   cTAKES alone to do, because I have to translate the texts first?
   2. Why don't I have a TypeSystem.xml file to feed CVD first, in the
   Development Version? I can only find it in the User Version, under
   3. Why do we have options in CVD for other languages, but it clearly
   only works for the English language?
   4. Any other hint you can give me, concerning the big picture of what
   I'm trying to build here?

Any additional information you need from my side, just tell me.

Thanks once more for the quick answers and support, Sean.

Best regards,


2018-01-25 15:35 GMT+00:00 Finan, Sean <Sean.Finan@childrens.harvard.edu>:

> Hi Manuel,
> My first comment is that you are running ctakes in a somewhat “ancient”
> manner, or better put, the xml descriptor workflow has been pretty much
> deprecated.
> You should try to run ctakes 4.0.  If you are software savvy then I advise
> that you try the development version that is in trunk.  You’ve probably
> been on the ctakes download page, but just a reminder :
> http://ctakes.apache.org/
> The ctakes wiki has some useful information, and the 4.0 entry is here:
> https://cwiki.apache.org/confluence/display/CTAKES/cTAKES+4.0
> To start playing with ctakes I suggest that you try to run the default
> clinical pipeline, following the instructions here:
> https://cwiki.apache.org/confluence/display/CTAKES/Default+Clinical+Pipeline
> Those instructions will start the default clinical pipeline from a command
> line.  If you have the development version from trunk then there is a gui
> available to run pipelines:
> https://cwiki.apache.org/confluence/display/CTAKES/Piper+File+Submitter+GUI
> There are also many other pipeline configurations available in trunk to
> run more advanced / involved pipelines.  They are not in the 4.0 release.
> The pipelines (including 4.0 default) are all defined using the replacement
> for those xml descriptor files.  The replacements are called “piper files”.
> https://cwiki.apache.org/confluence/display/CTAKES/Piper+Files
> I hope that you find the pipers easier to understand and use than the old
> xml descriptors.
> Anyway, if you run the ctakes 4.0 default clinical pipeline as outlined in
> the wiki page it will use the new FileTreeReader and FileTreeXmiWriter
> combination.
> Give it a whirl and let me know how things go.
> Sean
> From: Manuel Lamy [mailto:mmvpdml@gmail.com]
> Sent: Thursday, January 25, 2018 9:09 AM
> To: dev@ctakes.apache.org
> Subject: Re: Problem using CPE and XMI Writer CAS Consumer [EXTERNAL]
> Hello Sean,
> First of all, thanks for your quick answer.
> I'm probably making some confusion over here, so I have the following
> questions.
>   1.  A CAS Consumer is defined by an XML file. What you are implying is
> that I should go to my consumer XML (__XmiWriterCasConsumer.xml) and change
> its <implementationName> tag to 'org.apache.ctakes.core.cc.FileTreeXmiWriter'
> instead of 'org.apache.ctakes.core.cc.XmiWriterCasConsumer'? Funny
> enough, it gives me a ClassNotFoundException if I do this. I would like
> your confirmation that I'm doing the right thing, please. The class is
> well defined in that path, though.
>   2.  Concerning the reader, I follow the same reasoning. Should I go to my
> descriptor and change its <implementationName> tag from
> 'org.apache.ctakes.core.cr.FilesInDirectoryCollectionReader' to
> 'org.apache.ctakes.core.cr.FileTreeReader'?
> I did these two things and the error is the same concerning the new
> consumer 'FileTreeXmiWriter', as you can see in the first image attached to
> this e-mail.
> I would also like to ask you another question:
>        3. Why does my class 'FileTreeXmiWriter' have a lot of unresolved
> classes? You can see it in the second image attached to this e-mail. I
> can't seem to import them correctly. I tried to import the extension of
> this class just to check the result, and look how it resolved the import
> for me: 'apache' is not recognized. I'm just kind of baffled by the
> hierarchy defined for this project. If you could give me a little
> clarification on this topic and how to solve it, I would appreciate it.
> Thanks for your attention! I'm really looking forward to getting this to
> work. cTAKES seems awesome. It just needs these little tweaks.
> Best regards,
> Manuel
> 2018-01-24 22:26 GMT+00:00 Finan, Sean <Sean.Finan@childrens.harvard.edu>:
> Hi Manuel,
> Your image got scrubbed by a server, but the problem may have been fixed
> in a recent xmi writer.  The latest xmi writer is in ctakes core and is
> named FileTreeXmiWriter.  One possible cause for a problem in the writer is
> if the document has some unexpected character or character combination.  A
> document reader should be massaging documents before they are processed and
> sent to the writer.  The most recent file reader is named FileTreeReader
> and is also in ctakes core.
> Sean
> From: Manuel Lamy [mailto:mmvpdml@gmail.com]
> Sent: Wednesday, January 24, 2018 5:10 PM
> To: dev@ctakes.apache.org
> Subject: Problem using CPE and XMI Writer CAS Consumer [EXTERNAL]
> Hello guys,
> I'm having problems running the CPE using an XMI Writer CAS Consumer.
> However, it works with other consumers.
> Problem
> In the figure below, you can see my setup and the error I'm obtaining:
> [Inline image 2]
> Logs
> Concerning logs, I'm obtaining this from Intellij:
> org.apache.uima.resource.ResourceInitializationException
>             at org.apache.uima.collection.impl.CollectionProcessingEngine_impl.initialize(CollectionProcessingEngine_impl.java:81)
>             at org.apache.uima.impl.UIMAFramework_impl._produceCollectionProcessingEngine(UIMAFramework_impl.java:438)
>             at org.apache.uima.UIMAFramework.produceCollectionProcessingEngine(UIMAFramework.java:918)
>             at org.apache.uima.tools.cpm.CpmPanel.startProcessing(CpmPanel.java:573)
>             at org.apache.uima.tools.cpm.CpmPanel.access$000(CpmPanel.java:105)
>             at org.apache.uima.tools.cpm.CpmPanel$1.run(CpmPanel.java:713)
> Caused by: org.apache.uima.resource.ResourceConfigurationException
>             at org.apache.uima.collection.impl.cpm.container.CPEFactory.produceIntegratedCasProcessor(CPEFactory.java:1093)
>             at org.apache.uima.collection.impl.cpm.container.CPEFactory.getCasProcessors(CPEFactory.java:547)
>             at org.apache.uima.collection.impl.cpm.BaseCPMImpl.init(BaseCPMImpl.java:253)
>             at org.apache.uima.collection.impl.cpm.BaseCPMImpl.<init>(BaseCPMImpl.java:127)
>             at org.apache.uima.collection.impl.CollectionProcessingEngine_impl.initialize(CollectionProcessingEngine_impl.java:73)
>             ... 5 more
> Caused by: java.lang.Exception: The component XMI Writer CAS Consumer cannot be created. (Thread Name: Thread-5)
>             ... 10 more
> Attempted Solutions
> I only found one guy with the same problem as me. The solution proposed in
> the thread, by Sean Finan, was to change the xml of my consumer
> (__XmiWriterCasConsumer.xml), particularly the content of the tag
> <implementationName>, from
>  <implementationName>org.apache.ctakes.core.cc.XmiWriterCasConsumerCtakes</implementationName>
> to
>  <implementationName>org.apache.uima.tools.components.XmiWriterCasConsumer</implementationName>
> However, this didn't work. The error is exactly the same. I'm out of ideas
> about what to do. I would like to have the report of CPE in XMI, in order
> to read it with CVD. You can see the thread here:
> http://mail-archives.apache.org/mod_mbox/ctakes-dev/201701.mbox/%3C29cefd1fa1b44ce4a8dc92ec8b1cd882@CHEXMAIL1A.CHBOSTON.ORG%3E
> Result Expected
> Running the CPE process and have outputs as XMI files.
> Result Obtained
> Running the CPE results in an error, specifically for the consumer
> __XMIWriterCasConsumer.
> Conclusion
> Have any of you guys had this problem before? Do you have a suggestion
> about how it can be solved? Thanks a lot
> Best regards,
> Manuel
