ctakes-dev mailing list archives

From Jeffrey Miller <jeff...@gmail.com>
Subject Re: ctake web service [EXTERNAL]
Date Sat, 09 Mar 2019 17:20:04 GMT
Thanks for your response, Sean. We are still working on this (and have some
things to look into given your last response), but I will share details
when we have it working. We are still deciding whether to use Spark or
Apache Beam.

Just to clarify my previous confusion: I assumed the TS (thread-safe)
wrappers existed so that you could avoid creating multiple pipelines and
just run one instance of the pipeline with a separate JCas per thread. I
thought the main motivation behind that would be to avoid loading more than
one dictionary into memory, for example. But it sounds like I was mistaken.
With respect to sharing resources, are static variables the main concern?
Do you know if this is a problem for any of the annotators in the default
clinical pipeline (the regular components, not the thread-safe ones)? From
Peter's response (I am not sure whether that split off into another thread
because the subject changed), it sounds like it may not be a problem. I'd
really like to understand thread safety with respect to core cTAKES
components (with the caveat that community-created annotators could be
implemented in any number of ways, making it hard to declare that cTAKES is
"thread-safe"). I'd be happy to contribute documentation back to the wiki
once I feel I have a solid grasp of it.
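
To make the static-variable question concrete, here is a minimal sketch in
plain Java. The class names are hypothetical stand-ins, not real cTAKES
classes, and I am only assuming (not asserting) that something like this
explains why my second pipeline used the first dictionary that was loaded.

```java
// Hypothetical stand-ins for annotators; these are NOT real cTAKES classes.
class StaticStateAnnotator {
    // One slot per class (per classloader), shared by every instance and thread.
    private static String dictionaryName;

    StaticStateAnnotator(String requestedDictionary) {
        synchronized (StaticStateAnnotator.class) {
            // First caller wins; later requests are silently ignored.
            if (dictionaryName == null) {
                dictionaryName = requestedDictionary;
            }
        }
    }

    String dictionary() {
        synchronized (StaticStateAnnotator.class) {
            return dictionaryName;
        }
    }
}

class InstanceStateAnnotator {
    // One slot per instance, so each pipeline keeps its own dictionary.
    private final String dictionaryName;

    InstanceStateAnnotator(String requestedDictionary) {
        this.dictionaryName = requestedDictionary;
    }

    String dictionary() {
        return dictionaryName;
    }
}

class StaticStateDemo {
    public static void main(String[] args) {
        StaticStateAnnotator first = new StaticStateAnnotator("dictA");
        StaticStateAnnotator second = new StaticStateAnnotator("dictB");
        // Both instances report the dictionary requested by the first one.
        System.out.println(first.dictionary() + " / " + second.dictionary());

        InstanceStateAnnotator third = new InstanceStateAnnotator("dictB");
        // Instance state is unaffected by what other instances requested.
        System.out.println(third.dictionary());
    }
}
```

If the core annotators only hold instance state, one pipeline per thread
should be enough; any static state would still be shared across pipelines
in the same process.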

Peter, have you made your pipeline pool code available anywhere?
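
For reference, this is roughly what I mean by a pipeline pool: a fixed set
of pipelines in one process, each handed to only one thread at a time. It is
a minimal sketch, not Peter's code; PipelinePool and its methods are names I
made up, and the type parameter P stands in for whatever object wraps a
loaded pipeline.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.function.Function;
import java.util.function.Supplier;

// Hypothetical pool: P is whatever object represents one loaded pipeline.
class PipelinePool<P> {
    private final BlockingQueue<P> idle;

    PipelinePool(int size, Supplier<P> factory) {
        idle = new ArrayBlockingQueue<>(size);
        for (int i = 0; i < size; i++) {
            idle.add(factory.get()); // build every pipeline up front
        }
    }

    // Borrow a pipeline, run the work on it, and always return it to the pool.
    <R> R withPipeline(Function<P, R> work) {
        final P pipeline;
        try {
            pipeline = idle.take(); // blocks until some pipeline is free
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while waiting for a pipeline", e);
        }
        try {
            return work.apply(pipeline);
        } finally {
            idle.add(pipeline); // cannot overflow: we just removed one element
        }
    }

    int idleCount() {
        return idle.size();
    }
}

class PipelinePoolDemo {
    public static void main(String[] args) {
        // StringBuilder is not thread-safe, so it is a fair stand-in here.
        PipelinePool<StringBuilder> pool = new PipelinePool<>(2, StringBuilder::new);
        String result = pool.withPipeline(p -> p.append("note text").toString());
        System.out.println(result);
    }
}
```

This only guarantees exclusive access to each pipeline object. It does not
protect anything the pipelines share behind the scenes, such as static
variables.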

On Fri, Mar 8, 2019 at 12:49 PM Finan, Sean <
Sean.Finan@childrens.harvard.edu> wrote:

> Hi all,
>
> >Is there any known reason that you can't create a pipeline pool, but keep
> everything in the same process?
> -- No, but ...
> > Is it safe to load multiple pipelines in
> the same process as long as only one thread can access each one at a time
> (we plan to use this in a Spark pipeline).
> -- If you are talking about oob ctakes being the process, only a single
> pipeline will run on multiple threads.  The threads will share resources,
> static variables, etc. and the  pipeline will give you terrible results and
> very quickly crash.  That is why I wrote the thread-safe wrappers.
> -- That being said, supposedly you can configure spark to handle this by
> keeping everything contained in a unique copy per thread.  Sort of like
> ThreadLocal (I think), but more effective on a full-pipeline level.
>
> > it must have reduced the DefaultJCasTermAnnotator to a singleton object
> in memory.
> -- Yes.  The thread-safe pipeline is not meant to have siblings in the
> same process - the wrappers can only do so much.  That being said, I am
> pretty sure that the Default... is thread-safe so it doesn't actually need
> the wrapper.  Regardless, the rest of the pipeline would crash.
>
> Jeff, can you share information about your efforts on spark?  If we could
> get that working and in standard ctakes it would be fantastic.
>
> I hope that this information is useful.
>
> Sean
>
>
>
> ________________________________________
> From: Jeffrey Miller <jeffmax@gmail.com>
> Sent: Friday, March 8, 2019 11:23 AM
> To: dev@ctakes.apache.org
> Subject: Re: ctake web service [EXTERNAL]
>
> Is there any known reason that you can't create a pipeline pool, but keep
> everything in the same process? Is it safe to load multiple pipelines in
> the same process as long as only one thread can access each one at a time
> (we plan to use this in a Spark pipeline). One caveat I have noticed: it
> seems like if I use the thread-safe components to build a pipeline pool,
> only one dictionary for the DefaultJCasTermAnnotator can be loaded per
> process. For example, I was trying to take advantage of the ability to
> switch pipelines via a query parameter, which is hinted at in the code for
> the REST service. The two pipelines used different ontology dictionaries,
> but it seemed like with the thread-safe components it must have reduced
> the DefaultJCasTermAnnotator to a singleton object in memory, because it
> only used the first dictionary instantiated. Either way, given how Sean
> described the thread-safe components above, you probably wouldn't want to
> use them in a pipeline pool, assuming that the threading problems are
> limited to multiple threads accessing the same pipeline at the same time,
> and not to having multiple pipelines loaded into memory, each accessed by
> only a single thread.
>
> On Fri, Mar 8, 2019 at 11:06 AM Kathy Ferro <healthcare1111@gmail.com>
> wrote:
>
> > I thought about creating a queue that acts as a traffic cop.  Only the
> > traffic cop calls the WS.  I also want to test multiple WS instances
> > running on different ports.  The traffic cop calls whichever WS is
> > available and keeps track of WS statuses.  With all this processing
> > going on, it might kill the power for blocks.
> >
> > On Fri, Mar 8, 2019 at 10:34 AM Finan, Sean <
> > Sean.Finan@childrens.harvard.edu> wrote:
> >
> > > Hi all,
> > >
> > > I guess that a quick test could be run with a multi-threaded pipeline.
> > > Tim, for some reason I recall you checking in one with a dockerfile.
> > > Maybe not, and it might not be the default in the service.  Anyway, you
> > > could set the procs to something like 50 and throw 50 users at it.  It
> > > definitely does not scale anything close to linearly.  ctakes AEs
> > > aren't built for thread-safety, so they are all wrapped with locks and
> > > there is a lot of thread contention.  However, running such a test
> > > might indicate the source of the problem.
> > >
> > > The other option is to create a queue that collects post calls and
> > > doles them out serially to a single pipeline.  User #50 would probably
> > > not appreciate it though ...
> > > ________________________________________
> > > From: gandhi rajan <gandhirajan.n@gmail.com>
> > > Sent: Friday, March 8, 2019 10:02 AM
> > > To: dev@ctakes.apache.org
> > > Subject: Re: ctake web service [EXTERNAL]
> > >
> > > Hi Kathy,
> > >
> > > I guess the initialization happens in the post-construct method. So if
> > > we could synchronize that, I feel we can get away from the problem.
> > > Unfortunately I am not able to test this as my setup is gone with my
> > > old job. Try it out.
> > >
> > > Regards,
> > > Gandhi.
> > >
> > > On Friday, March 8, 2019, Kathy Ferro <healthcare1111@gmail.com>
> wrote:
> > >
> > > > Tim,
> > > >
> > > > Thanks for the reply.  I'm continuing the research.  With all the
> > > > layers that wrap around this, you would think we can handle this
> > > > suggestion.
> > > >
> > > >
> > > >
> > > >
> > > >
> > > > On Thu, Mar 7, 2019 at 8:01 PM Miller, Timothy <
> > > > Timothy.Miller@childrens.harvard.edu> wrote:
> > > >
> > > > > That's a good question that I've also heard from others, and
> > > > > unfortunately I don't know the answer. My use cases are typically
> > > > > a single job at a time making sequential calls, so I wasn't
> > > > > stressing it with multiple asynchronous calls. I would've thought
> > > > > that the Tomcat container would have some ability to manage that
> > > > > though!
> > > > > Tim
> > > > >
> > > > > ________________________________________
> > > > > From: Kathy Ferro <healthcare1111@gmail.com>
> > > > > Sent: Thursday, March 7, 2019 6:10 PM
> > > > > To: dev@ctakes.apache.org
> > > > > Subject: Re: ctake web service [EXTERNAL]
> > > > >
> > > > > Tim,
> > > > >
> > > > > Does the Docker solution handle multiple instances?  I tested the
> > > > > REST web service with 2 requests at the same time, and it errors
> > > > > out.  I removed the part that writes the result XML file to disk;
> > > > > it still errors out.
> > > > >
> > > > > Best,
> > > > > Kathy
> > > > >
> > > > > On Mon, Mar 4, 2019 at 10:52 AM Miller, Timothy <
> > > > > Timothy.Miller@childrens.harvard.edu> wrote:
> > > > >
> > > > > > I don't know what the solution was, but I leave my ctakes REST
> > > > > > server running basically full time and haven't seen timeouts yet.
> > > > > > Tim
> > > > > >
> > > > > > ________________________________________
> > > > > > From: gandhi rajan <gandhirajan.n@gmail.com>
> > > > > > Sent: Monday, March 4, 2019 10:43 AM
> > > > > > To: dev@ctakes.apache.org
> > > > > > Subject: Re: ctake web service [EXTERNAL]
> > > > > >
> > > > > > Hi Kathy, Sean did respond that there is no timeout happening
> > > > > > on the cTAKES end. You will probably have to look at the database
> > > > > > settings for this closed connection issue.
> > > > > >
> > > > > > Does someone have any clue on this?
> > > > > >
> > > > > > On Monday, March 4, 2019, Kathy Ferro <healthcare1111@gmail.com>
> > > > wrote:
> > > > > >
> > > > > > > Gandhi,
> > > > > > >
> > > > > > > Did you get any response to this issue?  Does it try to keep
> > > > > > > the connection open while the WS is up? Or does it open and
> > > > > > > close after it's done?
> > > > > > >
> > > > > > > We are still getting this error.
> > > > > > > "ERROR JdbcRareWordDictionary - No operations allowed after
> > > > > > > statement closed."
> > > > > > >
> > > > > > > Thanks
> > > > > > > Kathy
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > On Fri, Aug 17, 2018 at 9:43 AM Gandhi Rajan Natarajan <
> > > > > > > Gandhi.Natarajan@arisglobal.com> wrote:
> > > > > > >
> > > > > > > > Hi Kathy,
> > > > > > > >
> > > > > > > > Some time back we encountered this issue, and the problem
> > > > > > > > seems to be DB connections getting timed out.
> > > > > > > >
> > > > > > > > Currently we are using the following implementations:
> > > > > > > >
> > > > > > > > "org.apache.ctakes.dictionary.lookup2.dictionary.JdbcRareWordDictionary"
> > > > > > > > and
> > > > > > > > "org.apache.ctakes.dictionary.lookup2.concept.JdbcConceptFactory"
> > > > > > > >
> > > > > > > > Is anybody aware of any timeout settings that need to be set
> > > > > > > > in these implementations to avoid the DB connection timeout
> > > > > > > > issue?
> > > > > > > >
> > > > > > > > -----Original Message-----
> > > > > > > > From: Kathy Ferro <healthcare1111@gmail.com>
> > > > > > > > Sent: Thursday, August 16, 2018 11:07 PM
> > > > > > > > To: dev@ctakes.apache.org
> > > > > > > > Subject: ctake web service
> > > > > > > >
> > > > > > > > Hi,
> > > > > > > >
> > > > > > > > Just want to see if anybody has experienced this issue.
> > > > > > > >
> > > > > > > > If the web service has been up for a day or two, it will drop
> > > > > > > > the dictionary lookup.  The only results it returns are
> > > > > > > > ConllDependencyNode tags in the xmi file; no mentions, no
> > > > > > > > concepts, etc.
> > > > > > > >
> > > > > > > > I haven't had a chance to investigate it yet.
> > > > > > > >
> > > > > > > > Kathy
> > > > > > > >
> > > > > > >
> > > > > >
> > > > > >
> > > > > > --
> > > > > > Regards,
> > > > > > Gandhi
> > > > > >
> > > > > > "The best way to find urself is to lose urself in the service
of
> > > others
> > > > > > !!!"
> > > > > >
> > > > >
> > > >
> > >
> > >
> > > --
> > > Regards,
> > > Gandhi
> > >
> > > "The best way to find urself is to lose urself in the service of others
> > > !!!"
> > >
> >
>
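
For what it's worth, the serial-queue option Sean describes in the quoted
thread (collect the POST calls and dole them out one at a time to a single
pipeline) could be sketched roughly as below. This is a sketch under stated
assumptions, not anything from the cTAKES REST code: SerialPipelineGateway
is a made-up name, and the Function<String, String> is a placeholder for a
call into one loaded pipeline.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Function;

// Hypothetical gateway: every request is funneled through one worker thread,
// so the wrapped pipeline never sees concurrent calls.
class SerialPipelineGateway implements AutoCloseable {
    // A single-thread executor keeps an unbounded FIFO queue of pending tasks.
    private final ExecutorService worker = Executors.newSingleThreadExecutor();
    private final Function<String, String> pipeline;

    SerialPipelineGateway(Function<String, String> pipeline) {
        this.pipeline = pipeline;
    }

    // Queue a document; the caller decides when to wait for the result.
    Future<String> submit(String text) {
        Callable<String> task = () -> pipeline.apply(text);
        return worker.submit(task);
    }

    // Convenience wrapper that blocks until this document has been processed.
    String process(String text) {
        try {
            return submit(text).get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while waiting for the pipeline", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("Pipeline failed", e.getCause());
        }
    }

    @Override
    public void close() {
        worker.shutdown(); // already queued requests still run; new ones are rejected
    }
}

class SerialPipelineDemo {
    public static void main(String[] args) {
        // String::toUpperCase stands in for a real pipeline call.
        try (SerialPipelineGateway gateway = new SerialPipelineGateway(String::toUpperCase)) {
            System.out.println(gateway.process("patient denies chest pain"));
        }
    }
}
```

As Sean points out, the cost is latency: requests wait in line, so this
trades throughput for safety.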
