ctakes-dev mailing list archives

From Robert Spurrier <robert.spurr...@explorys.com>
Subject Creating Runnable .JARs From A Subset of cTAKES Maven Modules
Date Mon, 09 Sep 2013 13:25:35 GMT
Good Morning!

I am trying to use cTAKES tools on a distributed computing platform. I would rather not ship
the entire compiled cTAKES package (~1.5 GB) out to the shared cache when I only need a few
annotators and their resources at a time.

I should first mention that I am not very familiar with Maven. I recently upgraded cTAKES
from v2.5.0, where I was configuring smaller pipelines using Ant build files. That process
was cumbersome, however, and I can appreciate the new modular Maven project layout. I just
do not know how to use it effectively and flexibly.

Does anyone have any advice on how I can package subsets of cTAKES annotator modules and their
dependencies/resources, so I can create 'thinner' custom pipelines that are geared towards
specific tasks?

For example, I might ultimately want a pipeline .JAR that contains only the tools needed to
extract Left Ventricular Ejection Fraction measurements from free text with regular expressions.
Such a .JAR would not need any of the dictionary resources or negation annotators, so they
could be excluded.

It looks like I could create Maven assembly plugin descriptors to generate these custom .JARs,
but I would like to see if anyone here has any advice/caveats before I pursue this route.
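
To make the question concrete, here is a rough, untested sketch of the kind of descriptor I
have in mind. The descriptor id and the two excluded artifact coordinates are assumptions for
illustration only; I have not verified them against the actual cTAKES module names.

```xml
<!-- Sketch of a Maven assembly descriptor for a 'thin' pipeline JAR. -->
<!-- The id and the excluded coordinates below are hypothetical placeholders. -->
<assembly xmlns="http://maven.apache.org/plugins/maven-assembly-plugin/assembly/1.1.2"
          xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
          xsi:schemaLocation="http://maven.apache.org/plugins/maven-assembly-plugin/assembly/1.1.2
                              http://maven.apache.org/xsd/assembly-1.1.2.xsd">
  <id>thin-pipeline</id>
  <formats>
    <format>jar</format>
  </formats>
  <includeBaseDirectory>false</includeBaseDirectory>
  <dependencySets>
    <dependencySet>
      <outputDirectory>/</outputDirectory>
      <!-- Include this project's own classes alongside its dependencies. -->
      <useProjectArtifact>true</useProjectArtifact>
      <!-- Unpack each dependency so everything lands in one flat JAR. -->
      <unpack>true</unpack>
      <scope>runtime</scope>
      <excludes>
        <!-- Assumed groupId:artifactId values for modules this pipeline does not need. -->
        <exclude>org.apache.ctakes:ctakes-dictionary-lookup</exclude>
        <exclude>org.apache.ctakes:ctakes-ne-contexts</exclude>
      </excludes>
    </dependencySet>
  </dependencySets>
</assembly>
```

My understanding is that one such descriptor per pipeline could be referenced from the
maven-assembly-plugin configuration, but I am not sure whether excluding modules this way
also drops their transitive dependencies and resources.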

Robert Spurrier
