uima-dev mailing list archives

From Marshall Schor <...@schor.com>
Subject Re: small memory footprint tradeoff configuration
Date Fri, 27 Mar 2009 21:37:09 GMT
Another way to reduce the footprint of UIMA:

One user reported the basic UIMA framework as taking approx. 5 MB (not
sure exactly what was measured).  I investigated to see if UIMA might be
loading more classes than needed.  I found that at startup time, UIMA
reads a factory configuration file and assigns classes to interfaces,
storing these in a hashmap. 

The factory configuration (located in
has specs for things like the collection processing manager. 

The startup code does a Class.forName on these to load them (and confirm
they are present).   This makes Java "lazy loading" not work so well,
since many of these won't be used.  I did a heapdump of a tiny UIMA
application using the
- reading a simple descriptor and running it, and found many classes
pertaining to the CPE (Collection Processing) which my test application
doesn't use. 
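
For illustration, here is a minimal sketch of the eager behavior described above. It is not the actual UIMA code: the class name EagerFactoryRegistry, its methods, and the use of a plain Map in place of the factory configuration file are all assumptions made for the example. The point is only that every configured class gets loaded in the constructor, whether or not it is used later.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch, not the real UIMA implementation.
class EagerFactoryRegistry {
    private final Map<Class<?>, Class<?>> implByInterface = new HashMap<>();

    // config maps an interface name to an implementation class name,
    // standing in for entries read from a factory configuration file.
    EagerFactoryRegistry(Map<String, String> config) {
        for (Map.Entry<String, String> entry : config.entrySet()) {
            try {
                // Class.forName loads both classes right now, at startup,
                // which also confirms that they are on the class path.
                Class<?> iface = Class.forName(entry.getKey());
                Class<?> impl = Class.forName(entry.getValue());
                implByInterface.put(iface, impl);
            } catch (ClassNotFoundException e) {
                // A missing class is reported immediately, before any work is done.
                throw new IllegalStateException("Cannot load " + entry, e);
            }
        }
    }

    Class<?> implementationFor(Class<?> iface) {
        return implByInterface.get(iface);
    }

    // Number of interface/implementation pairs loaded at construction time.
    int loadedCount() {
        return implByInterface.size();
    }
}
```

With this shape, an application that only ever asks for one interface still pays for loading every class named in the configuration.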

I see two possible approaches to improving this: one is having users who
are memory-sensitive learn more about the factory configuration file,
and have them remove parts of it that are for things they won't be
using.  I don't much like this approach - it's error-prone, especially
over time...

The other approach is to modify the way the factory configuration does
its resolution to make it lazy - for instance, changing it so that only
on first reference to an interface would the corresponding class be
loaded.  This has a potential issue where the failure to find a
particular needed implementation in the class path might happen later in
a run, rather than at the start, but I don't think that's a serious
drawback, compared to the potential footprint reduction.
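
A rough sketch of what lazy resolution could look like, again with hypothetical names (LazyFactoryRegistry, implementationFor, resolvedCount) and a plain Map standing in for the factory configuration file. It is meant only to show the tradeoff, not to propose the actual change: startup just records names, and Class.forName runs the first time an interface is requested.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch, not the real UIMA implementation.
class LazyFactoryRegistry {
    // Interface name -> implementation class name, kept as strings only.
    private final Map<String, String> implNameByInterfaceName;
    // Implementation classes that have actually been requested and loaded.
    private final Map<String, Class<?>> resolved = new ConcurrentHashMap<>();

    LazyFactoryRegistry(Map<String, String> config) {
        // No class loading happens here; a bad entry goes unnoticed for now.
        this.implNameByInterfaceName = new HashMap<>(config);
    }

    Class<?> implementationFor(Class<?> iface) {
        // The implementation is loaded on first reference, then cached.
        return resolved.computeIfAbsent(iface.getName(), name -> {
            String implName = implNameByInterfaceName.get(name);
            if (implName == null) {
                throw new IllegalArgumentException("No implementation configured for " + name);
            }
            try {
                return Class.forName(implName);
            } catch (ClassNotFoundException e) {
                // This is the late failure mentioned above: it surfaces
                // mid-run rather than at startup.
                throw new IllegalStateException("Cannot load " + implName, e);
            }
        });
    }

    // Number of implementations loaded so far.
    int resolvedCount() {
        return resolved.size();
    }
}
```

In this sketch a registry built from a configuration with a missing implementation class constructs without error, and the problem only shows up when that particular interface is first requested.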

What do others think?

