uima-dev mailing list archives

From Marshall Schor <...@schor.com>
Subject Re: Configuration parameters (was Working on a new API to enable creation of UIMA AS deployment descriptors programmatically)
Date Wed, 22 Feb 2012 18:50:32 GMT
I've collected the various thoughts about configuration parameters (of all 
kinds) into a Wiki page, here:
https://cwiki.apache.org/confluence/display/UIMA/Configuring+UIMA+Pipelines+for+a+particular+run

I've included a straw man set of choices for a particular augmentation of UIMA 
in this regard.

Comments appreciated!

-Marshall

On 2/6/2012 6:14 PM, Burn Lewis wrote:
> I'd like to revive our discussion of August 2011.
>
> The basic goal is to improve the existing envVarRef way of parameterizing
> descriptor configuration parameters, which currently allows the values of
> system properties to be embedded as part of the value of a parameter, e.g.
>
>    <nameValuePair>
>        <name>Threshold</name>
>        <value>
>            <string>sometext<envVarRef>someEnvironmentParameter</envVarRef>moretext</string>
>        </value>
>    </nameValuePair>
>
>
> Although this is flexible, allowing multiple replacements, it is not
> supported by the CDE since it is evaluated when the descriptor is parsed.
>
> One suggestion is to add a new element under <nameValuePair> that specifies
> the external name of the parameter, e.g.
>
>    <nameValuePair>
>        <name>Threshold</name>
>        <value>
>            <propertyRef>Filter.threshold</propertyRef>
>        </value>
>    </nameValuePair>
>
> and it has been pointed out that we could allow a default value to be
> defined if the appropriate string, integer, float or boolean element were
> also specified.
>
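
To make the default-value idea concrete, here is a minimal sketch of how such a lookup might behave: the external property wins when it is set, otherwise the value given in the descriptor is used. The class and method names are hypothetical and are not part of the UIMA API; only java.util.Properties is real.

```java
import java.util.Properties;

// Hypothetical helper for illustration only; not part of the UIMA API.
class PropertyRefSketch {

    // Returns the external property named by propertyRef if it is set,
    // otherwise the default value that was written in the descriptor.
    static String resolve(Properties external, String propertyRef, String descriptorDefault) {
        return external.getProperty(propertyRef, descriptorDefault);
    }

    public static void main(String[] args) {
        Properties external = new Properties();
        external.setProperty("Filter.threshold", "0.75");

        // Set externally, so the external value is used.
        System.out.println(resolve(external, "Filter.threshold", "0.5")); // prints 0.75
        // Not set externally, so the descriptor default is used.
        System.out.println(resolve(external, "Filter.limit", "10"));      // prints 10
    }
}
```

The result is still a string here; converting it to the declared parameter type (Float, Integer, Boolean) would be a separate step.
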
> I have used the envVarRef feature in a couple of applications and found it
> convenient to include the full- or nick-name of the annotator in the
> variable name, e.g. 'com.acme.engine.filter.threshold' or
> 'Filter.threshold'.  When these are passed in a properties file to UIMA
> they can be viewed as global parameters for an application, and hence can
> be thought of as alternative "global" names for the configuration
> parameters.  From this viewpoint it might be more appropriate to specify
> the name in the parameter declaration section of the descriptors, alongside
> its "local" name, e.g.
>
>    <configurationParameter>
>        <name>Threshold</name>
>        <globalName>Filter.threshold</globalName>
>        <description>Score threshold for filter</description>
>        <type>Float</type>
>    </configurationParameter>
>
> One benefit here is that the settings section for a mandatory parameter
> could be omitted if it were overridden by an aggregate or defined globally.
>
> In the earlier discussions it was suggested that we support more complex
> substitutions in the propertyRef element, e.g.
>
> <propertyRef>${baseDirectory}/a/b/${model}</propertyRef>
>
> I no longer think this complexity is very useful or desirable. With
> parameters specified in a properties file, reconstructed in a descriptor,
> and then used in code, I think we have too many ways to modify parameters,
> and for clarity I think a one-to-one relationship between external
> properties and internal parameters is preferable.  Note that such
> construction can be done with envVarRef if needed.
>
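
For reference, the ${name} style of substitution discussed above would amount to roughly the following. This is only a sketch of the general technique, not a proposed implementation: the class name is hypothetical, and failing on an undefined property is an assumption, since the thread does not say what should happen in that case.

```java
import java.util.Properties;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; not part of the UIMA API.
class SubstitutionSketch {

    // Matches ${name} and captures the name.
    private static final Pattern REF = Pattern.compile("\\$\\{([^}]+)\\}");

    // Replaces every ${name} in the template with the value of that property.
    static String expand(String template, Properties props) {
        Matcher m = REF.matcher(template);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String name = m.group(1);
            String value = props.getProperty(name);
            if (value == null) {
                // Assumption: an undefined reference is treated as an error.
                throw new IllegalArgumentException("Undefined property: " + name);
            }
            // quoteReplacement keeps '$' and '\' in the value literal.
            m.appendReplacement(out, Matcher.quoteReplacement(value));
        }
        m.appendTail(out);
        return out.toString();
    }
}
```

With baseDirectory=/data and model=en, the template in the example expands to /data/a/b/en. Even this small version has to decide about undefined names and escaping, which is part of the complexity argued against above.
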
> How should we handle arrays (if at all)?  Fixed-size ones could have
> multiple entries, e.g.
>
>    <nameValuePair>
>        <name>Names</name>
>        <array>
>            <value><propertyRef>Filter.name1</propertyRef></value>
>            <value><propertyRef>Filter.name2</propertyRef></value>
>        </array>
>    </nameValuePair>
>
>
> Variable-size ones could be set from a comma-separated list but might
> require a new <list> element in the settings, while a list could be the
> expected format for multiValued parameters with a <globalName> element.
>
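
As a rough illustration of the comma-separated form for variable-size arrays, a property value could be split as below. The class name is hypothetical, and trimming whitespace and dropping empty entries are assumptions rather than anything settled in this thread.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only; not part of the UIMA API.
class ListValueSketch {

    // Splits a comma-separated property value into array elements,
    // trimming whitespace and skipping empty entries (both assumptions).
    static String[] parseList(String raw) {
        List<String> items = new ArrayList<>();
        for (String part : raw.split(",")) {
            String trimmed = part.trim();
            if (!trimmed.isEmpty()) {
                items.add(trimmed);
            }
        }
        return items.toArray(new String[0]);
    }
}
```

One consequence of this format is that a value which itself contains a comma cannot be expressed without some escaping rule.
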
> ~Burn
>
