uima-dev mailing list archives

From Peter Klügl <pklu...@uni-wuerzburg.de>
Subject Re: UIMA Ruta next steps
Date Tue, 07 Jan 2014 10:13:47 GMT
Hi,

I wonder if the next version should be 2.2.0 instead of 2.1.1, since the
new import syntax and functionality are not a small change and the
improvements in UIMA-2332 may have an obvious impact for users.

Any opinions?

Peter

On 19.12.2013 15:28, Peter Klügl wrote:
> Hi,
>
> I just want to start a discussion about the next release and maybe
> interesting directions for extensions.
>
> I am planning a bugfix release for the end of January, UIMA Ruta version
> 2.1.1
>
> List of the 26 already resolved issues for 2.1.1:
> https://issues.apache.org/jira/browse/UIMA-3342?jql=project%20%3D%20UIMA%20AND%20fixVersion%20%3D%20%222.1.1ruta%22%20AND%20component%20%3D%20ruta%20AND%20status%20in%20(Resolved%2C%20Closed)%20ORDER%20BY%20priority%20DESC
>
> List of currently unresolved issues:
> https://issues.apache.org/jira/browse/UIMA-2982?jql=project%20%3D%20UIMA%20AND%20resolution%20%3D%20Unresolved%20AND%20component%20%3D%20ruta%20ORDER%20BY%20priority%20DESC
>
> I think the following issues should (at least) be resolved in addition
> for 2.1.1 (some of them are already fixed, but the documentation is not
> yet up-to-date):
> - UIMA-3137: Cleanup Ruta launch configuration tabs
> - UIMA-3471: Arrays in Annotation Browser View
> - UIMA-3347: Ruta: Missing False Positives in "Annotation Test" view
> - UIMA-3286: Start anchor after optional literal
> - UIMA-3280: Option to specify vm arguments for Ruta launch config
> - UIMA-3283: Matching reference pointing outside of current window
> - UIMA-3303: Add a way to alias types in RUTA (e.g. "IMPORT type AS alias")
> - UIMA-3495: Report ambiguous types in Ruta Editor
> - UIMA-3441: Ruta: Extend classpath for Annotation Test run
> - UIMA-3469: Ruta: Annotation Browser View Extensions
> - UIMA-3275: Minor discrepancies in license and notice files
> - UIMA-3309: Ruta: Filter file names in Query View
> - UIMA-3485: Ruta: Workbench extension point for "Script execution finished"
>
> Maybe the issues for dropins-support should also be included.
>
> Are there any wishes or opinions on which other issues should be
> included?
>
> ###
>
> Here are a few ideas of major changes for a 2.2.x or 3.x release:
>
> 1. Making UIMA Ruta faster
> There are four aspects that can be considered:
> a) Parallelization/Scale-Out, already supported by UIMA-AS and friends
> b) Improvements in the current implementation. I know of at least four
> code fragments that can be improved
> c) Add new language constructs that are simply faster in some
> situations. I am thinking of an FST implementation similar to JAPE Plus
> and of an extension of the dynamic anchoring towards the operator plan
> optimization of SystemT
> d) Write faster rules. Some rules are just faster than others. This
> leads to a cookbook for best practices
>
> 2. Improve support for coreference information
> There are some nice ideas of unification-based grammars that can be
> included in the rule language. It does not have to be as mature as in
> SProUT, but maybe something like in CAFETIERE. This would automatically
> solve the restriction of value assignments in actions vs conditions
>
> 3. Support arbitrary CAS collections in the Ruta Workbench
> The Workbench currently only supports normal xmi files. There is no
> concept of a collection reader or anything similar. It might be nice
> for some users if the Workbench could operate on CASs stored in a
> database or supplied by any collection reader.
>
> 4. Actually useful rule induction algorithm
> After about six implementations of supervised rule learners, I think I
> have an idea of the layout of an actually useful algorithm for Ruta. I
> think it's also time to adapt some ideas from semi-supervised machine
> learning for rule-based systems.
>
> 5. Support generic type systems in the Workbench
> Sometimes you cannot avoid specifying the semantics of an annotation in
> the feature values instead of in the type. However, most of the tooling
> will not be as useful then, e.g., the Annotation Browser view shows only
> one type with a lot of annotations. There should be some additional,
> configurable views that support those situations.
>
>
> All opinions or wishes are welcome :-)
>
> Best,
>
> Peter
>

