uima-dev mailing list archives

From Richard Eckart de Castilho <...@apache.org>
Subject Re: [VOTE] Release Apache UIMA Ruta 2.4.0 RC3
Date Mon, 08 Feb 2016 09:44:29 GMT
On 08.02.2016, at 10:11, Peter Klügl <peter.kluegl@averbis.com> wrote:
> Hi,
> On 07.02.2016, at 19:52, Richard Eckart de Castilho wrote:
>> Checks:
>> - compared POMs in 2.3.0 svn tag against 2.4.0 tag: no new dependencies - OK
>> - the FirstNames.txt file in GermanNovels is quite large (90k), but no source
>> info/license for this file is given anywhere: doesn't seem OK
>> - stopping checks at this point for the moment
> What kind of source info/license would you expect? The file together
> with the other files was contributed as part of UIMA-3926 with an ICLA
> present. I do not remember whether I knew the source of the file at the
> time, but I remember discussing with the contributor that the files
> need to be OK for a contribution. That's the reason why the
> test/dev data was not contributed: it had some CC license that was
> problematic.

The other dev/test data doesn't seem problematic at all, but the 90k names
file seems non-trivial. If it were CC, the license would need to be mentioned
in a LICENSE.txt file. My suggestion would be to simply strip the file down
to the names needed for the example.
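
To illustrate the stripping suggestion, here is a minimal sketch, assuming the example texts are available as plain strings and the list holds one single-token name per entry. The function name `names_needed` and the sample data are hypothetical and not part of the Ruta project.

```python
import re

def names_needed(all_names, documents):
    """Return the entries of all_names that occur as a token in any document.

    Order of the original list is preserved. Multi-word names are not
    handled in this sketch.
    """
    tokens = set()
    for doc in documents:
        # \w+ matches Unicode word characters, so umlauts are kept intact.
        tokens.update(re.findall(r"\w+", doc))
    return [name for name in all_names if name in tokens]

# Hypothetical sample data, for illustration only.
names = ["Anna", "Bernd", "Clara", "Dieter"]
docs = ["Anna sah Clara am Fenster.", "Clara lachte."]
reduced = names_needed(names, docs)
```

The reduced list would then replace the full 90k file in the example project, which sidesteps the provenance question for the unused entries.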

>> Questions:
>> - several files in the GermanNovels resources have the first word duplicated,
>> they also start with a BOM - necessary?
> The BOM is the reason why the wordlists contain duplicate entries in the
> beginning. This is an open issue [UIMA-3778]. The BOM is not necessary,
> but was simply not removed.

Ok, so this is on the radar.
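
For reference, a minimal sketch of why a leading BOM can make the first entry of a word list look different from a plain copy of the same word, and how decoding with a BOM-aware codec avoids that. It assumes the lists are UTF-8 with one entry per line. The helper `read_wordlist` and the sample bytes are hypothetical, and this is not how Ruta itself loads word lists.

```python
import codecs

def read_wordlist(raw: bytes) -> list[str]:
    """Decode a UTF-8 word list, dropping a leading BOM and blank lines."""
    # "utf-8-sig" consumes a leading BOM if present and is a no-op otherwise.
    text = raw.decode("utf-8-sig")
    return [line.strip() for line in text.splitlines() if line.strip()]

# Hypothetical file content: BOM, then the first word listed twice.
with_bom = codecs.BOM_UTF8 + "Anna\nAnna\nBernd\n".encode("utf-8")

# Plain "utf-8" keeps the BOM as U+FEFF glued to the first entry,
# and str.strip() does not remove it.
naive = [line.strip() for line in with_bom.decode("utf-8").splitlines()]

clean = read_wordlist(with_bom)
# Once the BOM is gone, the duplicated first entry can be dropped safely.
deduplicated = list(dict.fromkeys(clean))
```

With the naive decoding the first two entries differ only by the invisible U+FEFF character, which would explain why the first word was repeated in the files.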

>> - is it necessary that the GermanNovels example contains
>> GeneratedDKPRoCoreTypes.xml - can these not be obtained through Maven?
>> If it is necessary, provenance information would be good.
> It was necessary when the rules were contributed, but it would be
> possible now with some new features. I do not have the time to upgrade
> the project (its priority is too low and it would require changing the
> tutorial). I could add provenance information. I assume that it should
> be in a README file but not in the NOTICE file, or is there an issue
> concerning the DKPro type systems?

The DKPro Core type systems are covered by ASL (although for some reason
there are no ASL headers in the original DKPro Core XML files...). So
in principle there is no problem, and because the original files don't have
ASL headers, they also were not stripped by the aggregation process - again
no problem.

My understanding is that, if we stick strictly to the Apache rules, the contents
of the NOTICE of the DKPro Core artifacts from which the types were
obtained would need to be copied into a NOTICE within the examples
project. If Ruta could obtain the types directly from the Maven artifacts,
the types file and NOTICE inclusion would not be necessary.

If we all agree that this should be fixed for the next release after 2.4.0,
that would be OK for me. I am making these comments definitely with the Apache
hat on, not with the DKPro Core hat on.

>> Comments:
>> - tutorial-GermanNovels.tex is written in German, not English. 
> That was the target group of the tutorial and there was no volunteer to
> translate it.


>> So much so far.
>> Cheers,
>> -- Richard
>>> On 29.01.2016, at 11:25, Peter Klügl <peter.kluegl@averbis.com> wrote:
>>> Hi,
>>> the third release candidate of Apache UIMA Ruta v2.4.0 is ready for voting.
>>> Changes rc2 -> rc3:
>>> - UIMA-4758 - Ruta: reluctant qualifier right to left lookahead to
>>> literal string expression matcher
>>> - UIMA-4768 - Ruta: generic argument for aliases type interpreted as
>>> generic feature expression
>>> Changes rc1 -> rc2:
>>> - UIMA-4760 - Ruta: duplicate verbalization of type in type matcher
>>> General information:
>>> This release contains many nice and useful new features and additionally
>>> fixes many annoying bugs. Here's a short overview of the main changes:
>>> - Explicit referencing of annotations with variables, labels and addresses
>>> - Helper methods for applying rules directly in Java code
>>> - Macros for conditions and actions (prototypical)
>>> - Limited support of UIMA arrays (prototypical)
>>> - New action for splitting annotations
>>> - New block for resetting match context
>>> - Import of uimaFIT analysis engines with mandatory parameters
>>> - Many, many bug fixes and other improvements
>>> Staging repository:
>>> https://repository.apache.org/content/repositories/orgapacheuima-1083/
>>> SVN tag:
>>> https://svn.apache.org/repos/asf/uima/ruta/tags/ruta-2.4.0
>>> Update site:
>>> https://dist.apache.org/repos/dist/dev/uima/ruta-2.4.0-rc3/eclipse-update-site/ruta/
>>> Archive with all sources:
>>> https://dist.apache.org/repos/dist/dev/uima/ruta-2.4.0-rc3/source-release/ruta-2.4.0-source-release.zip
>>> Overall 52 issues have been fixed for this release (one of them with
>>> "Cannot Reproduce").
>>> They can be found in the RELEASE_NOTES.html.
>>> ... and here:
>>> https://issues.apache.org/jira/issues/?filter=12333870&jql=project%20%3D%20UIMA%20AND%20status%20in%20(Open%2C%20%22In%20Progress%22%2C%20Reopened%2C%20Resolved%2C%20Closed)%20AND%20fixVersion%20%3D%202.4.0ruta%20AND%20component%20%3D%20ruta%20ORDER%20BY%20priority%20DESC%2C%20updated%20ASC%2C%20created%20DESC
>>> Please vote on release:
>>> [ ] +1 OK to release
>>> [ ]  0 Don't care
>>> [ ] -1 Not OK to release, because ...
>>> Thanks.
>>> Peter
