uima-dev mailing list archives

From Peter Klügl <peter.klu...@averbis.com>
Subject Re: [VOTE] Release Apache UIMA Ruta 2.4.0 RC3
Date Mon, 08 Feb 2016 10:11:41 GMT

On 08.02.2016, at 10:44, Richard Eckart de Castilho wrote:
> On 08.02.2016, at 10:11, Peter Klügl <peter.kluegl@averbis.com> wrote:
>> Hi,
>> On 07.02.2016, at 19:52, Richard Eckart de Castilho wrote:
>>> Checks:
>>> - compared POMs in 2.3.0 svn tag against 2.4.0 tag: no new dependencies - OK
>>> - the FirstNames.txt file in GermanNovels is quite large (90k), but no source info/license
>>> for this file is given anywhere: doesn't seem OK
>>> - stopping checks at this point for the moment
>> What kind of source info/license would you expect? The file together
>> with the other files was contributed as part of UIMA-3926 with an ICLA
>> present. I do not remember if I knew the source of the file at the time, but
>> I remember that I had some conversations with the contributor about the
>> files needing to be OK for a contribution. That's the reason why the
>> test/dev data was not contributed: it had some CC license that was
>> problematic.
> The other dev/test data doesn't seem problematic at all, but the 90k names
> file seems non-trivial. If it were CC, the license would need to be mentioned
> in a LICENSE.txt file. My suggestion would be to simply strip the file down
> to the names needed for the example.

If I have to guess I'd say that the names have been crawled and that
there is no original source file with a specific license.

The novels had the CC license last time I checked. I do not remember
all the details, but when I looked it up in Apache's third party pages, it
indicated that it was not possible to include them. However, I could have
been wrong.

Hmm... it depends on what is needed for the example. The initial example
consisted of 10-20 novels. I could strip it down to the first names of one
novel that I remember being part of the dev set, but is that really necessary?

>>> Questions:
>>> - several files in the GermanNovels resources have the first word duplicated,
>>> they also start with a BOM - necessary?
>> The BOM is the reason why the wordlists contain duplicate entries in the
>> beginning. This is an open issue [UIMA-3778]. The BOM is not necessary,
>> but was simply not removed.
> Ok, so this is on the radar.
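
To illustrate the BOM point above: this is only a rough sketch, not code from Ruta or from UIMA-3778. It assumes the word lists are UTF-8 files with one entry per line; the function name `read_wordlist` and the sample names are made up for the example. Decoding with Python's `utf-8-sig` codec drops a leading BOM, so the first entry no longer differs from its duplicate, and the duplicates can then be removed.

```python
import codecs


def read_wordlist(data: bytes) -> list[str]:
    """Decode a UTF-8 word list, dropping a leading BOM and duplicate entries.

    Hypothetical helper for illustration; not part of Ruta.
    """
    # "utf-8-sig" strips one leading BOM if present. Plain "utf-8" would
    # keep it as U+FEFF glued to the first word.
    text = data.decode("utf-8-sig")
    seen = set()
    words = []
    for line in text.splitlines():
        word = line.strip()
        # Skip empty lines and entries that were already added.
        if word and word not in seen:
            seen.add(word)
            words.append(word)
    return words


# Made-up sample: a BOM, then the first word twice, as described for the
# GermanNovels word lists.
sample = codecs.BOM_UTF8 + "Anna\nAnna\nBerta\n".encode("utf-8")

# With plain UTF-8 decoding the first entry carries the BOM character,
# so it does not equal the second entry.
naive_first = sample.decode("utf-8").splitlines()[0]

cleaned = read_wordlist(sample)
```

Here `naive_first` is `"\ufeffAnna"`, which would not match a plain `"Anna"` in the text, while `cleaned` contains each name once.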
>>> - is it necessary that the GermanNovels example contains GeneratedDKPRoCoreTypes.xml
>>> - can these not be obtained through Maven? If it is necessary, provenance information would
>>> be good.
>> It was necessary when the rules were contributed, but it would be
>> possible now with some new features. I do not have the time to upgrade
>> the project (its priority is too low and it would require changing the
>> tutorial). I could add provenance information. I assume that it should
>> be in a README file but not in the NOTICE file, or is there an issue
>> concerning the DKPro type systems?
> The DKPro Core type systems are covered by ASL (although for some reason
> there are no ASL headers in the original DKPro Core XML files...). So
> in principle there is no problem, and because the original files don't have
> ASL headers, they also were not stripped by the aggregation process - again
> no problem.
> My understanding is that staying strict to the Apache rules, the contents
> of the NOTICE of the DKPro Core artifacts from which the types were
> obtained would need to be copied into a NOTICE within the examples
> project. If Ruta could obtain the types directly from the Maven artifacts,
> the types file and NOTICE inclusion would not be necessary.
> If we all agree that this should be fixed for the next release after 2.4.0,
> that would be ok for me. I am making these comments definitely with the Apache
> hat on, not with the DKPro Core hat on.

I created an issue for it: https://issues.apache.org/jira/browse/UIMA-4789

>>> Comments:
>>> - tutorial-GermanNovels.tex is written in German, not English.
>> That was the target group of the tutorial and there was no volunteer to
>> translate it.
> Ok.
>>> So much so far.
>>> Cheers,
>>> -- Richard
>>>> On 29.01.2016, at 11:25, Peter Klügl <peter.kluegl@averbis.com> wrote:
>>>> Hi,
>>>> the third release candidate of Apache UIMA Ruta v2.4.0 is ready for voting.
>>>> Changes rc2 -> rc3:
>>>> - UIMA-4758 - Ruta: reluctant qualifier right to left lookahead to
>>>> literal string expression matcher
>>>> - UIMA-4768 - Ruta: generic argument for aliases type interpreted as
>>>> generic feature expression
>>>> Changes rc1 -> rc2:
>>>> - UIMA-4760 - Ruta: duplicate verbalization of type in type matcher
>>>> General information:
>>>> This release contains many nice and useful new features and additionally
>>>> fixes many annoying bugs. Here's a short overview of the main changes:
>>>> - Explicit referencing of annotations with variables, labels and addresses
>>>> - Helper methods for applying rules directly in Java code
>>>> - Macros for conditions and actions (prototypical)
>>>> - Limited support of UIMA arrays (prototypical)
>>>> - New action for splitting annotations
>>>> - New block for resetting match context
>>>> - Import of uimaFIT analysis engines with mandatory parameters
>>>> - Many, many bug fixes and other improvements
>>>> Staging repository:
>>>> https://repository.apache.org/content/repositories/orgapacheuima-1083/
>>>> SVN tag:
>>>> https://svn.apache.org/repos/asf/uima/ruta/tags/ruta-2.4.0
>>>> Update site:
>>>> https://dist.apache.org/repos/dist/dev/uima/ruta-2.4.0-rc3/eclipse-update-site/ruta/
>>>> Archive with all sources:
>>>> https://dist.apache.org/repos/dist/dev/uima/ruta-2.4.0-rc3/source-release/ruta-2.4.0-source-release.zip
>>>> Overall 52 issues have been fixed for this release (one of them with
>>>> "Cannot Reproduce").
>>>> They can be found in the RELEASE_NOTES.html.
>>>> ... and here:
>>>> https://issues.apache.org/jira/issues/?filter=12333870&jql=project%20%3D%20UIMA%20AND%20status%20in%20(Open%2C%20%22In%20Progress%22%2C%20Reopened%2C%20Resolved%2C%20Closed)%20AND%20fixVersion%20%3D%202.4.0ruta%20AND%20component%20%3D%20ruta%20ORDER%20BY%20priority%20DESC%2C%20updated%20ASC%2C%20created%20DESC
>>>> Please vote on release:
>>>> [ ] +1 OK to release
>>>> [ ]  0 Don't care
>>>> [ ] -1 Not OK to release, because ...
>>>> Thanks.
>>>> Peter
