uima-dev mailing list archives

From Nicolas Hernandez <nicolas.hernan...@gmail.com>
Subject Re: Guidelines for a mutual contribution
Date Wed, 18 May 2011 15:04:25 GMT
Dear All,

I am coming back to this thread one year later...

As a reminder, we used a French Treebank corpus
(http://www.llf.cnrs.fr/Gens/Abeille/French-Treebank-fr.php) to train
models for processing French with the HMM tagger addon.
I first contacted you for advice because we did not own the
resource we used and were not sure we were allowed to distribute our
models under the Apache License. We discussed this with the
resource owner and thought that an alternative way to distribute the
models we trained could be to submit them jointly.

Eventually, the owner granted us permission to distribute the models
we built under the Apache License v2.

In short, we built French models for part-of-speech (pos),
morphological (mph) and grammatical function (fct) tagging, as well as
lemmatization (lemma). We use the HMM tagger to perform the various
tagging tasks. A patch was recently submitted to make the HMM tagger
less dependent on the type system.
See https://issues.apache.org/jira/browse/UIMA-2110

Before submitting the models to the project, I have some new
questions. As researchers, it is important to us that our work be
cited by other researchers. In addition, although the models are only a
few files, they represent a substantial contribution to the French
Natural Language Processing community.

So I was wondering whether you still advise me to perform the IP
clearance procedure or just to add a specific mention in the NOTICE
file.

In the first case, could you find me an "appropriate volunteer" to
carry out the IP clearance process?

Another "substantial" question: our model files take about 5 MB
each (pos, mph and fct), except the lemma model file, which takes 24 MB.
Alternatively, we built a merged model for pos, mph and fct, which
takes 6.9 MB. Do you think it would cause a problem if we submitted all
of them?

Best regards

/Nicolas

---------- Forwarded message ----------
From: Nicolas Hernandez <nicolas.hernandez@gmail.com>
Date: Thu, Nov 4, 2010 at 11:28 AM
Subject: Re: Guidelines for a mutual contribution
To: dev@uima.apache.org


Thilo, we would like to submit a language model which was trained on a
French Treebank corpus for the tagger addon. We do not own the
treebank corpus we used. We are in discussion with its owner to find out
whether distributing a model built on it under the Apache License would
still comply with the treebank license.
We thought that an alternative way to distribute the model we trained
could be to submit the model jointly with the owner of the treebank.

Marshall, I will consult all the links you mention and come back if necessary.

Thanks

On Thu, Nov 4, 2010 at 11:06 AM, Marshall Schor <msa@schor.com> wrote:
>
>
> On 11/4/2010 5:06 AM, Nicolas Hernandez wrote:
>> Hi
>>
>> Can someone tell me where to find guidelines for committing a
>> joint contribution? In other words, how should we proceed when two
>> developers or corporations are involved in a work they would like to
>> commit?
>> I cannot find any information on this subject on
>> http://www.apache.org/licenses/ or on
>> http://uima.apache.org/contribution-policy.html
>>
>> Do we each have to submit an "Individual Contributor License
>> Agreement" to the ASF
>
> Each person has to have an "Individual Contributor License Agreement" on file
> with the ASF (and, if appropriate, a Corporate Contributor License Agreement;
> see http://www.apache.org/licenses/ and search for Corporate CLA).
>
> When you post the contribution, attach it to a Jira and state in the Jira itself
> what you are doing, including granting the ASF a license under the Apache
> Software License version 2.0.
>
> If the contribution represents "substantial" work developed outside of the ASF's
> normal process, it will need to go through the IP clearance process, as Tommaso
> described.
>>  and specify clearly in the NOTICE file of our
>> contribution the complete attribution ?
>
> Here's info on what goes in the NOTICE file:
>
> http://www.apache.org/legal/src-headers.html#notice
>
> and here's a link which says that the ASF prefers if the contributors do not put
> individual copyright statements into the file:
>
> http://www.apache.org/dev/apply-license.html#contributor-copyright - linking to
> this in particular about moving existing copyright from source into the Notice file:
>
> http://www.apache.org/legal/src-headers.html#header-existingcopyright
>
> Does this answer your question?
>
> -Marshall Schor
>> Thanks in advance
>>
>> /Nicolas
>>
>



--
Nicolas.Hernandez@univ-nantes.fr
--
http://enicolashernandez.blogspot.com
http://www.univ-nantes.fr/hernandez-n
--
# Laboratoire LINA-TALN CNRS UMR 6241
tel. +33 (0)2 51 12 58 55
# Université de Nantes - Institut Universitaire de Technologie -
Département Informatique
tel. +33 (0)2 40 30 60 67



-- 
nicolas.hernandez@univ-nantes.fr
#
http://enicolashernandez.blogspot.com
http://www.univ-nantes.fr/hernandez-n
#
Laboratoire LINA-TALN CNRS UMR 6241
tel. +33 (0)2 51 12 58 55
#
Université de Nantes - Institut Universitaire de Technologie -
Département Informatique
tel. +33 (0)2 40 30 60 67
