uima-dev mailing list archives

From Tommaso Teofili <tommaso.teof...@gmail.com>
Subject Re: Guidelines for a mutual contribution
Date Thu, 19 May 2011 07:00:23 GMT
Hello Nicolas,

2011/5/18 Nicolas Hernandez <nicolas.hernandez@gmail.com>

> Dear All,
>
> I come back one year later...
>
> To remind you, we used a French Treebank corpus
> (http://www.llf.cnrs.fr/Gens/Abeille/French-Treebank-fr.php) to train
> models for processing French with the HMM tagger addon.
> I first contacted you for some advice, since we did not own the
> resource we used and we were not sure we were allowed to distribute our
> models under the Apache license. We were discussing this with the
> resource owner, and we thought that an alternative way to distribute the
> models we trained could be to submit them jointly.
>
> Eventually, the owner granted us permission to distribute the models
> we built under the Apache License v2.
>
> In short, we built French models for part-of-speech (pos),
> morphological (mph) and grammatical function (fct) tagging, as well as
> lemmatization (lemma). We use the HMM tagger to perform the various
> tagging tasks. A patch was recently submitted to make the HMM tagger
> less dependent on the type system.
> See https://issues.apache.org/jira/browse/UIMA-2110


As I said in my last comment, it would also be nice to see some documentation on
how you created the models, so that more users can create models with it.


>
>
> Before submitting the models to the project, I have some new
> questions. As researchers, it is important to us that our work be
> cited by other researchers. In addition, the models are only a few
> files, but they represent a substantial contribution to the French
> Natural Language Processing community.
>
> So I was wondering whether you still advise me to perform the IP
> clearance procedure or just to add a specific mention in the NOTICE
> file.
>

If you also plan to donate the models I think the IP clearance is the right
way both for UIMA and for you as a researcher.


>
> In the first case, could you find me an "appropriate volunteer" for
> carrying out the IP clearance process?
>

I am working on the UIMA Addons RC2, so I can't do it right now, but if no one
is available before then, I could help you once the UIMA Addons release is
done.


>
> Another "substantial" question... our model files take about 5 MB
> each (pos, mph and fct), except the lemma model file, which takes 24 MB.
> Alternatively, we built a merged model for pos, mph and fct which
> takes 6.9 MB. Do you think it may cause a problem if we submit all of
> them?
>

I don't see any issue with those sizes so, in my opinion, the models can all
be submitted separately.
Regards,
Tommaso


>
> Best regards
>
> /Nicolas
>
> ---------- Forwarded message ----------
> From: Nicolas Hernandez <nicolas.hernandez@gmail.com>
> Date: Thu, Nov 4, 2010 at 11:28 AM
> Subject: Re: Guidelines for a mutual contribution
> To: dev@uima.apache.org
>
>
> Thilo, we would like to submit a language model which was trained on a
> French Treebank corpus for the tagger addon. We do not own the
> treebank corpus we used. We are in discussion with its owner to find out
> whether we still respect the treebank license by distributing a model built
> on it under the Apache License.
> We thought that an alternative way to distribute the model we trained
> could be to submit the model jointly with the owner of the treebank.
>
> Marshall, I will consult all the links you mention and come back if
> necessary.
>
> Thanks
>
> On Thu, Nov 4, 2010 at 11:06 AM, Marshall Schor <msa@schor.com> wrote:
> >
> >
> > On 11/4/2010 5:06 AM, Nicolas Hernandez wrote:
> >> Hi
> >>
> >> Can someone point me to some guidelines for committing a
> >> mutual contribution? In other words, how should we proceed when two
> >> developers or corporations are involved in a work they would like to
> >> commit?
> >> I did not find any information on this subject at
> >> http://www.apache.org/licenses/ or at
> >> http://uima.apache.org/contribution-policy.html
> >>
> >> Does each of us have to submit an "Individual Contributor License
> >> Agreement" to the ASF
> >
> > Each person has to have an "Individual Contributor License Agreement" on file
> > with the ASF (and, if appropriate, a Corporate Contributor License Agreement;
> > see http://www.apache.org/licenses/ and search for Corporate CLA).
> >
> > When you post the contribution, attach it to a Jira and state in the Jira itself
> > what you are doing, including granting the ASF a license under the Apache
> > Software License version 2.0.
> >
> > If the contribution represents "substantial" work developed outside of the ASF's
> > normal process, it will need to go through the IP clearance process, as Tommaso
> > described.
> >>  and specify clearly in the NOTICE file of our
> >> contribution the complete attribution ?
> >
> > Here's info on what goes in the NOTICE file:
> >
> > http://www.apache.org/legal/src-headers.html#notice
> >
> > and here's a link which says that the ASF prefers that contributors do not put
> > individual copyright statements into the file:
> >
> > http://www.apache.org/dev/apply-license.html#contributor-copyright - linking to
> > this in particular about moving existing copyright from source into the Notice file:
> >
> > http://www.apache.org/legal/src-headers.html#header-existingcopyright
> >
> > Does this answer your question?
> >
> > -Marshall Schor
> >> Thanks in advance
> >>
> >> /Nicolas
> >>
> >
>
>
>
> --
> Nicolas.Hernandez@univ-nantes.fr
> --
> http://enicolashernandez.blogspot.com
> http://www.univ-nantes.fr/hernandez-n
> --
> # Laboratoire LINA-TALN CNRS UMR 6241
> tel. +33 (0)2 51 12 58 55
> # Université de Nantes - Institut Universitaire de Technologie -
> Département Informatique
> tel. +33 (0)2 40 30 60 67
>
