commons-dev mailing list archives

From Mark Fortner <phidia...@gmail.com>
Subject Re: [Laboratory Toolkit] proposing a new Apache Commons component
Date Sat, 07 Dec 2013 14:39:18 GMT
You might also be interested in Apache UIMA, which is a popular text mining
platform.

Mark
On Dec 7, 2013 1:49 AM, "Valentin Waeselynck" <valentinwaeselynck@yahoo.fr>
wrote:

> Thanks to all for your interest!
>
> The code examples are on their way; I'm trying to make them as diverse as
> possible. I'll let you know as soon as they're ready.
>
>
> Thanks for telling me about Tika, Oliver, it's very interesting! An
> algorithm that tries to extract the meaning of a variety of documents could
> typically be a combination of Tika and the Laboratory Toolkit.
>
> However, the Laboratory Toolkit is less specialized (in fact, it's not
> specialized at all) and less concrete. In its genericity and in the nature
> of its benefits, it is similar to, for example, the Executor API in
> java.util.concurrent. Just as the Executor API lets you think about and
> design concurrent algorithms in terms of tasks and executors, the Laboratory
> Toolkit lets you think about and design some other kind of algorithms (I
> haven't found a satisfying description yet) in terms of analyses and
> laboratories.
>
> Bests,
>
>
> Valentin WAESELYNCK
> Third-year student at École Polytechnique
> valentin.waeselynck@polytechnique.edu
> +33 6 80 84 99 54
>
>
>
>
> On Friday, 6 December 2013 at 21:30, Oliver Heger
> <oliver.heger@oliver-heger.de> wrote:
>
> On 05.12.2013 13:44, Valentin Waeselynck wrote:
> > Hello, and pleased to meet you,
> >
> > Thank you for your answer.
> >
> > I just asked for confirmation, and I do have full intellectual property
> > on this software.
> >
> > About the use cases: no problem, I'll include some code samples. As a
> > foreword, let's say it provides a convenient API for creating all sorts of
> > custom "information extraction" algorithms.
> If the library is about information extraction, you may also want to
> have a look at the Apache Tika project [1].
>
> Oliver
>
> [1] http://tika.apache.org/
>
> >
> > As for the group of persons willing to maintain this: well, for the
> > moment, there is me. As this is quite a small toolkit, I think it's
> > sufficient, at least for a start.
> >
> > I'll start working towards the other requirements (Maven + test
> > coverage) right away and let you know as soon as it's ready.
> >
> >
> >
> > Should I keep replying to the whole ML about this, or only to you?
> >
> > Best regards,
> >
> >
> > Valentin WAESELYNCK
> > Third-year student at École Polytechnique
> > valentin.waeselynck@polytechnique.edu
> > +33 6 80 84 99 54
> >
> >
> >
> >
> > On Thursday, 5 December 2013 at 8:53, Benedikt Ritter <britter@apache.org>
> > wrote:
> >
> > Bonjour Valentin,
> >
> >
> > welcome to the ML. Good to hear that you've decided to join the open
> > source movement.
> >
> > First of all, it would really help if you could elaborate on some use cases
> > for your library. You're talking about building algorithms. What kind of
> > algorithms can be built with Laboratory Toolkit? Can you give some code
> > examples (just create some gists on GitHub that show the use of
> > Laboratory Toolkit)?
> >
> > There is an important requirement for any code to be incorporated into the
> > Apache code base:
> > - the intellectual property (IP) of the code has to be owned completely by
> > the contributor. You said that you've built the Laboratory Toolkit for a
> > research project. Are you sure that you own the code? Or is it the result
> > of your work and thus owned by your employer?
> >
> > At Commons we have some additional requirements:
> > - There should be a group of people who are willing to maintain the code
> > - Commons components should in general not depend on any other libraries
> > - Commons uses Maven as the main build tool, so there should be a Maven
> > build available
> > - The code should have good test coverage
> >
> > You have to figure the IP issue out on your own first.
> > After that, if the community decides to accept this contribution, we can
> > work on the commons requirements.
> >
> > Best regards and thank you,
> > Benedikt
> >
> >
> >
> > 2013/12/4 Valentin Waeselynck <valentinwaeselynck@yahoo.fr>
> >
> >>   Hello to all,
> >>
> >> As part of a small research project (which combined techniques of
> >> text-mining, machine-learning and natural language generation, not that
> >> it's really relevant) I have come to design a small Java SE library, which
> >> I'm for the moment calling the Laboratory Toolkit, for developing our
> >> algorithms in a comfortable and flexible manner.
> >>
> >> I have found it to be quite generic and reusable, not tied to any
> >> application domain, while still being rather accessible, and small enough
> >> to comprehend easily. Therefore, I would like to propose it as a new
> >> Apache Commons component. I would be very grateful if one of you could
> >> tell me what steps I should follow for that purpose.
> >>
> >> I have uploaded it to GitHub:
> >> https://github.com/vvvvalvalval/Laboratory-Toolkit.git. There you may
> >> find the sources, the javadoc, and a small guide I have started to write
> >> for it (also attached to this mail).
> >>
> >> Of course, I am very open to feedback and criticism from you. The
> >> last thing I want is to publish an immature or useless component; nor do I
> >> take a positive answer from you for granted.
> >>
> >> If I have failed to follow the proper procedure to propose a new candidate
> >> component, it is not on purpose, and I apologize in advance.
> >>
> >> Whatever your reply, and since I have the chance, I would also like to
> >> congratulate you on all your work. The Apache Commons components have
> >> really been lifesavers to me, on many occasions.
> >>
> >> With best wishes,
> >>
> >> Valentin WAESELYNCK
> >> Third-year student at École Polytechnique
> >> valentin.waeselynck@polytechnique.edu
> >> +33 6 80 84 99 54
> >>
> >>
> >> ---------------------------------------------------------------------
> >> To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
> >> For additional commands, e-mail: dev-help@commons.apache.org
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
> For additional commands, e-mail: dev-help@commons.apache.org
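
For readers unfamiliar with the Executor API comparison made in the quoted
message, here is a minimal sketch of what "thinking in terms of tasks and
executors" looks like with java.util.concurrent. It illustrates only the
standard JDK API, not the Laboratory Toolkit, whose API is not shown in this
thread. The class name ExecutorSketch, the method wordCounts, and the
word-counting task are hypothetical and chosen purely for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical example class; not part of the Laboratory Toolkit.
public class ExecutorSketch {

    // Counts the words of each document, one task per document.
    static List<Integer> wordCounts(List<String> documents) {
        // The executor decides how and where the tasks run.
        ExecutorService executor = Executors.newFixedThreadPool(2);
        try {
            // Each task only describes a unit of work.
            List<Callable<Integer>> tasks = new ArrayList<>();
            for (String doc : documents) {
                tasks.add(() -> {
                    String trimmed = doc.trim();
                    return trimmed.isEmpty() ? 0 : trimmed.split("\\s+").length;
                });
            }
            // invokeAll returns the futures in the same order as the tasks.
            List<Integer> results = new ArrayList<>();
            for (Future<Integer> future : executor.invokeAll(tasks)) {
                results.add(future.get());
            }
            return results;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            executor.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(wordCounts(
                Arrays.asList("a popular text mining platform", "Laboratory Toolkit")));
    }
}
```

The calling code never creates or manages a thread: it describes the work as
Callable tasks and hands them to an ExecutorService. The quoted message
suggests that analyses and laboratories play a comparable pair of roles in
the Laboratory Toolkit.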
