uima-dev mailing list archives

From Jens Grivolla <j+...@grivolla.net>
Subject Re: CFP: Workshop on Open Infrastructures and Analysis Frameworks for HLT
Date Fri, 15 Aug 2014 05:56:47 GMT
The workshop program, along with links to the full papers, is now
available: http://glicom.upf.edu/OIAF4HLT/Program.html

I'm looking forward to seeing many of you there.  I'll be staying at DCU
(College Park).

-- Jens


On Tue, Jul 1, 2014 at 6:52 PM, Jens Grivolla <j+asf@grivolla.net> wrote:

> The list of accepted papers is now available:
> http://glicom.upf.edu/OIAF4HLT/Papers.html
>
> For anybody interested in attending the workshop and COLING, please
> remember that the early registration deadline is tomorrow, July 2nd.
>
> Looking forward to seeing many of you there...
>
> -- Jens
>
>
> On Wed, Mar 26, 2014 at 2:34 PM, Jens Grivolla <j+asf@grivolla.net> wrote:
>
>> Workshop on Open Infrastructures and Analysis Frameworks for HLT
>> ================================================================
>>
>> http://glicom.upf.edu/OIAF4HLT/
>>
>> At the 25th International Conference on Computational Linguistics (COLING
>> 2014)
>> Helix Conference Centre at Dublin City University (DCU)
>> 23-29 August 2014
>>
>> Description
>> -----------
>>
>> Recent advances in digital storage and networking, coupled with the
>> extension of human language technologies (HLT) into ever broader areas and
>> the persistence of difficulties in software portability, have led to an
>> increased focus on development and deployment of web-based infrastructures
>> that allow users to access tools and other resources and combine them to
>> create novel solutions that can be efficiently composed, tuned, evaluated,
>> disseminated and consumed. This in turn engenders collaborative development
>> and deployment among individuals and teams across the globe. It also
>> increases the need for robust, widely available evaluation methods and
>> tools, means to achieve interoperability of software and data from diverse
>> sources, means to handle licensing for limited access resources distributed
>> over the web, and, perhaps crucially, the need to develop strategies for
>> multi-site collaborative work.
>>
>> For many decades, NLP has suffered from low software engineering
>> standards causing a limited degree of re-usability of code and
>> interoperability of different modules within larger NLP systems. While this
>> did not really hamper success in limited task areas (such as implementing a
>> parser), it caused serious problems for building complex integrated
>> software systems, e.g., for information extraction or machine translation.
>> This lack of integration has led to duplicated software development,
>> work-arounds for programs written in different (versions of) programming
>> languages, and ad-hoc tweaking of interfaces between modules developed at
>> different sites.
>>
>> In recent years, two main frameworks, UIMA and GATE, have emerged that
>> aim to allow the easy integration of varied tools through common type
>> systems and standardized communication methods for components analysing
>> unstructured textual information, such as natural language. Both frameworks
>> offer a solid processing infrastructure that allows developers to
>> concentrate on the implementation of the actual analytics components. An
>> increasing number of members of the NLP community have adopted one of these
>> frameworks as a platform for facilitating the creation of reusable NLP
>> components that can be assembled to address different NLP tasks depending
>> on their order, combination and configuration. Analysis frameworks also
>> reduce the problem of reproducibility of NLP results by formalising
>> solution composition and making language processing tools shareable.
>>
>> Very recently, several efforts have been devoted to the development of
>> web service platforms for NLP. These platforms exploit the growing number
>> of web-based tools and services available for tasks related to HLT,
>> including corpus annotation, configuration and execution of NLP pipelines,
>> and evaluation of results and automatic parameter tuning. These platforms
>> can also integrate modules and pipelines from existing frameworks such as
>> UIMA and GATE, in order to achieve interoperability with a wide variety of
>> modules from different sources.
>>
>> Many of the issues and challenges surrounding these developments have
>> been addressed individually in particular projects and workshops, but there
>> are ramifications that cut across all of them. We therefore feel that this
>> is the moment to bring together participants representing the range of
>> interests that comprise the comprehensive picture for community-driven,
>> distributed, collaborative, web-based development and use of language
>> processing software and resources. This includes those engaged in
>> development of infrastructures for HLT as well as those who will use these
>> services and infrastructures, especially for multi-site collaborative work.
>>
>>
>> ### Workshop Objectives
>>
>> The overall goal of this workshop is to provide a forum for discussion of
>> the requirements for an envisaged open “global laboratory” for HLT research
>> and development and establish the basis of a community effort to develop
>> and support it. To this end, the workshop will include both presentations
>> addressing the issues and challenges of developing, deploying, and using
>> the global laboratory for distributed and collaborative efforts and
>> discussion that will identify next steps for moving forward, fostering
>> community-wide awareness, and establishing and encouraging communication
>> among the various players.
>>
>> It aims at bringing together members of the NLP community, specifically
>> users, developers or providers of components and tools for these
>> frameworks, in order to explore and discuss the opportunities and
>> challenges in using such platforms for modern, well-engineered NLP
>> applications.
>>
>> The challenges of creating reusable and interoperable components are of
>> particular interest and are affected by legal issues, such as potentially
>> incompatible licenses of components and tools, as well as by the technical
>> aspects of packaging and distribution of components. Also, tools are
>> important, for example to assemble complex processing pipelines, to manage
>> the bodies of data that are to be analysed and to visualize, explore, and
>> further deploy the analysis results. Further challenges are involved in
>> embedding framework-based analysis within applications or using it in
>> distributed computing scenarios, such as deployment of and access to
>> required resources. Finally, the preservation of analysis results, their
>> provenance and reproducibility are of particular interest to the scientific
>> user community.
>>
>> ### Topics
>>
>> Workshop topics include, but are not limited to:
>>
>> - processing of very large data collections: scale-out, parallelization,
>> and performance optimization
>> - advanced applications driven by an NLP framework
>> - sophisticated tools to build and manage complex processing pipelines
>> - analysis of results: exploration, evaluation, visualization, and
>> statistical analysis
>> - experience reports combining components from different sources, as well
>> as solutions to interoperability issues
>> - experience reports combining different frameworks (e.g.
>> GATE/UIMA/WebLicht/etc.)
>> - UIMA components with a special focus on genericity and type-system
>> independence
>> - repositories of ready-to-use components for UIMA and/or GATE
>> - distribution of components: documentation, licensing and packaging
>> - developing for UIMA or GATE: simplified APIs, debugging, unit testing,
>> and limitations of the frameworks
>> - combining annotation type systems in processing frameworks (GATE, UIMA,
>> etc.) with standardization efforts, such as those in the ISO TC37/SC4 or
>> TEI contexts
>> - use of NLP frameworks in real-world "industry" settings
>> - reports on current projects and frameworks, their challenges and
>> proposed or implemented solutions, including efforts to address
>> interoperability
>> - issues and challenges of multi-site collaborative projects, including
>> reports of implemented or proposed strategies
>> - pipeline management, including authentication, strategies for passing
>> resources through disparate tools and across hosting nodes, and licensing
>> - development and use of evaluation environments that facilitate
>> assessment of HLT component performance, iterative application development,
>> and replication of results
>> - community awareness and implementation of open infrastructures,
>> including how to engage the community, establish confidence in the process,
>> and promote use
>>
>> Dates
>> -----
>> Paper Submission Deadline: 2nd May 2014
>> Author Notification Deadline: 6th June 2014
>> Camera-Ready Paper Deadline: 27th June 2014
>> Workshop: 23rd August 2014
>>
>> Organisers
>> ----------
>> Nancy Ide
>> Department of Computer Science, Vassar College
>>
>> James Pustejovsky
>> Department of Computer Science, Brandeis University
>>
>> Eric Nyberg
>> Language Technologies Institute, School of Computer Science, Carnegie
>> Mellon University
>>
>> Christopher Cieri
>> Linguistic Data Consortium, University of Pennsylvania
>>
>> Jonathan Wright
>> Linguistic Data Consortium, University of Pennsylvania
>>
>> Jens Grivolla
>> GLiCom, Universitat Pompeu Fabra
>>
>> Kalina Bontcheva
>> Department of Computer Science, University of Sheffield
>>
>>
>
