lucene-java-user mailing list archives

From "Provalov, Ivan" <>
Subject Michigan Information Retrieval Enthusiasts Group Quarterly Meetup - May 19th 2011
Date Fri, 06 May 2011 15:33:38 GMT
Our next IR Meetup is at Cengage Learning on May 19, 2011.  Please RSVP here:


1. Bayesian Language Model

This talk presents a Bayesian language model, originally described by (Teh 2006), which uses
a hierarchical Pitman-Yor process to describe the distribution of n-grams in an n-gram language
model and which allows for a Bayesian back-off and smoothing strategy. The language model,
which assumes a power-law prior over the n-gram space, compares favorably with language models
based upon state-of-the-art empirical n-gram smoothing techniques. Because the background
needed to understand the model is somewhat difficult, and most of it does not appear in
(Teh 2006), that material is also presented in some detail. In particular, background
information on the Dirichlet distribution and the Dirichlet process is given. The Dirichlet
process is then related to the Pitman-Yor process, and the hierarchical Pitman-Yor process
is presented.
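
For readers who want a feel for the power-law behavior before the talk, here is a minimal
sketch of the standard Chinese restaurant construction for a single (non-hierarchical)
Pitman-Yor process. It is not taken from the talk or from (Teh 2006); the function name and
the parameter names (discount, concentration) are assumptions chosen for illustration.
Setting discount to 0 gives the Dirichlet process as a special case.

```python
import random


def pitman_yor_table_sizes(n_customers, discount, concentration, seed=0):
    """Seat customers one at a time and return the resulting table sizes.

    With k tables and n customers already seated, the next customer joins
    table j with probability (size_j - discount) / (concentration + n) and
    opens a new table with probability
    (concentration + discount * k) / (concentration + n).
    """
    if not 0.0 <= discount < 1.0:
        raise ValueError("discount must lie in [0, 1)")
    if concentration <= -discount:
        raise ValueError("concentration must exceed -discount")

    rng = random.Random(seed)
    tables = []  # tables[j] = number of customers at table j
    for n in range(n_customers):
        if not tables:
            tables.append(1)  # the first customer always opens a table
            continue
        k = len(tables)
        new_table_mass = concentration + discount * k
        r = rng.random() * (concentration + n)
        if r < new_table_mass:
            tables.append(1)
            continue
        r -= new_table_mass
        for j, size in enumerate(tables):
            r -= size - discount
            if r < 0.0:
                tables[j] += 1
                break
        else:
            tables[-1] += 1  # guard against floating-point round-off
    return tables


# Same concentration, with and without a discount (values are illustrative).
dirichlet = pitman_yor_table_sizes(2000, discount=0.0, concentration=1.0)
pitman_yor = pitman_yor_table_sizes(2000, discount=0.5, concentration=1.0)
print(len(dirichlet), len(pitman_yor))
```

With a positive discount the number of occupied tables grows much faster with the number
of customers, which is the heavy-tailed behavior that makes the process a plausible prior
for word and n-gram frequencies.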

2. Using GATE for Word Polarity in Context Classification

GATE (General Architecture for Text Engineering) is open source software for creating text
processing workflows.  Core GATE includes tools for many common text engineering tasks:
modeling and persistence of specialized data structures; measurement, evaluation, benchmarking;
visualization and editing of annotations, ontologies, parse trees, etc.; extraction of training
instances for machine learning; pluggable machine learning implementations.  This tutorial
will show how to use GATE for advanced machine learning applications.  Detecting word polarity
in context will be used as an example to show some of the GATE features.  The tutorial project
is based on the latest sentiment analysis research, specifically the work by Theresa Wilson,
Janyce Wiebe, and Paul Hoffmann, "Recognizing Contextual Polarity: An Exploration of Features
for Phrase-Level Sentiment Analysis", 2009.  An SVM classifier is trained and evaluated using
different features (words, part of speech, negations, etc.).
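
To give a rough idea of what "features" means here, the following is a small sketch of
per-token feature extraction for polarity in context. It is written in plain Python rather
than GATE, and it is not the feature set from the cited paper or the tutorial project: the
tiny lexicon, the negation list, the window size, and the function name are all assumptions
made up for illustration. In practice, feature dictionaries like these would be converted to
vectors and passed to an SVM trainer.

```python
# Toy resources: both are illustrative placeholders, not a real lexicon.
PRIOR_POLARITY = {"good": "positive", "great": "positive",
                  "bad": "negative", "boring": "negative"}
NEGATIONS = {"not", "never", "no", "n't"}


def polarity_features(tokens, index, window=3):
    """Build a feature dict for the token at `index`.

    `window` is the number of preceding tokens searched for a negation word.
    """
    word = tokens[index].lower()
    preceding = [t.lower() for t in tokens[max(0, index - window):index]]
    return {
        "word": word,
        "prior_polarity": PRIOR_POLARITY.get(word, "neutral"),
        "negated": any(t in NEGATIONS for t in preceding),
        "previous_word": preceding[-1] if preceding else "<s>",
    }


tokens = "The plot was not good".split()
features = polarity_features(tokens, tokens.index("good"))
print(features)
```

The point of the "negated" feature is that a word with positive prior polarity, such as
"good", can carry negative polarity in context ("not good"), which a word-only feature
cannot capture.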

Thank you,

Ivan Provalov
