A sequential approach to document description

Seminar by Antoine Doucet, University of Caen Lower Normandy (France), March 14, 2008, 10:30-11:30, SI-007.


The talk will present a language-independent technique for the extraction and selection of multiword units. The main idea is to focus on the sequential nature of text and to treat multiword unit extraction as an extension of sequential pattern mining. If time permits, the use of phrases in information retrieval will also be discussed.


Antoine Doucet has been a tenure-track Associate Professor at the University of Caen Lower Normandy (France) since Fall 2007. He holds a Ph.D. in computer science from the University of Helsinki (Finland). His research interests are information retrieval (in particular book search and structured information retrieval), natural language processing, and text data mining, with a focus on language-independent techniques.

