January 22, 1999

Yllias Chali
School of Information Technology and Engineering
University of Ottawa

Lexical Chains as an Indicator of the Text Segment Topic

Lexical cohesion is a device for creating unity in text; it arises from the semantic relationships between words. We investigate a technique that relies on a model of topic progression in a multi-paragraph text. The model is derived from lexical chains and does not require a full semantic interpretation of the text. We present two algorithms for computing these chains: one based on Roget's thesaurus and one based on the WordNet thesaurus. The original text is first segmented; lexical chaining then proceeds in three steps: selecting a set of candidate words, determining the relatedness between each candidate and the members of the existing chains, and building up the chains. Finally, we show how lexical chaining can be used to select segments for text summarization.
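
The three chaining steps can be illustrated with a minimal Python sketch. It is not the algorithm presented in the talk: the tiny TOY_THESAURUS table is a hypothetical stand-in for Roget's thesaurus or WordNet, relatedness is assumed to mean "shares at least one category", and each word is greedily attached to the first related chain. The function names are likewise illustrative only.

```python
from typing import Dict, List, Set

# Hypothetical toy thesaurus: word -> category labels.
# A real system would consult Roget's thesaurus or WordNet instead.
TOY_THESAURUS: Dict[str, Set[str]] = {
    "car": {"vehicle"},
    "truck": {"vehicle"},
    "engine": {"vehicle", "part"},
    "bank": {"finance"},
    "loan": {"finance"},
    "money": {"finance"},
}


def select_candidates(tokens: List[str]) -> List[str]:
    """Step 1: keep only words the thesaurus knows about."""
    lowered = [t.lower() for t in tokens]
    return [t for t in lowered if t in TOY_THESAURUS]


def related(a: str, b: str) -> bool:
    """Step 2 (assumed criterion): identical words, or words
    sharing at least one thesaurus category, are related."""
    return a == b or bool(TOY_THESAURUS[a] & TOY_THESAURUS[b])


def build_chains(tokens: List[str]) -> List[List[str]]:
    """Step 3: add each candidate to the first chain containing
    a related member, or start a new chain if none is found."""
    chains: List[List[str]] = []
    for word in select_candidates(tokens):
        for chain in chains:
            if any(related(word, member) for member in chain):
                chain.append(word)
                break
        else:
            chains.append([word])
    return chains


if __name__ == "__main__":
    segment = "The car engine failed so the bank approved a loan for a new truck"
    for chain in build_chains(segment.split()):
        print(chain)
```

Run on the sample sentence, the sketch groups the vehicle-related words into one chain and the finance-related words into another. A practical chainer would also need word sense disambiguation and a distance limit between chain members, both of which are omitted here.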

