Re: WORD SENSE DISAMBIGUATION SURVEY: SUMMARY OF RESPONSES
To: Adam Kilgarriff <firstname.lastname@example.org>
Subject: Re: WORD SENSE DISAMBIGUATION SURVEY: SUMMARY OF RESPONSES
From: email@example.com (Mark Sanderson)
Date: Fri, 30 Jun 1995 10:04:45 +0000
Cc: firstname.lastname@example.org, email@example.com
At 11:46 am 22/6/95, Adam Kilgarriff wrote:
>Sanderson's conclusions were similar: WSD is probably only
>relevant where the query is very short, and WSD errors may actually
>degrade performance. Schutze and Pedersen, on the other hand, found
>that the performance of their system was improved by 7-14% over a
>'baseline' system by the addition of a disambiguation module.
I think I've figured out what is causing this difference in conclusions
about the utility of disambiguation to IR.
My conclusions were based on a simulation of the effects of disambiguation
on retrieval performance. One of the assumptions of the simulation was
that a word's senses were defined in some reference work such as a
dictionary or thesaurus.
Schutze and Pedersen's disambiguator, however, didn't use any reference
works to define senses. Instead, their system looked for clusters of common
contexts surrounding each word to be disambiguated.
To take a simplified example of their technique, if their system were
looking at the word 'ball' as it appears in a certain document collection,
it might notice that this word occurs

  x times in the context of the words 'Graf', 'tennis', 'serve'
  y times in the context of the words 'rugby', 'Australia', 'try'
  z times in the context of the words 'gown', 'evening', 'party'

Each one of these common contexts is regarded as a sense, and
disambiguation of the word 'ball' involves assigning each occurrence of
this word to one of these identified contexts.
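
The assignment step in the 'ball' example can be sketched roughly as
follows. This is only a toy illustration, not Schutze and Pedersen's
actual method: the three clusters are written by hand from the example
words (their system finds them from the collection itself), the cluster
labels, function names and sample sentences are all made up for the
illustration, and simple word overlap stands in for whatever similarity
measure a real system would use.

```python
from collections import Counter

# Hypothetical context clusters, hand-written from the example words.
# Assumption: a real system would induce these from the document collection.
CLUSTERS = {
    "tennis": {"graf", "tennis", "serve"},
    "rugby": {"rugby", "australia", "try"},
    "dance": {"gown", "evening", "party"},
}


def assign_sense(context_words, clusters=CLUSTERS):
    """Return the label of the cluster sharing the most words with the
    context, or None if the context overlaps with no cluster."""
    words = {w.lower() for w in context_words}
    best_label, best_overlap = None, 0
    for label, cluster in clusters.items():
        overlap = len(words & cluster)
        if overlap > best_overlap:  # ties keep the earlier cluster
            best_label, best_overlap = label, overlap
    return best_label


def sense_frequencies(occurrences, clusters=CLUSTERS):
    """Count how many occurrences fall into each cluster (the x, y, z
    of the example)."""
    return Counter(assign_sense(ctx, clusters) for ctx in occurrences)


# Made-up contexts for four occurrences of 'ball'.
occurrences = [
    ["Graf", "hit", "the", "ball", "on", "her", "serve"],
    ["the", "ball", "was", "passed", "for", "a", "try", "against", "Australia"],
    ["she", "wore", "a", "gown", "to", "the", "ball"],
    ["a", "tennis", "ball"],
]
freqs = sense_frequencies(occurrences)
```

The counts in freqs give the frequency distribution of the induced
senses over these occurrences, which is the quantity at issue in the
comparison with dictionary-defined senses.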
Without going into any great detail, the frequency distribution of the
senses I was dealing with appears to be different from the frequency
distribution of Schutze and Pedersen's senses, and this is causing the
difference in results.
I have recently run some experiments which confirm that the frequency
distributions I was using in my simulations were a reasonable approximation
of those of dictionary- or thesaurus-defined senses. I wish I had more time
to explain this here, but these experiments will appear in my thesis, if
anyone is interested.
Mark Sanderson, Fax : +44 (0)141 330 4913
Department of Computing Science, Tel : +44 (0)141 330 6292
University of Glasgow, Email: firstname.lastname@example.org
Glasgow G12 8QQ, UK URL : http://www.dcs.gla.ac.uk/~sanderso/
Good judgement comes from experience, but experience comes from bad
judgement.