Re: on the meaning of 'word sense'
>Ted Dunning writes:
>> the fact is that most humans have very great difficulty performing
>> sense disambiguation. doesn't this seriously bring into question whether the
>> task is pertinent to language processing?
>You might do me and other computational linguists a service by
>summarizing the psychological evidence you're referring to
>(and providing a reference or two to get started). What sorts of
It's obvious :-)
Here is the OED entry for COMPUTATION (to take a word from your first
sentence):
1.a. The action or process of computing, reckoning or counting; a
method or system of reckoning; arithmetic or mathematical calculations.
b. A computed number or amount; a reckoning
2. Estimation, reckoning.
Now, these seem to overlap horribly, and it becomes a matter of
judgement whether one refers to sense 1 or sense 2.
I think one paraphrase of computational linguistics might be the
following (and here we get immediately into deep water; I give my
preferred one later):
"The application of digital computers to the study of structure
extraction from language;"
Which puts `computation' clearly into sense 1. Some people will
disagree, and say more generally that computation per se has nothing to
do with computers, but rather with the more general sense of reckoning
as in the (somewhat droll) sentence:
"Of course, in one sense any 5-year-old child from any era of human
existence is a better computational linguist than any academic in the
world today since they have already internalised the structure in some
language sufficiently to be able to both produce and understand it."
Either sense 2 or sense 1 can apply to the child in this case: sense 1
if we view the brain as a computational device, sense 2 if we view the
child as the more general "reckoner" of language.
Equally, we can get a problem with LINGUIST. The OED has four entries
for this word:
1. One skilled in the use of languages; one who is skilled in the use
of other tongues besides his own.
2. A student of language; a philologist.
3. An interpreter (obs except in China).
4. One who uses his tongue freely or knows how to talk.
Now, the child referred to above might be interpreted as a sense 4
linguist, and the academic as a sense 2 linguist. Or might we be
talking about the child as a (sense 2) student of language to get a
parallel with academic computational linguists? I think we can say
that "computational linguist" used as above can be argued to give rise
to any of the four sense combinations (with possibly varying levels of
plausibility).
The point of the example sentence was to draw a parallel between the
explicit study of language on digital computers (such as what PhD
theses might be about) and the more mysterious study of language and
associated "reckoning" done by all of us. You might disagree about my
interpretations of the senses of the words used, but these
disagreements are unresolvable, I claim, by rational argument.
Moreover, the precise senses of "computation" and "linguist" are
entirely irrelevant here as well as being unresolvable; I chose the
phrase `computational linguist' here not because of some generative
meaning, but rather to make a point (which I might be able to analyse
post-hoc), and it might even have been rhetorical. Even if it were
underlyingly a generative combination of words (from parts of my brain
to which I have no introspective access), further analysis seems
>distinctions do humans find difficult to make? --Surely not all
>lexical-semantic distinctions, or we'd never understand much of
>anything.
I think we might understand most of everything.
I understand "computational linguist" in its academic sense without
ever having (seriously) considered the meaning of the parts.
Moreover, I would say it has come to mean something (in an academic
sense) which you will find in no dictionary:
Computational linguist (n)
One who produces or has published, would like to publish, or is
interested in studying or furthering work which has been published in
Computational Linguistics, the proceedings of the ACL, [or a number of
other journals and conferences]; one who intellectually associates
themselves with those who have.
This is the (de)meaning of "computational linguist" I hear at IR
conferences :-), and is even possibly empirically derivable from
knowing who is called a computational linguist, and who they refer to
in their papers and which papers have been published in these
conferences and journals.
I use this definition to predict what someone is likely to know, what
they are likely to be interested in, what words and concepts I can use
when talking to them, whether they are likely to be able to help me,
and whether I should read what they have written. It is far more
informative than productively combining word senses. Consequently,
sense disambiguation here is effectively useless, and more akin to
an etymological analysis than a useful semantic one. Of course some
will say this is just pragmatics, and an entirely contingent
definition, but what's more useful for practical tasks? Others might
say that `computational linguistics' has become an `idiom', but then
there are surely a lot of `idioms' out there...
Clearly there may be some circumstances in which sense disambiguation
does matter (e.g. the old saws about "saw" and "bank"), but quite
where and why it does is a complete mystery to me, and may possibly
vary greatly from application to application. Whether it is a
well-defined problem in its own right is an open question; I think that it
stems from reading too many dictionaries :-)
Just for fun,
When you steal from one person, it's called Plagiarism-
When you steal from many, it's research. - Wilson Mizner
Steven Finch | University of Edinburgh
Phone: +44 131 650 4656 | Language Technology Group
| Human Communication Research Centre
email: S.Finch@ed.ac.uk | 2 Buccleuch Place
| Edinburgh EH8 9LW