Semantic problems are subdivided into lexical, syntactic and discourse types. Although both lexical and semantic lexical problems involve single words or phrases, semantic problems are syncategoric rather than specific. Semantic problems in syntax occur when a construction's syntax is correct but its sense is ill-formed or ambiguous, or vice versa. Discourse semantic cases involve an utterance's discourse context and, because utterances of this type are both syntactically and semantically well-formed, describe language complexities rather than problems.
Some of these cases can be handled by existing processing methods while others remain intractable. A few examples are not problems at all but are included simply because they give insight into the nature of the English language. Most are borrowed or adapted from other sources (Hirst, 1987; Leech, 1981; Marcus, 1978; Milne, 1986; Pereira & Warren, 1980; Perrault, 1985; Rettig, 1988; Ross, 1967; Winograd, 1983). Discussions are preceded by examples to which reference may be made by a parenthesized index.
`Sink' is a -plumbing-fixture- noun as well as a verb that means to -disappear-underwater-. Syntactically ambiguous words can in addition be semantically ambiguous within a given category. As a noun, `club' is a homonym for both -bludgeoning-weapon- and -recreational-association-; within the latter sense, it is polysemous because it can mean both -recreational-social-group- and -recreational-building-. Taken as a set, the examples suggest that syntactically or semantically ambiguous words are often common and short.
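The layering described above (syntactic category, then homonym, then polysemous sense) can be sketched as a nested lookup table. This is only an illustrative sketch: the table layout and the function name `ambiguity_profile` are assumptions, and the toy lexicon records only the senses mentioned in the text, not every sense the words have in English.

```python
# Hypothetical toy lexicon: word -> syntactic category -> homonym -> polysemous senses.
# Sense labels follow the text's hyphenated notation; the nesting is an assumption.
LEXICON = {
    "sink": {
        "noun": {"plumbing-fixture": ["plumbing-fixture"]},
        "verb": {"disappear-underwater": ["disappear-underwater"]},
    },
    "club": {
        "noun": {
            "bludgeoning-weapon": ["bludgeoning-weapon"],
            "recreational-association": [
                "recreational-social-group",
                "recreational-building",
            ],
        },
    },
}


def ambiguity_profile(word):
    """Report which kinds of ambiguity a word shows in the toy lexicon."""
    categories = LEXICON[word]
    return {
        # More than one part of speech.
        "syntactic": len(categories) > 1,
        # More than one unrelated meaning within a single part of speech.
        "homonymous": any(len(h) > 1 for h in categories.values()),
        # More than one related sense within a single homonym.
        "polysemous": any(
            len(senses) > 1 for h in categories.values() for senses in h.values()
        ),
    }


print(ambiguity_profile("sink"))
print(ambiguity_profile("club"))
```

Keeping the three levels separate lets a later processing stage ask only for the kind of disambiguation it can actually perform.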
These words denote a man or a role (1), or a city or province (2). Denotational ambiguity can affect quantification; a baker is a general role but Baker is a specific person.
Respectively, these particles denote a name or a month, a degree or a state, and a province or a period of historical time. Abbreviations differ from words in being able to incorporate punctuation and capitalization which can help distinguish between their different meanings. These aids however may be missing, partial or nonstandard.
The first two words both appear to be nominalizations but no verb form of `aggression' exists to match the verb `oppress'. The lack of a word to express a concept is called a lexical gap. Lexical gaps are generally understood to apply to easily understood concepts having collateral concepts which are realized in words.
People routinely encounter words which they don't know or which are meaningless.
Idioms come in all sizes--words, phrases and clauses. They can include coinages (1, 4) and proper names (3) and may have relatively short lifespans (4, 5). Idioms that survive long enough pass into the language permanently (6).
In addition to idioms based on nouns and verbs or those with substantial concrete sense, there exists a class of subidiomatic expressions which act as facilitators of expression or linguistic shortcuts. These expressions are usually based on modifiers (3) and may have co-occurring parts of speech (4). A subidiom's sense is limited and is homologous to the sense of the modifier or the part with which it occurs.
This well-formed sentence is difficult to understand because encountering a subordinate clause forces a reader to suspend his partial understanding of the entire sentence. The possibility of confusing contexts arises as soon as two are stacked.
Garden path sentences employ words which can be understood as more than one part of speech. These categorically-ambiguous words appear both within clauses (3 & 5) and between clauses (1, 2 & 4). The punctuation and relative pronoun introducers which would unambiguously identify a clausal reading are elided. Taking the examples in order, `raced' can act as the head of an adjective phrase and as a past participle, `tunes' and `man' can be noun or verb, `cotton' noun or adjective, and `have' main or auxiliary verb in imperative or interrogative constructions.
Garden paths often contain intervening phrases or clauses which postpone a reader's revision of his misinterpretation and therefore encourage misunderstanding. Removing an intervening construction can sometimes eliminate an alternative reading and always makes the main clause easier to comprehend. When this is performed on the four examples of this sort, "Have the students take the supplementary", "The man tunes pianos" and "The cotton grows in Mississippi" immediately become unambiguous. "The horse raced fell" becomes ungrammatical; only the presence of the interjected prepositional phrase `past the barn door' allows the adjective to be postposed legitimately. After paraphrasing, it becomes "The raced horse fell", which is unambiguous.
There is no formal difficulty in processing these sentences but in practice they become increasingly onerous to handle as they go on. It appears that we use sentence terminal punctuation as a signal to chunk our understanding. When a sentence ends, we know that we can stop watching out for an entire class of possible syntactic relationships and redeploy our attention to the next series of words.
Only context can tell whether this phrase means `old men and old women', or `old men and women of unspecified age'.
The first sentence has three readings which depend on how the last prepositional phrase is interpreted. If `with a telescope' is read as accompaniment, it can modify `him' or `hill' and produce "he and a telescope were seen on the hill" or "he was seen on the hill which has a telescope on it". If the phrase is taken as identifying the instrument used for seeing, it produces "he was seen on the hill by use of a telescope".
The second and third sentences show how ambiguity arises when a sentence contains more than one phrase which can play more than one available role. Selecting one role for the first phrase automatically assigns the second to the other phrase. Thus either sentence can be interpreted as saying that a paper is being written by someone sitting at a table, or that a table is being painted or sanded on top of spread paper. `At' can be substituted for `on' anywhere.
Depending upon how the scope of the conjunctions is interpreted, this phrase can mean `either men and women, or men and boys', or `either men and women, or boys'.
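The two scope readings can be made explicit by bracketing the coordination as a small tree and expanding it into the alternative groups it allows. This is a minimal sketch: the example phrase is not reproduced here, so the wording `men and women or boys' is inferred from the two glosses above, and the tuple encoding and the function name `alternatives` are assumptions.

```python
def alternatives(tree):
    """Expand a coordination tree into the list of alternative groups it allows.

    A tree is either a word (str) or a tuple (op, left, right), op in {"and", "or"}.
    """
    if isinstance(tree, str):
        return [[tree]]
    op, left, right = tree
    if op == "or":
        # Either side alone satisfies a disjunction.
        return alternatives(left) + alternatives(right)
    # "and": combine every alternative on the left with every one on the right.
    return [l + r for l in alternatives(left) for r in alternatives(right)]


# Assumed wording of the phrase, with its two possible bracketings.
narrow_or = ("and", "men", ("or", "women", "boys"))  # men and (women or boys)
wide_or = ("or", ("and", "men", "women"), "boys")    # (men and women) or boys

print(alternatives(narrow_or))  # [['men', 'women'], ['men', 'boys']]
print(alternatives(wide_or))    # [['men', 'women'], ['boys']]
```

The surface string is identical in both cases; only the bracketing, which the string does not show, separates the readings.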
The indirect object is inherently semantically ambiguous. It almost always expresses senses of -directed-towards- and -on-behalf-of- and is paraphrased as `to' +indobj or `for' +indobj. Although context within a sentence usually establishes what sense is intended, this example proves that it is possible to make ambiguous statements; it is not clear whether the letter was written to Mary or on her behalf.
The second sentence is an example of a sentence which cannot be paraphrased in either of the two standard ways. The `to' reading makes no sense. "We envy you for your wealth" makes sense but is not quite correct. Envying someone for wealth is not the same as wishing to possess the wealth itself, which is what the second example really means.
This suggests either that I have a picture belonging to John, or that I have a picture of him. Ambiguity arises if a form expresses two different senses by virtue of contraction, and disappears when the contraction is expanded. Thus "I have a picture of John's" and "I have a picture of John" each unambiguously express one of the example's readings.
This is essentially the same problem as lexical ambiguity in syntactic case. However these examples go beyond identifying the problem to demonstrate it. Each `girl moves' collocation in the first sentence is composed of different parts of speech; ?adj+noun, noun+noun, noun+verb. `Cooking' acts as a present participle and an adjective (2). The third example is a classic with five readings, only four of which I can identify: flies should be timed a certain way, a type of fly likes arrows, time passes quickly, and ?it's time that flies liked arrows.
A literal reading of either sentence asks the question: than he does what? Ellipsis within a sentence can frequently be identified by the presence of coordinating conjunctions or punctuation between an elliptical clause and its referent. Elliptical reference between sentences frequently relies on structural symmetry between the two related expressions.
Sentences using `respectively' explicitly or implicitly coordinate on a cross-serial basis (rather than on a center-embedding one). In cross-serial coordination, each conjunct in a series is linked with its counterpart in the same position in another series. The construction is confusing when the number of conjuncts does not agree. More than two series can be coordinated; (1) is still easily understood if `in 1989 and 1990' is appended.
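Cross-serial linking amounts to pairing items by position across parallel lists, which can be sketched as follows. The names and towns are hypothetical stand-ins because example (1) is not reproduced here; only the years come from the text. The function names are assumptions, and the center-embedded version is included purely for contrast.

```python
def pair_cross_serial(*series):
    """Link each conjunct with its counterpart in the same position of every series."""
    lengths = {len(s) for s in series}
    if len(lengths) != 1:
        # The construction is confusing when the number of conjuncts does not agree.
        raise ValueError(f"conjunct counts differ: {sorted(lengths)}")
    return list(zip(*series))


def pair_center_embedded(first, second):
    """For contrast: nested linking pairs the first item with the last, and so on."""
    if len(first) != len(second):
        raise ValueError("conjunct counts differ")
    return list(zip(first, reversed(second)))


people = ["Ann", "Bob"]    # hypothetical conjuncts
towns = ["Leeds", "York"]  # hypothetical conjuncts
years = ["1989", "1990"]   # the appended series mentioned in the text

print(pair_cross_serial(people, towns))
print(pair_cross_serial(people, towns, years))
print(pair_center_embedded(people, towns))
```

Adding a third series changes nothing in the pairing rule, which is consistent with the observation that the sentence stays easy to understand when a further series is appended.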
Although anaphora usually involve references to nouns (`this'), they can also apply to adjectives (`such') or adverbs (`so'). Reference in the next example can only be determined by pragmatic knowledge of who was hungry. In terms of the last example, which hat was transferred in each of the following sentences: 1) "The girl gave hers to the woman"; 2) "The girl gave hers back to the woman"; 3) "The girl gave it back to the woman"?
Phrases which contain more than one noun in their head can be divided into two types. In multiple word noun phrases, the last noun supplies the dominant elements of the phrase's sense and each preceding noun specifies and extends it (1, 2, 4). Merged noun compounds are concatenations of nouns which give rise to a new sense largely unrelated to those of their components (3, 5).
This slightly unusual construction emphasizes the opening phrase. There are no problems in processing, particularly since the presence of structural punctuation signals the likelihood of distance dependency (between `had' and `for').
Ungrammatical utterances will always be with us. They are found particularly in speech, where the parallel visual and aural modes act to integrate the entire communication and to substitute for broken phraseology and incomplete expression. Ill-formed statements can be divided into ones which are completely but incorrectly formed and those which are much less complete than the usual elliptical statement.
Nominals become more awkward as the construction in which they appear becomes more complex. Ross attributes this awkwardness to the `nouniness' which a nominalized form acquires as a result of its transformation (Ross, 1967 in Winograd, 1983).
Transforming the first sentence into the second poses no problem; however, executing a parallel transformation on the third statement generates an ill-formed sentence. The third sentence must be paraphrased to merge the subordinate clause into the main clause, generating "Barbara was pleased by Merry's book", before it can be transformed into an equivalent interrogative construction "Who was pleased by Merry's book?".
The first sentence is legitimate but a similar construction incorporating a reflexive relationship is ill-formed. The remaining sentences are paraphrases of the second statement that move toward correct syntax while retaining the reflexive element and the same sense. These examples seem to highlight an unmotivated complexity of English syntax.
To establish a context for evaluating these unusual verbs, consider the first two non-ergative sentences which are respectively transitive and intransitive. Truncation produces one satisfactory outcome, "The man is laughing", and a dubious one, ?"The man is opening" (unless a context has previously been established).
Now consider the next three sentences. Although the surface syntax of these examples suggests that `cook' is bitransitive, taking or leaving an object equally well, in reality the verb seems to have an inherent transitivity. Performing the same truncation on (3) to produce (4) changes its sense; (4) is ambiguous, its different senses being -cooking-something- and -being-cooked-. This change in meaning did not occur with the first two truncations.
Verbs of this sort are called ergative. They seem to have such a great need for an object that if none is supplied syntactically, the action can be interpreted as rebounding on the agent. An ergative verb can meaningfully fill the two frames "Something is <prpart>" and "Something is being <papart>"; it is unusual for the active and passive voices of a statement to express the same sense. Ergative sentences like (4) are therefore often ambiguous. Some instances require discourse context, while others can be disambiguated using only the features of the constituents. There is no ambiguity about (5) because -cooking-something- requires a human agent and `meat' is not a human being.
`Taste', `smell' and other verbs describing perceptual processes are ergative. The mental sphere, in particular the emotional, is rife with instances of the sort of situation that ergative verbs express. Consider `fear' and `frighten'. Although both statements in the pair "He frightened me", "I feared him" are active, they refer to one act, not two. The factors operating here seem in some way to be akin to two of the types of speech act, locutionary ("I feared him") and perlocutionary ("he frightened me"), generalized from cases of deliberate action to all acts.
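Disambiguation from constituent features alone can be sketched as a selectional check on the subject. This is a hedged illustration only: the feature table `HUMAN_NOUNS`, the function name, and the test subjects are assumptions, and the single rule encoded is the one stated above, that -cooking-something- requires a human agent.

```python
# Hypothetical feature table; a real system would draw this from its lexicon.
HUMAN_NOUNS = {"man", "woman", "chef"}


def readings_of_ergative(subject):
    """Return the senses available to "<subject> is cooking" with no object."""
    readings = []
    if subject in HUMAN_NOUNS:
        # Only a human agent can be understood as cooking something.
        readings.append("cooking-something")
    # Anything, human or not, can be understood as being cooked.
    readings.append("being-cooked")
    return readings


print(readings_of_ergative("man"))   # two readings: ambiguous
print(readings_of_ergative("meat"))  # one reading: unambiguous
```

When the check leaves more than one reading, as it does for a human subject, the remaining choice has to come from discourse context.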
The first set of examples use ergative verbs to illustrate homomorphy and the second set show that it is something other than ergation. (1) and (3) can be paraphrased without ambiguity as "Planes which are falling can be dangerous" and "<To be> buying planes can be dangerous". The second sentence can be read both ways: "Planes which are flying can be dangerous" and "<to be> flying planes can be dangerous". It seems to lack a constituent which would distinguish between the two senses. The paraphrases suggest that the confusion arises from the different senses of ergative verbs; "Planes which are being flown can be dangerous" is a near paraphrase of "<To be> flying planes can be dangerous".
The same distinction appears in the second example without the presence of ambiguity. A paraphrase of (4) makes `Warren' the direct object in "<it> is easy to please Warren." while in (5) `Warren' is a complemented subject because `easy to please' is an idiom meaning `easily pleased'. Are there any other constructions which employ complete transformations like from adj+inf to adv+part?
In the first example the correct reference for `he' must be found across three intervening sentences. Sentences frequently make references to neighboring or more remote statements and these references can easily be misunderstood. In the second example long range reference occurs within a sentence across a 24-word interjection between `who' and `did it'.
References can also occur between noun and noun; we are expected to recognize entities which have been previously introduced into a narrative. Common noun references are harder to handle than pronoun ones because the latter obviously require a referent whereas the former may be introducing a new actor. The problem is compounded when the reference is not direct but through a synonym.
All comparisons, whether comparative (1) or superlative (4), need to have their context specified. In the comparative case, this means identifying the particular individual (2) or set of individuals to which the subject is being compared (3). When the comparative evaluation is made against a set, the resulting statement is easily paraphrased into a superlative comparison. Thus (3) is a paraphrase of (4).
Each word in the first sentence executes a traditional syntactic function; `on' heads a prepositional phrase that identifies where the rotation takes place. In the second sentence, `turn on' is an idiom in which the particle `on' indicates the direction of rotation. (4) and (5) show how the same sentence can be modified to select between its literal and idiomatic senses.
There is a dilemma here. A parser which treats (2) as a verb+prep will misunderstand it, while always taking the longest possible string from input (i.e. idiomatic use) means that (1) will never be correctly recognized.
Sometimes the name of a role is intended to identify the individual who currently fills it, and sometimes the role itself. This parallels the distinction between a term's intension and its extension.
The first example illustrates the problem; it is uncertain whether the sought man is a particular individual or simply any man at all. The second and third examples demonstrate that both definite and indefinite quantifiers can experience this kind of ambiguity. `A cat' is universal and `the tamarind' may be; `a man' however must be interpreted specifically because no one can hold all men. The fourth and fifth sentences show that quantificational ambiguity can arise in tensed verbs. Given that I am within a room, a man's seeing me must precede his leaving the room. Therefore `sees' can only be understood as the generic sees-on-a-regular-basis rather than the specific single-act-of-sight. Because both verbs in the fifth sentence are past tense, no temporal order is established and `saw' can refer to either quantification.
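The dilemma can be made concrete by comparing the two committed strategies with one that postpones the choice. The sketch below is illustrative only: the tables and function names are assumptions, only `turn on' is taken from the text, and keeping both readings is one common way around the problem rather than a method proposed by the sources cited here.

```python
IDIOMS = {("turn", "on")}          # multi-word entries; only `turn on' is from the text
PREPOSITIONS = {"on", "in", "at"}  # illustrative subset


def longest_match(verb, particle):
    """Always prefer the longest lexical entry, i.e. the idiom when one exists."""
    return "idiom" if (verb, particle) in IDIOMS else "verb+prep"


def literal_only(verb, particle):
    """Always treat the particle as the head of a prepositional phrase."""
    return "verb+prep"


def both_readings(verb, particle):
    """Keep every reading alive so that later context can choose between them."""
    readings = []
    if (verb, particle) in IDIOMS:
        readings.append("idiom")
    if particle in PREPOSITIONS:
        readings.append("verb+prep")
    return readings


print(longest_match("turn", "on"))  # commits to the idiom, so the literal use is lost
print(literal_only("turn", "on"))   # commits to verb+prep, so the idiom is lost
print(both_readings("turn", "on"))  # defers the decision
```

Deferring the decision avoids both failures but multiplies the number of analyses that later stages must carry.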
No general method of resolving quantificational ambiguity has been discovered. However if a quantified phrase is highly specified by definition or through context, it is reasonable to assume that reference is to a particular entity. Thus in the absence of contrary context (such as reading the classified ads) "I am looking for a red 1977 Mustang" suggests that the speaker seeks a particular car known to himself. Similarly "The man who saw me for three weeks at 9 o'clock left the room" suggests that the seeing is an act unrelated to the leaving.
Words can be both polysemous and homonymous. `Right' has homonymous senses of -righteous- and -right-handed-. Each of these senses is itself polysemous. When employed in "the right thing to do" and "do right", -righteous- has homologs of -correct-under-the-circumstances- and -absolutely-correct-. The homologs of -right-handed- are demonstrated in `on the right' and `my right hand'. They are -somewhere-to-the-right- and -on-the-right-side-of-my-body-.
`Right' can also occupy different syntactic categories: "Might makes right" (noun), "I right the bottle" (verb), `the right stuff' (adjective) and "Do right" (adverb).
A subset of ambiguous words carry pairs of senses which are antonyms. This phenomenon is known by a variety of names including autoantonymy. See discussion on the LINGUIST list, early 1995 onwards.
These sentences introduce ambiguity because `every' means both the universal quantifier `all' and the existential quantifier `some'. As a result the first example leaves it unclear whether the trains have to visit all cities or just one of them to be listed. The next examples show that negation can interact with quantifier ambiguity. `Not everyone' is existentially `not some one' and therefore someone, and universally `not all ones' and therefore no one.
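The difference between the two readings of the first example can be shown by running both against the same data. The sketch assumes a hypothetical timetable, since the example sentence is not reproduced here; the route table, city names and function names are all illustrative.

```python
def listed_universal(routes, cities):
    """Trains that visit all of the named cities (the `all' reading)."""
    return sorted(t for t, stops in routes.items() if all(c in stops for c in cities))


def listed_existential(routes, cities):
    """Trains that visit at least one of the named cities (the `some' reading)."""
    return sorted(t for t, stops in routes.items() if any(c in stops for c in cities))


# Hypothetical timetable: train -> set of cities it visits.
routes = {
    "T1": {"Boston", "Albany"},
    "T2": {"Boston"},
    "T3": {"Chicago"},
}
cities = ["Boston", "Albany"]

print(listed_universal(routes, cities))    # ['T1']
print(listed_existential(routes, cities))  # ['T1', 'T2']
```

The same request returns different lists under the two readings, which is why the ambiguity matters to a system answering such a query.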
In each of these examples structure has been subordinated to communication; each is readily understood although it infringes on the rules of grammar. The first example would be more correctly stated as "Everyone loves their wives", which would introduce a syntactic ambiguity about the number of wives each person has.
The reference for `it' in (2) is an elided `his win' phrase or `that he would win' clause. The presence of either makes an awkward construction: "Although it was predicted that he would win, Goldwater didn't (win)"; "Although his win was predicted, Goldwater didn't (win)". The third case is similar to the second. It can be paraphrased as "If a man deserves a prize and shows that he wants it, he will receive the prize" but this fails to express the need to show desire emphasized in (3).
In the first example (1) emphasizes the scarcity of voracious readers while its passive transformation (2) emphasizes the plenitude of poorly-read books. A sentence's first phrase seems to enjoy a position of importance. (1) and (2) make different statements because the focus of emphasis shifts from `books' to `readers' when voice is transformed from passive to active. Although the problem disappears when quantification is shifted to the neutral `some' in the second example (3, 4), `some' in this case appears to be a qualifier of identity rather than a quantifier of measure because the problem persists when the quantifiers are universal (5, 6).
(2) cannot be inferred from (1) because the first sentence does not explicitly say that Ross knows the breed of Nadia's dog. An individual's personal knowledge is necessarily circumscribed and it cannot be assumed that objective facts are subjectively known by everyone. Reference to subjective knowledge occurs in words like `believe', `say', `think' that report indirect discourse and mental activity.
Both of the first two examples contain ambiguous elements. `Ready to eat' can be interpreted as an idiom meaning -fully-prepared- or as a construction meaning -hungry-. In the second example, modifying `train' with `Boston' leaves it unclear whether the train is going to or has come from that city. These ambiguities arise from conventions which we routinely assume from the context when we read either statement. However the first example should not really be ambiguous at all. Its -fully-prepared- sense can only be understood by misreading it; to make that statement in correct grammar would be "The chickens are ready to be eaten". Advertising practice has made `ready to eat' mean ready-to-be-eaten.
The -to- or -from- confusion in the second example is based on a similar assumption, but one that has not been formally taught through the media. `Boston' in this case could also mean -made-in-Boston-, -named-after-Boston- or even -owned-by-the-Boston-subway-corporation-. The directional senses are selected as the ones to be confused about because movement and travel is the first thing most people associate with trains. Disambiguation comes (if it does) through knowledge of the timetable.
The five `human' examples all use the adjective to communicate different relationships between its object and human beings: offspring-of-humans, race-composed-of-humans, consumption-by-humans, experimentation-on-humans and dog-with-human-traits. Similarly, although both words in the last example are individually unambiguous, when collocated it is unclear whether they mean meat-to-feed-to-the-dog or flesh-of-the-dog.
The twenty copulas are a special case of relationship. They are used in a statement to allow an entity to be qualified in a potentially elaborate manner and not to express a formal relationship. Notwithstanding this, these verbs do qualify their complement's existence.
A reasonably exhaustive list of copulative verbs is: subjective complements: `be', `become', `grow', `get', `feel', `seem', `appear', `taste', `sound', `smell', `look', `remain', `act', `go', `turn', `make'; objective complements: `make', `prove', `imagine', `consider'.
The referents for `they' and `it' are supplied by convention and not context; none will be found in the narrative.
This example has elements of garden pathing and idiom, but the construction it is based on may be unique. In the first sentence `oscilloscope' is the direct object; in the second, it is an instrument and `radio' is the direct object. Winograd (1983) believes different senses of `used to' are employed. In the first sentence the phrase, which he represents as `useta', expresses past tense idiomatically; in the second, instrumentality.
We understand that `those three boys' refers to `kids' because the two noun phrases are highly synonymous (though not exactly so; little girls can be kids too). Less synonymous pairs and reference through negated antonyms are harder to understand.
Synonymy and long-range reference together are a recipe for expensive processing in cases of reference. If `the man' or `Mr. Smith' is substituted for `he' in the last sentence of the second example, the previous four sentences or 15 words would need to be tested to catch the intended reference.
The first reported statement cannot be understood without context. In the second we infer that the car's windshield has been made opaque by cracks and stars, and that although the tires are not flat, part of the fender or frame is pressing against one of them.
Spoken words can identify things in the scene which are not mentioned in the utterance. The same effect can be accomplished in a written narrative only if the identified item is explicitly or implicitly introduced in the vicinity of the dialogue.
The content of these three sentences implies three speakers rather than two individuals taking turns, which would otherwise have been assumed.
Written English does not identify speakers of directly-quoted dialogue once they have been introduced. Quotation marks are used to delimit utterances and speakers are assumed to take turns. This convention becomes problematic when a conversation involves more than two people; we then must rely on our models of the participants' interests, motivations and perspectives to determine attributions. There is a distinct risk of misinterpretation in a dialogue among three or more people.
Speech acts can be divided into those in which the locutionary sense is incongruous with the context and those in which it is congruous. The first case, illustrated by (1), covers what is commonly known by Austin's (1962) term. The locutor's interest in the auditor's physical capacity is irrelevant to his current activity of eating.
The second case is more contextual. In (2) all parties know that the woman is expressing a preference for more than a gift. Although the second utterance is superficially similar to a metonymy (in that the speech act consists of the flowers being understood to stand for the man who is offering them), it differs in that its sense is meant literally; no doubt the woman really expects to and does receive the flowers. Her utterance is therefore not solely figurative.
`Confidently' applies to the act of walking within the statement and elaborates its sense; `apparently' applies to the entire statement and qualifies the proposition it communicates. The former is a de re usage and the latter de dicto. A frame to distinguish between these two cases translates candidate words to their cognate adjective form: "It is apparent that ...". Any statement that can be sensibly paraphrased in this way is de dicto.
Modal verbs are a closed set which qualify the possibility, capability and necessity of a specific act or class of acts occurring (1). In addition, modals qualify the volition, intention, prediction, permission and obligation of an animate agent to perform an act (2).
The subjunctive mood is used to express statements which have an imaginary or hypothetical existence (3).
In the absence of explicit connectors like `because', `however' and `since' (colloquial) we must employ commonsense knowledge to infer that causal consequence accompanies temporal sequence. The first example is open to either a necessary or an accidental interpretation. In my view it leans slightly towards causality because it goes to the trouble of mentioning the events. The second example is clearly to be understood causally. The third and fourth examples involve events, and a state and an event respectively. Readers are not likely to think that the apple's fall in either case proceeded from Tom's action or state.
After reading the above sentences, it is impossible to know which of the family members booed, left, began to talk among themselves or continued to watch the play. Language utterances routinely define groups and perform operations on them: 1) conjunction (`the gang and I'), 2) disjunction ("It's them or us"), 3) member and 4) subset identification and selection ("the whole family was among the audience ...", `one of the staff'). These operations result in the specification of groups or individuals whose identities are unknown. Language users trust that they will be told the composition of a group when this is important.
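The way group operations leave identities open can be sketched with ordinary set operations. The sketch is illustrative only: the group members are hypothetical, the function names are assumptions, and enumerating candidates is simply one way to show how many identities remain possible.

```python
from itertools import combinations


def one_of(group):
    """`one of the staff': any single member could be the one meant."""
    return [frozenset([m]) for m in sorted(group)]


def some_of(group):
    """An unspecified subset: every non-empty subset is a candidate."""
    members = sorted(group)
    return [
        frozenset(c)
        for size in range(1, len(members) + 1)
        for c in combinations(members, size)
    ]


family = {"mother", "father", "child"}  # hypothetical membership
gang = {"Al", "Bea"}                    # hypothetical membership

gang_and_i = gang | {"I"}               # conjunction behaves like set union

print(len(one_of(family)))   # 3 candidate individuals
print(len(some_of(family)))  # 7 candidate subsets
print(sorted(gang_and_i))
```

Even a three-member family yields seven candidate subsets for who booed, which illustrates why readers wait to be told the composition of a group instead of trying to work it out.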