A Summary of Research on Text Understanding
The following is a collection of summaries of technical reports relating to Text Understanding.
Alshawi & Crouch
- Bateman et al, Hovy
Report: A General Organization of Knowledge for Natural Language Processing: the PENMAN Upper Model. Focus on text generation.
- Has a refined ontology.
- Bateman now works in GDI(?) Darmstadt, Germany, on text generation.
System: QLF (Quasi Logical Form)
- Monotonic quantifier scoping and reference resolution
System: ANA-EBL (Analogical Abductive Explanation Based Learning)
- Examples provide plausible explanations/instantiations of underspecified concepts. Domain is bidding in bridge (=highly constrained)
- A Message Processing System with Object-Oriented Semantics, COLING 90, Helsinki 1990, vol. III, pp. 333-335
- Interpretation of naval reports to update the representation of the situation. Text is analyzed with a simple DCG and coded in frames in an object-oriented language, Objlog. Inference, some of it performed by demons, is then applied to update the representation of the changing situation.
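The frame-and-demon style of update described in the naval-report entry can be illustrated with a minimal sketch. This is not Objlog and not the authors' system: the Frame class, the slot names, and the flag_if_damaged rule are all hypothetical, chosen only to show how a procedure attached to a slot can fire when a parsed report changes that slot.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List

# A demon is a procedure attached to a slot; it runs whenever that slot is set.
Demon = Callable[["Frame", str, Any], None]


@dataclass
class Frame:
    """Hypothetical frame: named slots plus demons keyed by slot name."""
    name: str
    slots: Dict[str, Any] = field(default_factory=dict)
    demons: Dict[str, List[Demon]] = field(default_factory=dict)

    def when_changed(self, slot: str, demon: Demon) -> None:
        # Register a demon to be triggered by later updates to `slot`.
        self.demons.setdefault(slot, []).append(demon)

    def set(self, slot: str, value: Any) -> None:
        # Store the value, then let every demon on this slot react to it.
        self.slots[slot] = value
        for demon in self.demons.get(slot, []):
            demon(self, slot, value)


def flag_if_damaged(frame: Frame, slot: str, value: Any) -> None:
    # Hypothetical inference rule: a damaged unit is no longer operational.
    if value == "damaged":
        frame.set("operational", False)


def apply_report(frame: Frame, parsed: Dict[str, Any]) -> Frame:
    # `parsed` stands in for the output of the grammar: slot/value pairs.
    for slot, value in parsed.items():
        frame.set(slot, value)
    return frame


ship = Frame("ship-1", {"status": "ok", "operational": True})
ship.when_changed("status", flag_if_damaged)
apply_report(ship, {"status": "damaged", "position": "sector 4"})
print(ship.slots)
```

The report only mentions the status and the position; the change to the operational slot is inferred by the demon, which is the point of attaching inference to slots instead of running it in a separate pass.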
Gianetti et al
System: SEXTANT (Semantic Extraction from Text via Analyzed Networks of Terms)
- SIGIR 92: Use of Syntactic Context to Produce Term Association Lists for Text Retrieval
- ACL 92: Sextant: Exploring Unexplored Contexts for Semantic Extraction from Syntactic Analysis
Hayes, Philip & Nirenberg, Irene
- Szpak was not impressed with the scope of the author's undertaking. He also drew our attention to a number of editing errors or inconsistencies that suggest the paper was written hurriedly. Stan focused on the (unexplained) mention of verb generalization, and we all discussed the important semantic differences between the two example sentences on p. 125, 'birthday' and 'born on', which appear not to be well-chosen. LFG is a type of semantic grammar, one which incorporates semantically meaningful terms as non-terminals. Notwithstanding these criticisms, the paper discusses issues that are of interest to TANKA/MaLTe and, as it was written in 1990-1991, it behooves us to see if Hauptmann has made progress since then. Hauptmann has abandoned the line of investigation presented in the 'Syntax to Meaning' paper.
- 2nd Applied NLP Conf: Improved Portability and Parsing Through Interactive Acquisition of Semantic Information
System: SPQR (Selectional Patterns, Queries & Responses)
Appelt, Douglas et al
- Robust Processing of Real-World Natural-Language Texts
Hobbs, EACL '94: parsing as abduction (where the criterion for success is the most economical representation abduced)
Jacobs, Paul & Rau, Lisa
- Jacobs & Rau, Joining Statistics with NLP for Text Categorization
- Various papers by Jacobs or Jacobs & Rau in COLING and ACL; refer to Jacobs in Artificial Intelligence Journal, Oct 93.
- Zernik, COLING 92: Closed Yesterday and Closed Minds: Asking the Right Questions of the Corpus to Distinguish Thematic from Sentential Relations.
- Zernik, IJCAI 89: Lexicon Acquisition: Learning from Corpus by Capitalizing on Lexical Categories; addresses lexical gaps in TRUMP's processing.
- Jacobs's TRUMP system employs frame semantics and multiple processing strategies, moving to inexact semantic matching if parsing fails. Also affected by MIC's priorities.
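The idea of falling back from parsing to inexact semantic matching can be sketched as follows. This is a toy illustration only, not TRUMP's algorithm: the FRAMES table, the three-token "parser", and the word-overlap score are all assumptions made for the example.

```python
from typing import Dict, List, Optional, Set

# Hypothetical frames: each frame name maps to the words that evoke it.
FRAMES: Dict[str, Set[str]] = {
    "acquisition": {"acquire", "buy", "purchase", "takeover"},
    "lawsuit": {"sue", "court", "lawsuit", "judge"},
}


def strict_parse(tokens: List[str]) -> Optional[str]:
    """Toy stand-in for a full parse: accepts only a three-token
    subject-verb-object string whose verb evokes a known frame."""
    if len(tokens) != 3:
        return None
    for name, words in FRAMES.items():
        if tokens[1] in words:
            return name
    return None


def inexact_match(tokens: List[str], min_overlap: int = 1) -> Optional[str]:
    """Fallback: pick the frame sharing the most words with the input."""
    best, best_score = None, 0
    for name, words in FRAMES.items():
        score = len(words & set(tokens))
        if score > best_score:
            best, best_score = name, score
    return best if best_score >= min_overlap else None


def interpret(sentence: str) -> Optional[str]:
    # Try the strict strategy first; relax to inexact matching if it fails.
    tokens = sentence.lower().split()
    return strict_parse(tokens) or inexact_match(tokens)


print(interpret("Acme buy Widgets"))
print(interpret("the judge said the court would hear it"))
```

The first sentence is handled by the strict strategy; the second is too long for the toy parser and is only recovered through word overlap with the frame vocabulary.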
Lange & Wharton
- Coherence relations
- Paper in IJCAI 93
System: DICE (Discourse and Common Sense Entailment)
- "We investigate various contextual effects on text interpretation, and account for them by providing contextual constraints in a logical theory of text interpretation. On the basis of the way these constraints interact with the other knowledge sources, we draw more general conclusions about th
e role of domain-specific information, top-down and bottom-up discourse theory, and the usefulness of formalization in discourse theory."
University of Massachusetts at Amherst ??
Asher & Oberlander
- Claire Cardie in ML workshop 93 on automatic acquisition of semantic relations
- Text understanding a la MUC
Kim & Moldovan, Dan
Report: Acquisition of Semantic Patterns for Information Extraction from Corpora
Mooney & Zelle
Report: Mooney, R.J. A General Explanation-Based Learning Mechanism and Its Application to Narrative Understanding, Morgan Kaufmann, 1990.
Moulin & Rousseau
- A model of abductive inference for plan recognition (EGGS) applied to narrative interpretation (GENESIS).
- The National Building Code is the corpus. Szpak noted that the language processing this system performs isn't parsing. Peter remarked on its identification of logical/causal structure.
- Cao remarks there is no reasoning possible because only shallow knowledge is represented (facts, not rules).
Plante et al
- Palmer's MUC-influenced project Kernel. Descendant of Pundit with more reasoning capabilities.
- Originally, Pundit (Palmer, Passonneau, Hirschman, Lang) and Proteus (Grishman, at NYU) were a pair of DARPA-funded projects on message understanding. Pundit officially works on mechanical troubleshooting reports and Proteus on on-the-field military reports.
- Linguistically, Pundit's analyzer is based on Sager's Linguistic String Project grammar and has good linguistic finesse (nominalization, complex reference, rule-pruning, case-frame acquisition, etc.).
Schmidt & Putz
- Pustejovsky claims that a lexical entry is in fact a concept which cannot be used without selectivity and inference. Entries in his 'generative lexicon' are coded with four features **
Some (vague) similarities with Small & Rieger's Word Expert Parser.
Schank et al
- "Knowledge Acquisition and Representation for Document Structure Recognition: the CAROL Project"
- Text structure (organization and also typography).
- Michael Lebowitz, Janet Kolodner. A group of works based on:
- conceptual dependency (early)
- frame theory: various semantic frames such as scripts, plans, MOPs, TOPs
- and a general don't-care-about-syntax spirit
- Schank is now heading an 'Institute for the Learning Sciences' funded by Andersen Consulting and is mostly dealing with computer-assisted instruction. In the meantime his ideas have diffused, though not necessarily in their original form, and not necessarily everywhere.
- Michael Lebowitz in Schank and Riesbeck 1981, Inside Computer Understanding
- Janet Kolodner
- Simple application of conceptual dependency.
- A modified version has been in use at several banks to process formatted telexes.
- Understanding with semantic frames
- Janet Kolodner
- Application of an ontology to news story comprehension (domain: diplomatic visits)
- Coupled to the FRUMP analyzer in a system called CYFR
- In Riesbeck and Schank's Inside Case-Based Reasoning, 1989
- Artificial Intelligence and Information Science Research, Bell Communications Research, Morristown NJ
- Based on Amsler's research on structure and possible use of an electronic dictionary (was Longman's LDOCE).
- The Use of Machine Readable Dictionaries in Sublanguage Analysis
- Tipster is a MUC competitor.
- The Diderot language-independent, domain-dependent, lexically-based semantic analyzer, a component of the Tipster system. Diderot uses a pre-processed version of the Longman dictionary called Lexbase to initiate a 7-step process that identifies paragraphs discussing a preset domain. The seven steps, in their order of execution, are: lexical construction; semantic tagging; POS tagging; paragraph relevance determination; data structure merging and NP recognition; 'parsing' and reference resolution; and template formatting.
- Illustrates the application of Schankian frames to understanding plans in narratives.
- Paper in Coling 93
Last Updated: 28 Mar 1995