A Model of Semantic Processing

Edward Kenschaft
LING896 Minor Paper

Abstract

These are just some preliminary notes.  I have not yet dug into the related literature, which is considerable.

Semantic theory is still in its early childhood, in my opinion.  One reason for this is that it is difficult to formulate experiments to test semantic processing.  Another is that different subdisciplines within linguistics – theoretical, neuro- and psycholinguistic, computational – have not been particularly effective at learning from one another.

If semantics is in its childhood, computational semantics is in its infancy.  Although many researchers are investigating areas of computational semantics (???), their experiments are typically in one narrowly restricted area, and mostly ad hoc.

This project is an attempt to establish a model of semantic processing which is empirically plausible, computationally tractable, and learnable with minimal data.  While the scope will necessarily start out small, the intent is to design the model such that it can be incrementally enhanced as new phenomena are addressed.

The end result of the project will be a workable system, which we will evaluate for its utility in natural language processing (NLP) applications such as information retrieval (IR) and machine translation (MT).  We also expect the model to offer insights into how semantic processing is actually performed by humans.

Introduction

Nutshell

Our model assumes that the speaker and listener share, at least to a large extent, an ontology and a lexicon.  The meaning of a sentence is built up incrementally from the meanings of its constituents, both words and constructions, interpreted in context.

Research Outline

I'm considering the following broad objectives.

Theory

Psycholinguistics

We hold these truths to be self-evident, and empirically demonstrable.

  1. Language is learnable by all but the most cognitively impaired humans.
    1. The principles and practice of language use must be derivable from the innate state of the human organism in combination with readily available environmental input.
  2. Language is inherently underspecified.
    1. Language would be intractably cumbersome if each speaker had to enumerate every presupposition and implicature, and resolve every potential cause of vagueness or ambiguity.
  3. Language is useful for communication, with remarkably little misunderstanding, and without apparent effort in most cases.
    1. Interlocutors must be able to establish sufficient common ground to mutually resolve underspecificity.
    2. We must explain not only why communication is possible at all, but also why misunderstandings occur.
  4. Human language processing is performed in real time.
    1. The human brain is massively parallel.
  5. Language is left-to-right, both for production and comprehension.
    1. Language is basically spoken or signed, and only derivatively written.
    2. Spoken/signed language is linear in time.
  6. Human language processing is incremental.
    1. A speaker begins to speak as soon as the first part of the utterance is planned, and continues to plan later parts of the utterance.
    2. A listener processes immediately whatever part of the utterance has been heard, and attempts to predict the parts that are coming.
  7. All(???) content words of natural language depend on the context of discourse for their interpretation.

We will design our model to mimic these characteristics of human language as closely as possible.

Syntax

As much as possible, we will try to explain the behavior of language according to semantic principles that are otherwise required for reasons of interpretation.  For situations where syntactic restrictions seem unavoidable (e.g. Binding), we will simply reflect the facts as we understand them for English, without committing ourselves to any particular theoretical framework.

A syntactic theory would be needed to explain what possible constructions are available to the language learner.  We will not pursue this here.

Semantics

The semantic framework of our model assumes that the meaning of an utterance can be constructed compositionally from the meaning of its constituents.  These constituents include at least words and constructions (Goldberg 1995).  The mode of composition is uniformly conjunction (Pietroski 2005).  Most, if not all, of language must be understood in the context of discourse.

Our treatment of quantification is novel according to the literature I have read so far, although I suspect I will find other authors who have had the same idea.

Previous Work

Smaranda Muresan

(Muresan et al. 2005) Lexicalized Well-Founded Grammars (LWFG). 

Plan

Model

We build on the components introduced in (Muresan et al. 2005).

Ontology

We take as a given that the speaker and listener have overlapping ontologies of real-world concepts, independent of lexical realizations.  We will not concern ourselves at this point with where these concepts come from, or how they are defined.  So, for instance, we will assume that the concept sky is available to the speaker, without requiring a precise definition.

The term concept will be used to refer to an element of an ontology.  A usable ontology must contain at least the elements person and time (cf. Discourse Context).

We will include grammatical markers in our ontology, including such concepts as aspect, tense, and agreement.

For the first phase of our project, we will assume that the speaker and listener share the precise same ontology, and that this ontology is predefined and fixed.  These assumptions will be progressively relaxed in later phases.

Lexicon

We take as a given that the speaker and listener have overlapping lexicons, without concerning ourselves, for the moment, with where these come from.  Each lexeme contains (at least) the following fields.

(#015) lexeme
form      basic orthographic representation
meaning   concept

Whereas an ontology could conceivably function crosslingually, a lexicon is necessarily tied to the orthography of a specific language.

We will be most concerned with lexemes at the word or morpheme level, although acknowledging the possibility of multi-word lexemes.  We will also include constructions in the lexicon.
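As a minimal sketch of the lexeme record in (#015), assuming Python dataclasses as the representation: the class names Concept and Lexeme, the toy entries, and the lookup helper are illustrative assumptions, not part of the model itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Concept:
    """An element of the ontology, e.g. 'cow' or 'person'."""
    name: str


@dataclass(frozen=True)
class Lexeme:
    """A lexicon entry pairing a language-specific form with a concept."""
    form: str         # basic orthographic representation
    meaning: Concept  # language-independent concept


# Toy ontology and English lexicon (illustrative entries only).
ONTOLOGY = {name: Concept(name) for name in ("brown", "cow", "person", "time")}
LEXICON = {
    "brown": Lexeme("brown", ONTOLOGY["brown"]),
    "cow": Lexeme("cow", ONTOLOGY["cow"]),
}


def lookup(form: str) -> Lexeme:
    """Return the lexeme for an orthographic form, or raise KeyError."""
    return LEXICON[form]
```

Keeping Concept separate from Lexeme reflects the point that the ontology could be shared across languages while each lexicon is tied to one orthography.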

Index

An index is a placeholder for a participant in the discourse.  Individual predications will always be made of one or more indices.

Relation

A relation is a predicate of one or more indices and/or concepts.  The most common relation, isa, relates an index to a concept (#009).

(#009) isa relation
isa(i : index, c : concept) concept c is predicated of index i

Whenever a lexeme is encountered, it generates an isa relation.  If the index does not already exist, a new one is generated.  We represent this in a table of lexemes and resulting explicatures, where each row represents the next step in the derivation (#010).

(#010) brown cow
    Lexeme   Explicature      Semantic Structure
a.  brown    i:isa(i,brown)   i:isa(i,brown)
b.  cow      isa(i,cow)       i:isa(i,brown) & isa(i,cow)

Since conceptual predicates are encoded as participants in an isa relation, the number of relations in our model will be quite small.
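The derivation in (#010) can be sketched as follows.  This is only an illustration under simplifying assumptions: relations are plain strings, all lexemes in the phrase share a single index named i, and the function name derive is hypothetical.

```python
def derive(lexemes, index="i"):
    """Return one (lexeme, explicature, semantic structure) row per lexeme.

    Assumes every lexeme predicates a concept of the same index.
    """
    structure = []  # conjuncts accumulated so far
    steps = []
    for lexeme in lexemes:
        relation = f"isa({index},{lexeme})"
        # The first relation introduces the index, written with an "i:" prefix.
        explicature = relation if structure else f"{index}:{relation}"
        structure.append(explicature)
        steps.append((lexeme, explicature, " & ".join(structure)))
    return steps


for row in derive(["brown", "cow"]):
    print(row)
```

Each row extends the previous semantic structure by conjunction, which is the only mode of composition the model assumes.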

I'm wondering if the isa relation should be replaced with states and substates, much like events and subevents.

Quantification

Contrary to long tradition, we do not treat quantifiers as in formal logic.  For starters, we recognize quantifiers used in natural language, for example those in (#011).

(#011) Quantifiers
Quantifier   Examples
exist        a brown cow
             some men
all          all good men
             everyone
def          the students
             the first watermelon-flavored cupcake
             Bush
kind         I like broccoli.
wh           Which umbrella is yours?
most         Most students drink coffee.
several      Several large men entered the room.

I expect we will eventually end up breaking exist into three distinct quantifiers: exist, sm, and some (Milsark 1977).

Explain downward-entailing quantifiers???

Rather than using pseudo-logical notation, we introduce quantifiers in relation to an index, using the relation q (#012).

(#012) q relation
q(i : index, q : quantifier) index i is quantified by q

Some examples are in (#013-#017).

(#013) a brown cow
    Lexeme   Explicature    Semantic Structure
a.  a        i:q(i,exist)   i:q(i,exist)
b.  brown    isa(i,brown)   i:q(i,exist) & isa(i,brown)
c.  cow      isa(i,cow)     i:q(i,exist) & isa(i,brown) & isa(i,cow)

(#014) several large men
    Lexeme    Explicature      Semantic Structure
a.  several   i:q(i,several)   i:q(i,several)
b.  large     isa(i,large)     i:q(i,several) & isa(i,large)
c.  men       isa(i,men)       i:q(i,several) & isa(i,large) & isa(i,men)

(#016) everyone
    Lexeme     Explicature                 Semantic Structure
a.  everyone   i:q(i,all); isa(i,person)   i:q(i,all) & isa(i,person)

(#017) the students
    Lexeme     Explicature      Semantic Structure
a.  the        i:q(i,def)       i:q(i,def)
b.  students   isa(i,student)   i:q(i,def) & isa(i,student)

Note that we introduce indices and quantifiers in a restrictor configuration.  This will become crucial when we get to multiple quantification (???).
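To make the quantifier examples concrete, here is a small sketch in which each lexeme contributes one or more relations to the current index.  The dictionary LEXICON and the function semantic_structure are hypothetical names; the entries simply restate the explicatures shown in the examples above.

```python
# Hypothetical lexicon: each form maps to the (relation, value) pairs it contributes.
LEXICON = {
    "a": [("q", "exist")],
    "several": [("q", "several")],
    "the": [("q", "def")],
    "everyone": [("q", "all"), ("isa", "person")],
    "brown": [("isa", "brown")],
    "large": [("isa", "large")],
    "cow": [("isa", "cow")],
    "men": [("isa", "men")],
    "students": [("isa", "student")],
}


def semantic_structure(phrase, index="i"):
    """Conjoin the relations contributed by each word of a one-index phrase."""
    conjuncts = []
    for form in phrase.split():
        for relation, value in LEXICON[form]:
            conjunct = f"{relation}({index},{value})"
            if not conjuncts:
                # The first conjunct introduces the index.
                conjunct = f"{index}:{conjunct}"
            conjuncts.append(conjunct)
    return " & ".join(conjuncts)


print(semantic_structure("a brown cow"))
print(semantic_structure("everyone"))
```

Because the quantifier is just another relation on the index, a single word such as everyone can contribute both q and isa without any special machinery.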

We do not require that an index be quantified, since there is no empirical reason to.  If it becomes desirable to do so, this would be implemented in conjunction with the Index Stack.

Proper Names

Various theorists (???) have observed that proper names appear in complementary distribution with definite noun phrases, and bear other characteristics in common with them.  Thus, Bush is interpreted roughly as "the person called 'Bush' who is most salient in the discourse context".  We follow this interpretation here (#018).

(#018) Bush
    Lexeme   Explicature               Semantic Structure
a.  Bush     i:q(i,def); isa(i,Bush)   i:q(i,def) & isa(i,Bush)

We will come back to proper names when we discuss discourse context.

Events

A participant in the discourse (represented by an index) may be predicated as an event (or ev).  This refers to either an event or state, as discussed extensively in the literature (???).  An event is typically introduced by a verb (#019).

(#019) The boy coughed.
    Lexeme    Explicature                 Semantic Structure
a.  the       i:q(i,def)                  i:q(i,def)
b.  boy       isa(i,boy)                  i:q(i,def) & isa(i,boy)
c.  coughed   e:isa(e,ev); isa(e,cough)   i:q(i,def) & isa(i,boy) & e:isa(e,ev) & isa(e,cough)

Note in (#019) that the e index is not quantified.  Nothing in our model requires a quantifier.  As we proceed, we will examine arguments for or against introducing a default q(e,exist).

Constructions

It probably jumped out at you that (#019) does not indicate any relationship between the boy and coughed.  Following, e.g. (Goldberg 1995), we posit that this relationship is introduced by the syntactic construction joining the boy with coughed.  We define a new relation arg which takes an event and an index, and specifies the argument relationship between them (#021).

(#021) arg relation
arg(e : event, i : index, a : arg) index i participates in event e as argument a
arg [ext, int, int2]

We specify arg as external, internal, or internal2, because we anticipate that these are the only possible argument positions.  Eventually, we will use these argument relations to infer semantic roles (???), but for now the syntactic relationship is enough.

Note that arg can also be used with non-verb events.

You may also have wondered how the and boy knew to take the same index i.  This information is again provided by the construction which joins them.  This construction introduces the relation eq, which simply takes two indices and equates them (#024).

(#024) eq relation
eq(i1 : index, i2 : index) indices i1 and i2 are identical in reference and/or resolution

For simplicity of representation, we will resolve redundant indices as soon as they are equated.

We assume that we can do this without loss of information, although we will need to confirm this assumption when we get to issues of multiple quantification (???) and raising (???).
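Resolving an eq relation amounts to substituting one index for the other throughout the semantic structure.  The following is a minimal sketch, assuming conjuncts are stored as strings and index names are simple tokens such as i1 or e2; the function name resolve_eq is hypothetical.

```python
import re


def resolve_eq(conjuncts, keep, drop):
    """Rewrite index `drop` as `keep` everywhere, removing the redundant index."""
    pattern = re.compile(rf"\b{re.escape(drop)}\b")
    resolved = []
    for conjunct in conjuncts:
        # Strip the "i2:" introduction marker, since i2 no longer exists on its own.
        if conjunct.startswith(f"{drop}:"):
            conjunct = conjunct[len(drop) + 1:]
        resolved.append(pattern.sub(keep, conjunct))
    return resolved


before = ["i1:q(i1,def)", "i2:isa(i2,boy)"]
after = resolve_eq(before, keep="i1", drop="i2")
print(" & ".join(after))
```

Whether this substitution is truly lossless is exactly the assumption flagged above, so a fuller implementation might keep the eq relation alongside the rewritten structure.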

We now introduce a column to distinguish the surface form from the lexeme, where the construction is itself considered to be a lexeme (#025).

(#025) The boy coughed.
    Form      Lexeme              Explicature                 Semantic Structure
a.  the       the                 i1:q(i1,def)                i1:q(i1,def)
b.  boy       boy                 i2:isa(i2,boy)              i1:q(i1,def) & i2:isa(i2,boy)
c.            the→boy             eq(i1,i2)                   i1:q(i1,def) & isa(i1,boy)
d.  coughed   cough               e:isa(e,ev); isa(e,cough)   i1:q(i1,def) & isa(i1,boy) & e:isa(e,ev) & isa(e,cough)
e.            (the→boy) → cough   arg(e,i1,ext)               i1:q(i1,def) & isa(i1,boy) & e:isa(e,ev) & isa(e,cough) & arg(e,i1,ext)

At this point, which constructions are available and how they work is a mystery.  Muresan et al. (2005) provide an elegant mechanism by which such constructions can be learned efficiently and accurately from relatively few annotated examples.

Index Stack (IS)

Indices which are currently being constructed/modified are kept on an index stack (IS).  Only the top (or rightmost) index on the stack can be modified.  The one exception is that the next item down on the stack can be attached as a dependency to the current element.  This allows left-branching dependencies.  This is illustrated in (#029), step e.

(#029) The boy coughed.
    Form      Lexeme              IS        Explicature                 Semantic Structure
a.  the       the                 i1        i1:q(i1,def)                i1:q(i1,def)
b.  boy       boy                 i1 i2     i2:isa(i2,boy)              i1:q(i1,def) & i2:isa(i2,boy)
c.            the→boy             i1        eq(i1,i2)                   i1:q(i1,def) & isa(i1,boy)
d.  coughed   cough               i1 e      e:isa(e,ev); isa(e,cough)   i1:q(i1,def) & isa(i1,boy) & e:isa(e,ev) & isa(e,cough)
e.            (the→boy) → cough   e         arg(e,i1,ext)               i1:q(i1,def) & isa(i1,boy) & e:isa(e,ev) & isa(e,cough) & arg(e,i1,ext)
f.            [done]              [empty]

If default quantification is desired, it would be implemented as the index is popped off the stack.

When an index is attached as a dependency in a construction, it is popped off the stack, and can no longer be modified.  An index must be attached as a dependency when it is popped off the stack, since otherwise it is no longer available for processing.  For now, we simply stipulate this for empirical reasons.  We hope that the human analog will be confirmed through psycholinguistic experiments.

Empirically, it is necessary to "spell out" the subject i1 – i.e. pop it off the stack – when e is added (step d, cf. #028).  The theoretical justification for this is not immediately clear, although it may have to do with studies (???) that suggest the subject is outside the scope of the event.  For now, we simply stipulate it.
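The index stack behavior described in this section can be sketched as a small class.  This is an illustrative reading of the rules rather than a settled design: the class name IndexStack and its method names are assumptions, and the trace at the end follows the IS column of (#029).

```python
class IndexStack:
    """Stack of indices under construction; the rightmost element is the top."""

    def __init__(self):
        self._items = []

    def push(self, index):
        self._items.append(index)

    def can_modify(self, index):
        """Only the top index may receive new predications."""
        return bool(self._items) and self._items[-1] == index

    def pop(self):
        """Close the top index; it can no longer be modified."""
        return self._items.pop()

    def attach_below(self):
        """Attach the next item down as a dependency of the top, popping it."""
        if len(self._items) < 2:
            raise IndexError("no index below the top to attach")
        return self._items.pop(-2)

    def __repr__(self):
        return " ".join(self._items) or "[empty]"


stack = IndexStack()
stack.push("i1")                  # the
stack.push("i2")                  # boy
stack.pop()                       # the→boy: i2 is merged into i1
stack.push("e")                   # coughed
subject = stack.attach_below()    # (the→boy) → cough: i1 becomes an argument of e
stack.pop()                       # [done]
print(subject, stack)
```

The attach_below method is the one exception to top-only access, and it is what allows left-branching dependencies such as a subject attaching to a later verb.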

Subevents

Following an extensive tradition (e.g. Rappaport-Hovav & Levin 1998, Rothstein 2004, Pietroski 2005), we posit that events often consist of multiple subevents in a complex phrase.  We introduce the sub relation (#030) to represent this relationship.

(#030) sub relation
sub(e1 : event, e2 : event) event e1 includes event e2 as a subevent

This is illustrated for a resultative construction in (#031) (Williams 2005).  Note that the arguments attach, not to the subevents pound or flat, but to the unnamed event which includes them both.  This higher event must be introduced before pound to allow Pat to attach as a dependency.

(#031) Pat pounded the cutlet flat.
    Form      Lexeme                   IS         Explicature                    Semantic Structure
a.  Pat       Pat                      i1         i1:q(i1,def); isa(i1,Pat)      i1:q(i1,def) & isa(i1,Pat)
b.            ( )                      i1 e1      e1:isa(e1,ev)                  i1:q(i1,def) & isa(i1,Pat) & e1:isa(e1,ev)
c.            Pat→( )                  e1         arg(e1,i1,ext)                 i1:q(i1,def) & isa(i1,Pat) & e1:isa(e1,ev) & arg(e1,i1,ext)
d.  pounded   pound                    e1 e2      e2:isa(e2,ev); isa(e2,pound)   i1:q(i1,def) & isa(i1,Pat) & e1:isa(e1,ev) & arg(e1,i1,ext) & e2:isa(e2,ev) & isa(e2,pound)
e.            (pound)                  e1         sub(e1,e2)                     i1:q(i1,def) & isa(i1,Pat) & e1:isa(e1,ev) & arg(e1,i1,ext) & e2:isa(e2,ev) & isa(e2,pound) & sub(e1,e2)
f.  the       the                      e1 i2      i2:q(i2,def)                   i1:q(i1,def) & isa(i1,Pat) & e1:isa(e1,ev) & arg(e1,i1,ext) & e2:isa(e2,ev) & isa(e2,pound) & sub(e1,e2) & i2:q(i2,def)
g.  cutlet    cutlet                   e1 i2 i3   i3:isa(i3,cutlet)              i1:q(i1,def) & isa(i1,Pat) & e1:isa(e1,ev) & arg(e1,i1,ext) & e2:isa(e2,ev) & isa(e2,pound) & sub(e1,e2) & i2:q(i2,def) & i3:isa(i3,cutlet)
h.            the→cutlet               e1 i2      eq(i2,i3)                      i1:q(i1,def) & isa(i1,Pat) & e1:isa(e1,ev) & arg(e1,i1,ext) & e2:isa(e2,ev) & isa(e2,pound) & sub(e1,e2) & i2:q(i2,def) & isa(i2,cutlet)
i.            (pound) ← (the cutlet)   e1         arg(e1,i2,int)                 i1:q(i1,def) & isa(i1,Pat) & e1:isa(e1,ev) & arg(e1,i1,ext) & e2:isa(e2,ev) & isa(e2,pound) & sub(e1,e2) & i2:q(i2,def) & isa(i2,cutlet) & arg(e1,i2,int)
j.  flat      flat                     e1 e3      e3:isa(e3,ev); isa(e3,flat)    i1:q(i1,def) & isa(i1,Pat) & e1:isa(e1,ev) & arg(e1,i1,ext) & e2:isa(e2,ev) & isa(e2,pound) & sub(e1,e2) & i2:q(i2,def) & isa(i2,cutlet) & arg(e1,i2,int) & e3:isa(e3,ev) & isa(e3,flat)
k.            (pound flat)             e1         sub(e1,e3)                     i1:q(i1,def) & isa(i1,Pat) & e1:isa(e1,ev) & arg(e1,i1,ext) & e2:isa(e2,ev) & isa(e2,pound) & sub(e1,e2) & i2:q(i2,def) & isa(i2,cutlet) & arg(e1,i2,int) & e3:isa(e3,ev) & isa(e3,flat) & sub(e1,e3)
l.            [done]                   [empty]

"The one we call Pat and the cutlet were the external and internal arguments, respectively, of an event which consisted of a pounding and a flattening."

Note that at the time the word pound is encountered, the listener does not know whether the event is simple or complex.
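The final semantic structure for "Pat pounded the cutlet flat" can be read off mechanically to recover the paraphrase given above.  The sketch below assumes relations are stored as tuples; the names RELATIONS, concepts, and summarize are illustrative only.

```python
# Final semantic structure of "Pat pounded the cutlet flat", one tuple per conjunct.
RELATIONS = [
    ("q", "i1", "def"), ("isa", "i1", "Pat"),
    ("isa", "e1", "ev"), ("arg", "e1", "i1", "ext"),
    ("isa", "e2", "ev"), ("isa", "e2", "pound"), ("sub", "e1", "e2"),
    ("q", "i2", "def"), ("isa", "i2", "cutlet"), ("arg", "e1", "i2", "int"),
    ("isa", "e3", "ev"), ("isa", "e3", "flat"), ("sub", "e1", "e3"),
]


def concepts(index, relations):
    """Concepts predicated of an index, ignoring the generic 'ev' marker."""
    return [r[2] for r in relations
            if r[0] == "isa" and r[1] == index and r[2] != "ev"]


def summarize(event, relations):
    """Return (arguments by position, subevent concepts) for an event."""
    args = {r[3]: concepts(r[2], relations)
            for r in relations if r[0] == "arg" and r[1] == event}
    subevents = [concepts(r[2], relations)
                 for r in relations if r[0] == "sub" and r[1] == event]
    return args, subevents


print(summarize("e1", RELATIONS))
```

Note that both arguments are found on the unnamed higher event e1, while pound and flat appear only as its subevents.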

Resultative PP's

Following (Kenschaft 2005), we will make the simplifying assumption that all prepositional path phrases designate complex events of abstract motion (#032).

(#032) The professor talked the students into a stupor.
    Form        Lexeme                    IS            Explicature                   Semantic Structure
a.  the         the                       i1            i1:q(i1,def)                  i1:q(i1,def)
b.  professor   professor                 i1 i2         i2:isa(i2,prof)               i1:q(i1,def) & i2:isa(i2,prof)
c.              the→professor             i1            eq(i1,i2)                     i1:q(i1,def) & isa(i1,prof)
d.              ( )                       i1 e1         e1:isa(e1,ev)                 i1:q(i1,def) & isa(i1,prof) & e1:isa(e1,ev)
e.              (the professor) → ( )     e1            arg(e1,i1,ext)                i1:q(i1,def) & isa(i1,prof) & e1:isa(e1,ev) & arg(e1,i1,ext)
f.  talked      talk                      e1 e2         e2:isa(e2,ev); isa(e2,talk)   i1:q(i1,def) & isa(i1,prof) & e1:isa(e1,ev) & arg(e1,i1,ext) & e2:isa(e2,ev) & isa(e2,talk)
g.              (talk)                    e1            sub(e1,e2)                    i1:q(i1,def) & isa(i1,prof) & e1:isa(e1,ev) & arg(e1,i1,ext) & e2:isa(e2,ev) & isa(e2,talk) & sub(e1,e2)
h.  the         the                       e1 i3         i3:q(i3,def)                  i1:q(i1,def) & isa(i1,prof) & e1:isa(e1,ev) & arg(e1,i1,ext) & e2:isa(e2,ev) & isa(e2,talk) & sub(e1,e2) & i3:q(i3,def)
i.  students    students                  e1 i3 i4      i4:isa(i4,student)            i1:q(i1,def) & isa(i1,prof) & e1:isa(e1,ev) & arg(e1,i1,ext) & e2:isa(e2,ev) & isa(e2,talk) & sub(e1,e2) & i3:q(i3,def) & i4:isa(i4,student)
j.              the→students              e1 i3         eq(i3,i4)                     i1:q(i1,def) & isa(i1,prof) & e1:isa(e1,ev) & arg(e1,i1,ext) & e2:isa(e2,ev) & isa(e2,talk) & sub(e1,e2) & i3:q(i3,def) & isa(i3,student)
k.              (talk) ← (the students)   e1            arg(e1,i3,int)                i1:q(i1,def) & isa(i1,prof) & e1:isa(e1,ev) & arg(e1,i1,ext) & e2:isa(e2,ev) & isa(e2,talk) & sub(e1,e2) & i3:q(i3,def) & isa(i3,student) & arg(e1,i3,int)
l.  into        into                      e1 e3         e3:isa(e3,ev); isa(e3,into)   i1:q(i1,def) & isa(i1,prof) & e1:isa(e1,ev) & arg(e1,i1,ext) & e2:isa(e2,ev) & isa(e2,talk) & sub(e1,e2) & i3:q(i3,def) & isa(i3,student) & arg(e1,i3,int) & e3:isa(e3,ev) & isa(e3,into)
m.  a           a                         e1 e3 i5      i5:q(i5,exist)                i1:q(i1,def) & isa(i1,prof) & e1:isa(e1,ev) & arg(e1,i1,ext) & e2:isa(e2,ev) & isa(e2,talk) & sub(e1,e2) & i3:q(i3,def) & isa(i3,student) & arg(e1,i3,int) & e3:isa(e3,ev) & isa(e3,into) & i5:q(i5,exist)
n.  stupor      stupor                    e1 e3 i5 i6   i6:isa(i6,stupor)             i1:q(i1,def) & isa(i1,prof) & e1:isa(e1,ev) & arg(e1,i1,ext) & e2:isa(e2,ev) & isa(e2,talk) & sub(e1,e2) & i3:q(i3,def) & isa(i3,student) & arg(e1,i3,int) & e3:isa(e3,ev) & isa(e3,into) & i5:q(i5,exist) & i6:isa(i6,stupor)
o.              a→stupor                  e1 e3 i5      eq(i5,i6)                     i1:q(i1,def) & isa(i1,prof) & e1:isa(e1,ev) & arg(e1,i1,ext) & e2:isa(e2,ev) & isa(e2,talk) & sub(e1,e2) & i3:q(i3,def) & isa(i3,student) & arg(e1,i3,int) & e3:isa(e3,ev) & isa(e3,into) & i5:q(i5,exist) & isa(i5,stupor)
p.              into ← (a stupor)         e1 e3         arg(e3,i5,int)                i1:q(i1,def) & isa(i1,prof) & e1:isa(e1,ev) & arg(e1,i1,ext) & e2:isa(e2,ev) & isa(e2,talk) & sub(e1,e2) & i3:q(i3,def) & isa(i3,student) & arg(e1,i3,int) & e3:isa(e3,ev) & isa(e3,into) & i5:q(i5,exist) & isa(i5,stupor) & arg(e3,i5,int)
q.              (into a stupor)           e1            sub(e1,e3)                    i1:q(i1,def) & isa(i1,prof) & e1:isa(e1,ev) & arg(e1,i1,ext) & e2:isa(e2,ev) & isa(e2,talk) & sub(e1,e2) & i3:q(i3,def) & isa(i3,student) & arg(e1,i3,int) & e3:isa(e3,ev) & isa(e3,into) & i5:q(i5,exist) & isa(i5,stupor) & arg(e3,i5,int) & sub(e1,e3)
r.              [done]                    [empty]

"The professor and the students were the external and internal arguments, respectively, of an event which consisted of a talking and an (abstract) movement into a stupor."

Modification PP's

(#032) The professor from Oxford
    Form        Lexeme          IS      Explicature       Semantic Structure
a.  the         the             i1      i1:q(i1,def)      i1:q(i1,def)
b.  professor   professor       i1 i2   i2:isa(i2,prof)   i1:q(i1,def) & i2:isa(i2,prof)
c.              the→professor   i1      eq(i1,i2)         i1:q(i1,def) & isa(i1,prof)

The professor, from Oxford

Metaphor

.

Ambiguity

Ambiguity is prevalent in natural language.  Take the classic example (#027), with its two good paraphrases (#027a,b) and bad paraphrase (#027c).

(#027) The prince attacked the duke from Essex.
a. The prince attacked the duke who was from Essex.
b. The prince attacked the duke, and he did so from Essex.
c. * The prince, who was from Essex, attacked the duke.

We want to explain (or at least model) how (#027a,b) are good, and (#027c) is bad.  We do this by parsing the sentence up to the point where the ambiguity is introduced, and then building continuations in parallel (#028).

(#028) The prince attacked the duke from Essex.
(a)
    Form       Lexeme               IS            Explicature                     Semantic Structure
a.  the        the                  i1            i1:q(i1,def)                    i1:q(i1,def)
b.  prince     prince               i2            i2:isa(i2,prince)               i1:q(i1,def) & i2:isa(i2,prince)
c.             the→prince           i1            eq(i1,i2)                       i1:q(i1,def) & isa(i1,prince)
d.             ( )                  i1 e1         e1:isa(e1,ev)                   i1:q(i1,def) & isa(i1,prince) & e1:isa(e1,ev)
e.             (the prince) → ( )   e1            arg(e1,i1,ext)                  i1:q(i1,def) & isa(i1,prince) & e1:isa(e1,ev) & arg(e1,i1,ext)
f.  attacked   attack               e1 e2         e2:isa(e2,ev); isa(e1,attack)   i1:q(i1,def) & isa(i1,prince) & e1:isa(e1,ev) & arg(e1,i1,ext) & e2:isa(e2,ev) & isa(e1,attack)
g.             (attack)             e1            sub(e1,e2)                      i1:q(i1,def) & isa(i1,prince) & e1:isa(e1,ev) & arg(e1,i1,ext) & e2:isa(e2,ev) & isa(e1,attack) & sub(e1,e2)
h.  the        the                  e1 i3         i3:q(i3,def)                    i1:q(i1,def) & isa(i1,prince) & e1:isa(e1,ev) & arg(e1,i1,ext) & e2:isa(e2,ev) & isa(e1,attack) & sub(e1,e2) & i3:q(i3,def)
i.  duke       duke                 e1 i3 i4      i4:isa(i4,duke)                 i1:q(i1,def) & isa(i1,prince) & e1:isa(e1,ev) & arg(e1,i1,ext) & e2:isa(e2,ev) & isa(e1,attack) & sub(e1,e2) & i3:q(i3,def)
j.             the→duke             merge i3,i4   eq(i3,i4)
k.  from       from                 e2            e2:isa(e2,ev); isa(e2,from)
l.  Essex      Essex                i5            i5:q(i5,def); i5:isa(i5,Essex)
(b)
    Lexeme   IS   Explicature   Semantic Structure

All possible interpretations are constructed in parallel.  A derivation tree is abandoned when it leads to a contradiction.

In later phases, each derivation will have a probability assigned to it.  A derivation whose probability is below some threshold will not be pursued until all higher probability derivations are exhausted.  This simulates garden-path sentence understanding.
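One way to picture the parallel construction of derivations is a best-first search over partial derivations.  The sketch below is an assumption about how this might be organized, not the planned implementation: it replaces the fixed probability threshold with a priority queue, so that a lower-probability derivation is pursued only once every higher-probability one has been expanded.  The attachment probabilities are invented for illustration, and the contradicts test is a stand-in for whatever check abandons a derivation.

```python
import heapq
from itertools import count


def best_first(steps, contradicts):
    """Yield complete derivations in order of decreasing probability.

    steps: one list of (choice, probability) alternatives per decision point.
    contradicts: function on a tuple of choices, True if it must be abandoned.
    """
    tie = count()                    # tie-breaker so the heap never compares choice tuples
    heap = [(-1.0, next(tie), ())]   # negated probability gives max-first ordering
    while heap:
        neg_p, _, choices = heapq.heappop(heap)
        if len(choices) == len(steps):
            yield choices, -neg_p
            continue
        for choice, p in steps[len(choices)]:
            extended = choices + (choice,)
            if contradicts(extended):
                continue             # abandon this branch of the derivation tree
            heapq.heappush(heap, (neg_p * p, next(tie), extended))


# Attachment site for "from Essex"; the numbers are made up.
steps = [[("duke", 0.6), ("attack", 0.3), ("prince", 0.1)]]
# Stand-in contradiction: the subject has already been popped off the stack.
results = list(best_first(steps, lambda choices: "prince" in choices))
print(results)
```

Because probabilities only shrink as a derivation is extended, the first complete derivation returned is the most probable one, and a garden-path reading surfaces only after the preferred readings have been exhausted.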

Discourse

Intuitively the boy in (#025) means something like 'the unique boy who is most salient in the current discourse context'.  Certainly, it does not mean 'the only boy in the whole world'.  An adequate model therefore requires a representation of the discourse context.  This requirement is even more obvious for indexical pronouns such as I and you.  We introduce a discourse (or disc) object to hold the contextual information necessary to understand an utterance (#026).  The person and time concepts are as defined in the ontology.

(#026) discourse object
speaker (s)    person
listener (u)   person
time (t)       time
context (c)    context

The speaker (s) is the indexical returned by the pronoun I, the listener (u) is the indexical returned by the pronoun you, and the time (t) is whatever is returned by the word now.

Note that location is not an element of a discourse.  When needed, it can generally be derived from the speaker and time.

The context object is a catch-all for all other information that might be salient for the discourse.  This tells us, for instance, the relevant domain in which the boy is uniquely defined.

It might be considered cheating to include context as a field in the discourse.  After all, the speaker and listener each brings different background knowledge and perspective, and therefore each could be said to have a different context.  But really, the same could be said for the other discourse fields.  It is assumed that interlocutors recognize the same participants, but it could be they do not.  For instance, you could shout, "Stop it!" to your dog, and your son might cry because he thinks he is the listener in the discourse.

It is, in fact, the discourse object which requires multiple copies, one for each participant.  The communication succeeds to the extent that the interlocutors share common discourse representations.

For ease of exposition, we will act as if there is only one shared discourse, until it becomes important to distinguish between the speaker's and listener's points of view.
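The discourse object in (#026) and the idea of one copy per participant can be sketched as follows.  The class name Discourse, the resolve method, and the shared_fields helper are illustrative assumptions, and persons and times are represented by plain strings for brevity.

```python
from dataclasses import dataclass, field


@dataclass
class Discourse:
    speaker: str                                  # returned by "I"
    listener: str                                 # returned by "you"
    time: str                                     # returned by "now"
    context: dict = field(default_factory=dict)   # catch-all for other salient information

    def resolve(self, form: str) -> str:
        """Return the referent of an indexical form."""
        return {"I": self.speaker, "you": self.listener, "now": self.time}[form]


def shared_fields(a: Discourse, b: Discourse):
    """List the discourse fields on which two participants agree."""
    return [name for name in ("speaker", "listener", "time", "context")
            if getattr(a, name) == getattr(b, name)]


# "Stop it!": the owner addresses the dog, but the son takes himself to be the listener.
owner_view = Discourse(speaker="owner", listener="dog", time="t0")
son_view = Discourse(speaker="owner", listener="son", time="t0")
print(owner_view.resolve("you"), son_view.resolve("you"))
print(shared_fields(owner_view, son_view))
```

On this reading, communication succeeds to the extent that shared_fields covers what the utterance actually depends on; here the mismatch on listener is what produces the misunderstanding.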

Binding

I want to go home.

Ted is easy/eager to please.

The men told the women to vote for each other.

Implementation

We plan to build on Smaranda Muresan's implementation of Lexicalized Well-Founded Grammars (LWFG) (Muresan et al. 2005).  Unless Muresan converts her framework to a probabilistic model, we will need to do that first.  Furthermore, we will need to change the implementation from bottom-up to left-corner.

Ontology

To begin with, we will use a predefined ontology with data, such as the LDC WordNet (and PropBank???).  Later, we will experiment with various alternatives generated through un- or semi-supervised means.

Various researchers (???) have experimented with ideas for generating a crosslingual ontology for a language pair of interest.  One possibility to try would be to use aligned morphemes to define a concept in the ontology.

Lexicon

.

Training

.

Evaluation

.




The events and indices fields will be discussed under methodology.

World

A world refers to the entire knowledge, culture, environment and worldview of an interlocutor.  Thus, in our model, each person contains its own world.  Like the ontology and lexicon, we will assume that the speaker and listener have overlapping worlds.  Unlike ontology and lexicon, we will also note the differences between their worlds (???).

Indexicals are taken to refer to elements in the speaker's world.  For communication to be successful, the listener must be able to map these indexicals to appropriate elements in the listener's world.



(#001)
    Lexeme    Explicature
a.  every     i0.0:[q(i0.0,all)]
b.  he        i0.0:[isa(i0.0,person) & isa(i0.0,male)]
c.  someone   i0.0:[q(i0.0,exist) & isa(i0.0,person)]


Methodology

The current methodology is exceedingly ad hoc, attempting to simply mimic human language processing, without any theoretical justification.  Presumably, it will change over time as more phenomena are addressed, and hopefully lead to a better motivated understanding.

Stack

This methodology uses modified stacks to store partially processed elements.  As each item is placed on a stack, it (partially) covers up items below it, making them temporarily inaccessible.  Once an item is popped off the stack, it can no longer be modified.  This means, among other things, that it must be quantified.

In other respects, the precise behavior varies for each type of element.

Discourse Stack (DS)

At the opening of a discourse, the understood discourse context d1 is placed on the discourse stack.  This provides the context for the rest of the communication.

Various operations, e.g. direct quotation, may generate a derivative discourse context, which is then placed on the stack.  The first derivative context is designated d2, the next d3, and so on.  The derivative context remains on the stack, typically until the end of the clause where it was generated.

The fields of a derivative discourse context are effectively invisible until set.  For example, the speaker and listener of d2 are the same as for d1 unless overridden.  Generally, only direct quotation overrides speaker and listener.

Each discourse contains its own event stack and index stack.  For ease of discussion, the first event index in discourse d1 will be designated e0.0, the next will be designated e0.1, and so on.  The first (non-event) index in discourse d1 will be designated i0.0, the next will be designated i0.1, and so on.
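A possible shape for the discourse stack is sketched below.  The class and method names are hypothetical.  Fields of a derivative context fall through to the contexts beneath it until they are set, and indices are named per discourse.  The numbering offset (discourse d1 yields i0.0, d2 yields i1.0) is an assumption read off the examples in these notes.

```python
class DiscourseStack:
    """Stack of discourse contexts d1, d2, ... with fall-through field lookup."""

    def __init__(self, **fields):
        self._created = 0
        self._frames = []
        self.push(**fields)          # the understood discourse context d1

    def push(self, **overrides):
        """Open a derivative context; unset fields stay invisible."""
        self._created += 1
        self._frames.append({"id": self._created,
                             "fields": overrides,
                             "counters": {"e": 0, "i": 0}})
        return f"d{self._created}"

    def pop(self):
        """Close the top discourse context and return its name."""
        return f"d{self._frames.pop()['id']}"

    def get(self, name):
        """Look a field up from the top context downward."""
        for frame in reversed(self._frames):
            if name in frame["fields"]:
                return frame["fields"][name]
        raise KeyError(name)

    def new_index(self, kind):
        """Create the next event ('e') or non-event ('i') index in the top context."""
        frame = self._frames[-1]
        n = frame["counters"][kind]
        frame["counters"][kind] += 1
        return f"{kind}{frame['id'] - 1}.{n}"


ds = DiscourseStack(speaker="Ann", listener="Bob")
print(ds.new_index("i"), ds.new_index("e"))   # indices in d1
print(ds.push(speaker="Carl"))                # e.g. a direct quotation opens d2
print(ds.get("speaker"), ds.get("listener"))  # speaker overridden, listener inherited
print(ds.new_index("i"))                      # first non-event index in d2
```

Storing only the overridden fields in each derivative context keeps the inherited speaker and listener visible without copying them.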

Event Stack (ES)

The event stack is the one element of the discourse which is always opaque.  An event is only visible in the discourse in which it is created.  Only the top event on the event stack may be modified.  When discourse di is closed (i.e. popped off the stack), all its events are closed first.

Index Stack (IS)

Items buried on the index stack are hidden from modification, but are available as antecedents.  A detailed treatment awaits further study.

Knowledge

Various pieces of knowledge enter into the derivation.  ???

Examples & Elaborations

For ease of exposition, each example will abstract away from features that have not yet been addressed.

(#003) I coughed.

  1. Form: I; Lexeme: I
     DS: d1
     IS: i0.0
     Explicature: isa(d1,disc); eq(i0.0,d1.s)
     Know: q(d1.s,def)
     Entail: q(i0.0,def)
     Semantic Structure: d1:isa(d1,disc) & i0.0:eq(i0.0,d1.s) & q(i0.0,def)
  2. Form: coughed; Lexeme: coughed
     ES: e0.0
     Explicature: isa(e0.0,cough)
     Semantic Structure: d1:isa(d1,disc) & i0.0:eq(i0.0,d1.s) & q(i0.0,def) & e0.0:isa(e0.0,cough)
  3. Form: I→coughed
     IS: pop i0.0
     Explicature: arg(e0.0,i0.0,ext)
     Semantic Structure: d1:isa(d1,disc) & i0.0:eq(i0.0,d1.s) & q(i0.0,def) & e0.0:isa(e0.0,cough) & arg(e0.0,i0.0,ext)
  4. DS: pop d1
     ES: pop e0.0
     Explicature: q(e0.0,exist); q(d1,exist)
     Semantic Structure:
       d1:isa(d1,disc) & i0.0:eq(i0.0,d1.s) & q(i0.0,def) & e0.0:isa(e0.0,cough) & arg(e0.0,i0.0,ext) & q(e0.0,exist) & q(d1,exist)
       d1:[q(d1,exist) & isa(d1,disc)] & i0.0:[q(i0.0,def) & eq(i0.0,d1.s)] & e0.0:[q(e0.0,exist) & isa(e0.0,cough)] & arg(e0.0,i0.0,ext)
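The final regrouping step of the derivation for "I coughed" can be reproduced mechanically.  The sketch below is only an illustration: the STEPS list and the regroup function are my own hypothetical names, and the rule "group by index, quantifier first, loose relations last" is inferred from this one example.

```python
from typing import Dict, List, Optional, Tuple

# Facts in the order the derivation introduces them.
# Each entry is (owning index, relation); None marks a relation
# that is not attached to a single index.
STEPS: List[Tuple[Optional[str], str]] = [
    ("d1", "isa(d1,disc)"),
    ("i0.0", "eq(i0.0,d1.s)"),
    ("i0.0", "q(i0.0,def)"),
    ("e0.0", "isa(e0.0,cough)"),
    (None, "arg(e0.0,i0.0,ext)"),
    ("e0.0", "q(e0.0,exist)"),
    ("d1", "q(d1,exist)"),
]


def regroup(steps: List[Tuple[Optional[str], str]]) -> str:
    """Group facts by index, listing each index's quantifier first."""
    grouped: Dict[str, List[str]] = {}
    loose: List[str] = []
    for owner, fact in steps:
        if owner is None:
            loose.append(fact)
        else:
            grouped.setdefault(owner, []).append(fact)
    parts = []
    for owner, facts in grouped.items():
        # Stable sort: q(...) facts move to the front, the rest keep their order.
        ordered = sorted(facts, key=lambda f: not f.startswith("q("))
        parts.append(f"{owner}:[{' & '.join(ordered)}]")
    return " & ".join(parts + loose)


print(regroup(STEPS))
```

Run on the facts above, this prints the same grouped structure as the last line of the derivation.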

.

(#002) I am hungry.

  1. Form: I; Lexeme: I
     DS: d1
     IS: i0.0
     Explicature: isa(d1,disc); eq(i0.0,d1.s)
     Semantic Structure: d1:isa(d1,disc) & i0.0:eq(i0.0,d1.s)
  2. Form: am
     ES: e0.0
     Explicature: isa(e0.0,be)
     Semantic Structure: d1:isa(d1,disc) & i0.0:eq(i0.0,d1.s) & e0.0:isa(e0.0,be)
  3. Form: I→am
     IS: pop i0.0
     Explicature: q(i0.0,exist)

.

(#008) I think you love everyone.

  1. Form: I; Lexeme: I
     Stack: d1:isa(d1,disc); i0.0
     Explicature: eq(i0.0,d1.s)
     Semantic Structure: d1:isa(d1,disc) & i0.0:eq(i0.0,d1.s)
  2. Form: think; Lexeme: think
     Stack: e0.0:isa(e0.0,ev)
     Explicature: isa(e0.0,think)
     Lexicon: ∀e:isa(e,think) ∀i:ext(e,i) [ag(e,i)]
              ∀i:int(e,i) [theme(e,i)]
     Semantic Structure: d1:isa(d1,disc) & i0.0:eq(i0.0,d1.s) & e0.0:isa(e0.0,ev) & isa(e0.0,think)
  3. Form: I→think
     Stack: pop i0.0
     Explicature: ∃i0.0; ext(e0.0,i0.0)
     Entailment: ag(e0.0,i0.0)
     Semantic Structure: d1:isa(d1,disc) & ∃i0.0:eq(i0.0,d1.s) & e0.0:isa(e0.0,ev) & isa(e0.0,think) & ag(e0.0,i0.0)
  4. Form: you; Lexeme: you
     Stack: d2:isa(d2,disc); i1.0
     Explicature: eq(i1.0,d2.u)
     Semantic Structure: d1:isa(d1,disc) & ∃i0.0:eq(i0.0,d1.s) & e0.0:isa(e0.0,ev) & isa(e0.0,think) & ag(e0.0,i0.0) & d2:isa(d2,disc) & i1.0:eq(i1.0,d2.u)
  5. Form: love; Lexeme: love
     Stack: e1.0:isa(e1.0,ev)
     Explicature: isa(e1.0,love)
     Lexicon: ∀e:isa(e,love) ∀i:ext(e,i) [exp(e,i)]
              ∀i:int(e,i) [stim(e,i)]
     Semantic Structure: d1:isa(d1,disc) & ∃i0.0:eq(i0.0,d1.s) & e0.0:isa(e0.0,ev) & isa(e0.0,think) & ag(e0.0,i0.0) & d2:isa(d2,disc) & i1.0:eq(i1.0,d2.u) & e1.0:isa(e1.0,ev) & isa(e1.0,love)
  6. Form: you→love
     Stack: pop i1.0
     Explicature: ∃i1.0; ext(e1.0,i1.0)
     Entailment: exp(e1.0,i1.0)
     Semantic Structure: d1:isa(d1,disc) & ∃i0.0:eq(i0.0,d1.s) & e0.0:isa(e0.0,ev) & isa(e0.0,think) & ag(e0.0,i0.0) & d2:isa(d2,disc) & ∃i1.0:eq(i1.0,d2.u) & e1.0:isa(e1.0,ev) & isa(e1.0,love) & exp(e1.0,i1.0)
  7. Form: everyone; Lexeme: everyone
     Stack: i1.1
     Explicature: ∀2i1.1:isa(i1.1,person)
     Semantic Structure: d1:isa(d1,disc) & ∃i0.0:eq(i0.0,d1.s) & e0.0:isa(e0.0,ev) & isa(e0.0,think) & ag(e0.0,i0.0) & d2:isa(d2,disc) & ∃i1.0:eq(i1.0,d2.u) & e1.0:isa(e1.0,ev) & isa(e1.0,love) & exp(e1.0,i1.0) & ∀2i1.1:isa(i1.1,person)
  8. Form: love←everyone
     Stack: pop i1.1
     Explicature: int(e1.0,i1.1)
     Entailment: stim(e1.0,i1.1)
     Semantic Structure: d1:isa(d1,disc) & ∃i0.0:eq(i0.0,d1.s) & e0.0:isa(e0.0,ev) & isa(e0.0,think) & ag(e0.0,i0.0) & d2:isa(d2,disc) & ∃i1.0:eq(i1.0,d2.u) & e1.0:isa(e1.0,ev) & isa(e1.0,love) & exp(e1.0,i1.0) & ∀2i1.1:isa(i1.1,person) & stim(e1.0,i1.1)
  9. Form: think←love
     Stack: pop e1.0
     Explicature: ∃e1.0; int(e0.0,e1.0)
     Entailment: theme(e0.0,e1.0)
     Semantic Structure: d1:isa(d1,disc) & ∃i0.0:eq(i0.0,d1.s) & e0.0:isa(e0.0,ev) & isa(e0.0,think) & ag(e0.0,i0.0) & d2:isa(d2,disc) & ∃i1.0:eq(i1.0,d2.u) & ∃e1.0:isa(e1.0,ev) & isa(e1.0,love) & exp(e1.0,i1.0) & ∀2i1.1:isa(i1.1,person) & stim(e1.0,i1.1) & theme(e0.0,e1.0)
  10. Stack: pop d2; pop e0.0; pop d1
      Explicature: ∃d2; ∃e0.0; ∃d1
      Semantic Structure: ∃d1:isa(d1,disc) & ∃i0.0:eq(i0.0,d1.s) & ∃e0.0:isa(e0.0,ev) & isa(e0.0,think) & ag(e0.0,i0.0) & ∃d2:isa(d2,disc) & ∃i1.0:eq(i1.0,d2.u) & ∃e1.0:isa(e1.0,ev) & isa(e1.0,love) & exp(e1.0,i1.0) & ∀2i1.1:isa(i1.1,person) & stim(e1.0,i1.1) & theme(e0.0,e1.0)

∃d1:isa(d1,disc), ∃i0.0:eq(i0.0,d1.s), ∃e0.0:isa(e0.0,ev), ∃d2:isa(d2,disc), ∃i1.0:eq(i1.0,d2.u), ∃e1.0:isa(e1.0,ev), ∀2i1.1:isa(i1.1,person) [ isa(e0.0,think) & ag(e0.0,i0.0) & isa(e1.0,love) & exp(e1.0,i1.0) & stim(e1.0,i1.1) & theme(e0.0,e1.0) ]
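The entailment steps in this derivation all follow one pattern: when an index attaches to an event as its external or internal argument, the lexical entry for the event's concept supplies the thematic role.  A minimal sketch of that lookup, assuming a dictionary encoding of the two lexical rules used above (LEXICON, entail, and the variable names are hypothetical):

```python
from typing import Dict, List, Tuple

# Lexical rules from the example: argument position -> entailed thematic role.
LEXICON: Dict[str, Dict[str, str]] = {
    "think": {"ext": "ag", "int": "theme"},
    "love": {"ext": "exp", "int": "stim"},
}


def entail(concepts: Dict[str, str], event: str, index: str, position: str) -> str:
    """Return the role fact entailed by attaching index to event at position."""
    role = LEXICON[concepts[event]][position]
    return f"{role}({event},{index})"


# Concept of each event, as established by isa(e0.0,think) and isa(e1.0,love).
concepts = {"e0.0": "think", "e1.0": "love"}

# Attachments in the order they occur in the derivation.
attachments: List[Tuple[str, str, str]] = [
    ("e0.0", "i0.0", "ext"),   # I→think
    ("e1.0", "i1.0", "ext"),   # you→love
    ("e1.0", "i1.1", "int"),   # love←everyone
    ("e0.0", "e1.0", "int"),   # think←love
]

roles = [entail(concepts, e, i, p) for e, i, p in attachments]
print(roles)
```

The resulting facts are the four role relations that appear in the final semantic structure: ag(e0.0,i0.0), exp(e1.0,i1.0), stim(e1.0,i1.1), and theme(e0.0,e1.0).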


Observations and Results

.

Conclusions

.


Future Work

  1. Extended theories of learning.
    1. ontology
    2. lexicon
  2. Examine relationship between atomic and multi-unit lexemes.
  3. Elaborate on behavior of indices as antecedents.

Quick Reference

(#022) Relations

eq(i1 : index, i2 : index) – indices i1 and i2 are identical in reference and/or resolution
isa(i : index, c : concept) – concept c is predicated of index i
q(i : index, q : quantifier) – index i is quantified by q
arg(e : event, i : index, n : 1..3) – index i participates in event e as argument n
sub(e1 : event, e2 : event) – event e1 includes event e2 as a subevent

(#023) Quantifiers

Quantifier – Examples
exist – a brown cow; some men
all – all good men; everyone
def – the students; the first watermelon-flavored cupcake; Bush
kind – I like broccoli.
wh – Which umbrella is yours?
most – Most students drink coffee.
several – Several large men entered the room.
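For implementation purposes, the relations and quantifiers above could be given a typed representation along the following lines.  This is a sketch only: Quantifier, Relation, ARITY, and relation are hypothetical names, and the check is limited to arity and quantifier names.  The third argument of arg is deliberately left unchecked, because the table gives it as 1..3 while the worked examples write ext.

```python
from enum import Enum
from typing import NamedTuple, Tuple


class Quantifier(Enum):
    EXIST = "exist"
    ALL = "all"
    DEF = "def"        # "def" is a Python keyword, hence the upper-case names
    KIND = "kind"
    WH = "wh"
    MOST = "most"
    SEVERAL = "several"


class Relation(NamedTuple):
    name: str
    args: Tuple[str, ...]

    def __str__(self) -> str:
        return f"{self.name}({','.join(self.args)})"


# Number of arguments each relation takes, per the table above.
ARITY = {"eq": 2, "isa": 2, "q": 2, "arg": 3, "sub": 2}


def relation(name: str, *args: str) -> Relation:
    """Build a relation, rejecting unknown names, wrong arity, or quantifiers."""
    if name not in ARITY:
        raise ValueError(f"unknown relation: {name}")
    if len(args) != ARITY[name]:
        raise ValueError(f"{name} takes {ARITY[name]} arguments")
    if name == "q":
        Quantifier(args[1])   # raises ValueError for an unlisted quantifier
    return Relation(name, tuple(args))


print(relation("q", "i0.0", "def"))
print(relation("arg", "e0.0", "i0.0", "ext"))
```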

Glossary

We use the following terminology.  All definitions are informal and imprecise.

ambiguity – possibility of alternative meanings, e.g. person is ambiguous between human and grammatical marking.

computational linguistics – the use of computers to model and process human language; a subdiscipline of linguistics and computer science.

concept – a unit of meaning in thought; a node in an ontology.

corpus – a body of text, typically written.

discourse – a series of related utterances exchanged between two or more interlocutors sharing relevantly similar contextual background.

explicature – that which is explicitly conveyed by the meaning of an utterance.

implicature – that which the speaker intends the listener to understand, other than what is explicitly encoded in the utterance, e.g. "It's cold in here" might imply "I want you to shut the window".

information retrieval (IR) – using computers to gather specific information of interest from a large corpus (such as the web); a subdiscipline of NLP.

interlocutor – a participant in a discourse, who may alternate acting as a speaker or a listener.

language – a spoken or signed means of symbolic communication.

lexeme – a mapping from a surface form to a meaning.

lexicon – a list of lexemes (cf. Lexicon).

linguistics – the study of language.

listener – one whom a speaker is addressing.

machine translation (MT) – using computers to translate text from one language to another; a subdiscipline of NLP.

morpheme – an atomic lexical unit of meaning (contrast word).

natural language processing (NLP) – synonym for computational linguistics.

ontology – a hierarchical representation of concepts, with arbitrary links between nodes (contrast taxonomy; cf. Ontology).

pragmatics – the study of implicit meaning in language, including implicature and (possibly) elements of presupposition; a subdiscipline of linguistics (contrast syntax, semantics).

presupposition – that which must be understood in order to make sense of an utterance, e.g. "The king of France is bald" presupposes that there is a king of France.

psycholinguistics – the study of language processing in the human brain; a subdiscipline of linguistics.

semantics – the study of explicit meaning in language, including explicature and (elements of) presupposition; a subdiscipline of linguistics (contrast syntax, pragmatics).

speaker – one who is generating an utterance (spoken or signed).

syntax – the study of the structure of language; a subdiscipline of linguistics (contrast semantics, pragmatics).

taxonomy – a hierarchical representation of concepts, where each node has links only to its parent and daughters (contrast ontology).

utterance – a unit of language production of unspecified size or structure.

vagueness – imprecise meaning, e.g. person is vague with regard to gender, age, and so on.

word – a conventionally recognized orthographic unit, typically delimited by spaces and/or punctuation (contrast morpheme).
