Abstract. In logic, the first representation of context as a formal object was by the philosopher C. S. Peirce; but for nearly eighty years, his treatment was unknown outside a small group of Peirce aficionados. In the 1980s, three different approaches led to related notions of context: Kamp's discourse representation theory; Barwise and Perry's situation semantics; and Sowa's conceptual graphs, which explicitly introduced Peirce's theories to the AI community. During the 1990s, John McCarthy and his students developed a closely related notion of context as a basis for organizing and partitioning knowledge bases. Each of the theories has distinctive but complementary ideas that can enrich the others, but the relationships among them are far from clear. This paper discusses several approaches to the semantics of contexts and related notions in logic and model theory: the possible worlds of Leibniz, Kripke, and Montague; the model sets of Hintikka; the situations of Barwise and Perry; and the contexts of Peirce and McCarthy. It concludes with a formal theory of contexts that can support all the above as special cases.
This paper is a revised merger of two publications by Sowa (1995) and Sowa (1997b), with some new material in Sections 6 and 7. For related background on this topic, see Chapter 5 of the book Knowledge Representation.
The notion of context is indispensable for any theory of meaning, but no consensus has been reached about the formal treatment of context. Some of the conflicting approaches result from an ambiguity in the informal senses of the word. Dictionaries list two major senses:
Peirce's friend and fellow pragmatist William James (1897) gave an example of the importance of purpose in determining what should be included in a context:
Can we realize for an instant what a cross-section of all existence at a definite point of time would be? While I talk and the flies buzz, a sea gull catches a fish at the mouth of the Amazon, a tree falls in the Adirondack wilderness, a man sneezes in Germany, a horse dies in Tartary, and twins are born in France. What does that mean? Does the contemporaneity of these events with one another, and with a million others as disjointed, form a rational bond between them, and unite them into anything that means for us a world? Yet just such a collateral contemporaneity, and nothing else, is the real order of the world. It is an order with which we have nothing to do but to get away from it as fast as possible. As I said, we break it: we break it into histories, and we break it into arts, and we break it into sciences; and then we begin to feel at home. We make ten thousand separate serial orders of it, and on any one of these we react as though the others did not exist.
The real world or any imaginary, hypothetical, or planned world is far too big and complex for any agent, human or robotic, to comprehend in all its richness. Smaller chunks are easier to think about than infinite worlds, but for a theory of semantics, there is an even more fundamental issue than size: the purpose that explains why an agent has selected one chunk rather than another. A cat, for example, will pay much more attention to the sound of a can opener than to a human conversation or a musical selection. For any agent, purpose determines what aspects of the world constitute a meaningful situation.
Theories of semantics based on possible worlds cannot derive purpose or intention from collections of worlds, no matter how big, or from collections of situations, no matter how small. In their theory of situations, Barwise and Perry (1983) tried to use finite situations rather than infinite worlds as a basis for deriving the semantics of propositional attitude verbs, such as hope, wish, want, or fear. But the situations they used as examples were not arbitrary, random chunks of the world. For every sample situation in their book, there was an unstated reason why some agent focussed attention on that situation rather than any other. The reason why an agent selects a situation as a topic of interest is a more fundamental clue to its meaning than the situation itself.
This paper discusses several models for languages that go beyond classical first-order logic: the possible worlds of Leibniz, Kripke, and Montague; the model sets of Hintikka; the situations of Barwise and Perry; and the contexts of Peirce and McCarthy. Of these, contexts are the most convenient and computable. But contexts alone do not introduce any more purpose than possible worlds, model sets, or situations. Purpose or intentionality, which can only be introduced by some agent (who need not be human), must be incorporated into the fundamental structures of a semantic theory. In his theories of logic and semiotics, C. S. Peirce addressed the relationship between a universe of discourse or state of affairs and the intention of some agent who selects it. His approach can be combined with techniques introduced by Michael Dunn (1973), which enable modal semantics based on possible worlds to be reinterpreted in terms of the laws that determine the modality. The next step from modality to intentionality requires a shift of focus from the laws to the agents who legislate them.
In 1883, Peirce invented the algebraic notation for predicate calculus. A dozen years later, he developed a graphical notation that more clearly distinguished contexts. Figure 1 shows his graph notation for delimiting the context of a proposition to be discussed. In explaining that graph, Peirce (1898) said "When we wish to assert something about a proposition without asserting the proposition itself, we will enclose it in a lightly drawn oval." The line attached to the oval links it to a relation that makes a metalevel assertion about the nested proposition.
Figure 1: One of Peirce's graphs for talking about a proposition
The oval supports the basic syntactic function of grouping related information in a package. But besides notation, Peirce also developed a theory of the semantics and pragmatics of contexts and the rules of inference for importing and exporting information into and out of the contexts. To support first-order logic, the only necessary metalevel relation is negation. By combining negation with the existential-conjunctive subset of logic, Peirce developed his existential graphs (EGs), which are based on three logical operators and an open-ended number of relations:
To illustrate the use of negative contexts for representing FOL, Figure 2 shows an existential graph and a conceptual graph for the sentence If a farmer owns a donkey, then he beats it. This sentence is one of a series of examples used by medieval logicians to illustrate issues in mapping language to logic. The EG on the left has two ovals with no attached lines; by default, they represent negations. It also has two lines of identity, represented as linked bars: one line, which connects farmer to the left side of owns and beats, represents an existentially quantified variable (∃x); the other line, which connects donkey to the right side of owns and beats, represents another variable (∃y).
Figure 2: EG and CG for "If a farmer owns a donkey, then he beats it."
When the EG of Figure 2 is translated to predicate calculus, farmer and donkey map to monadic predicates; owns and beats map to dyadic predicates. If a relation is attached to more than one line of identity, the lines are ordered from left to right by their point of attachment to the name of the relation. With the implicit conjunctions represented by the ∧ symbol, the result is an untyped formula:
~(∃x)(∃y)(farmer(x) ∧ donkey(y) ∧ owns(x,y) ∧ ~beats(x,y)).

In CGs, a context is defined as a concept whose referent field contains nested conceptual graphs. Since every context is also a concept, it can have a type label, coreference links, and attached conceptual relations. Syntactically, Peirce's ovals are squared off to form boxes, and the negation is explicitly marked by a ¬ symbol in front of the box. The primary difference between EGs and CGs is in the treatment of lines of identity. In EGs, the lines serve two different purposes: they represent existential quantifiers, and they show how the arguments are connected to the relations. In CGs, those two functions are split: the concepts [Farmer] and [Donkey] represent typed quantifiers (∃x:Farmer) and (∃y:Donkey), and arcs marked with numbers or arrows show the order of the arguments connected to the relations. In the inner context, the two concepts represented as [T] are connected by coreference links to concepts in the outer context. The CG maps to a typed formula that is equivalent to the untyped formula for the EG:
~(∃x:Farmer)(∃y:Donkey)(owns(x,y) ∧ ~beats(x,y)).

The arrow pointing toward the relation indicates the first arc, and the arrow pointing away indicates the last arc; if a relation has n>2 arcs, they are numbered from 1 to n. For more examples of CGs and their translation to predicate calculus and the Knowledge Interchange Format (KIF), see the tutorial.
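The equivalence between the typed and untyped formulas follows from a simple type-expansion rule: a typed existential quantifier expands with a conjunction, and a typed universal with an implication. The following sketch applies that rule to strings; the function name and the string representation of formulas are invented for illustration, and the result is logically equivalent to the untyped formula above, with the quantifier for y nested rather than moved to the front.

```python
def expand_typed(quantifier, var, type_name, body):
    """Expand (∃x:T)body to (∃x)(t(x) ∧ body), or (∀x:T)body to (∀x)(t(x) ⊃ body)."""
    pred = f"{type_name.lower()}({var})"
    if quantifier == "∃":
        return f"(∃{var})({pred} ∧ {body})"
    return f"(∀{var})({pred} ⊃ {body})"

# Expand the typed donkey-sentence formula from the inside out:
inner = expand_typed("∃", "y", "Donkey", "owns(x,y) ∧ ~beats(x,y)")
print("~" + expand_typed("∃", "x", "Farmer", inner))
# ~(∃x)(farmer(x) ∧ (∃y)(donkey(y) ∧ owns(x,y) ∧ ~beats(x,y)))
```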
A nest of two ovals, as in Figure 2, is what Peirce called a scroll. It represents implication, since ~(p∧~q) is equivalent to p⊃q. Using the ⊃ symbol, the two formulas may be rewritten:
(∀x)(∀y)((farmer(x) ∧ donkey(y) ∧ owns(x,y)) ⊃ beats(x,y)).

(∀x:Farmer)(∀y:Donkey)(owns(x,y) ⊃ beats(x,y)).

The algebraic formulas with the ⊃ symbol illustrate a peculiar feature of predicate calculus: in order to keep the variables x and y within the scope of the quantifiers, the existential quantifiers in the phrases a farmer and a donkey must be moved to the front of the formula and be translated to universal quantifiers. This puzzling feature of logic has posed a problem for linguists and logicians since the Middle Ages.
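The equivalence of the negated-existential and universal-implication forms can be checked mechanically. The sketch below enumerates every interpretation of the four predicates over a two-element domain and confirms that the two readings always agree; the encoding of predicates as Python dictionaries is just for illustration.

```python
from itertools import product

D = [0, 1]  # a two-element domain of individuals

def negated_existential(farmer, donkey, owns, beats):
    """~(∃x)(∃y)(farmer(x) ∧ donkey(y) ∧ owns(x,y) ∧ ~beats(x,y))"""
    return not any(farmer[x] and donkey[y] and owns[(x, y)] and not beats[(x, y)]
                   for x in D for y in D)

def universal_implication(farmer, donkey, owns, beats):
    """(∀x)(∀y)((farmer(x) ∧ donkey(y) ∧ owns(x,y)) ⊃ beats(x,y))"""
    return all((not (farmer[x] and donkey[y] and owns[(x, y)])) or beats[(x, y)]
               for x in D for y in D)

# Enumerate every interpretation of the monadic and dyadic predicates over D.
pairs = [(x, y) for x in D for y in D]
agree = True
for f_bits in product([False, True], repeat=2):
    for d_bits in product([False, True], repeat=2):
        farmer = dict(zip(D, f_bits))
        donkey = dict(zip(D, d_bits))
        for o_bits in product([False, True], repeat=4):
            for b_bits in product([False, True], repeat=4):
                owns = dict(zip(pairs, o_bits))
                beats = dict(zip(pairs, b_bits))
                if (negated_existential(farmer, donkey, owns, beats)
                        != universal_implication(farmer, donkey, owns, beats)):
                    agree = False
print(agree)  # True: the two formulas agree on all 4096 interpretations
```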
Besides attaching a relation to an oval, Peirce also used colors or tinctures to distinguish contexts other than negation. Figure 3 shows one of his examples with red to indicate possibility. The graph contains four ovals: the outer two form a scroll for if-then; the inner two represent possibility (red) and impossibility (red inside a negation). The outer oval may be read If there exist a person, a horse, and water; the next oval may be read then it is possible for the person to lead the horse to the water and not possible for the person to make the horse drink the water.
Figure 3: EG for "You can lead a horse to water, but you can't make him drink."
The notation "—leads—to—" represents a triad or triadic relation leadsTo(x,y,z), and "—makes—drink—" represents makesDrink(x,y,z). In the algebraic notation with the symbol ◇ for possibility, Figure 3 maps to the following formula:
~(∃x)(∃y)(∃z)(person(x) ∧ horse(y) ∧ water(z) ∧ ~(◇leadsTo(x,y,z) ∧ ~◇makesDrink(x,y,z))).

With the symbol ⊃ for implication, this formula becomes
(∀x)(∀y)(∀z)((person(x) ∧ horse(y) ∧ water(z)) ⊃ (◇leadsTo(x,y,z) ∧ ~◇makesDrink(x,y,z))).

This version may be read For all x, y, and z, if x is a person, y is a horse, and z is water, then it is possible for x to lead y to z, and not possible for x to make y drink z. These readings, although logically explicit, are not as succinct as the proverb You can lead a horse to water, but you can't make him drink.
Discourse representation theory. The logician Hans Kamp once spent a summer translating English sentences from a scientific article to predicate calculus. During the course of his work, he was troubled by the same kinds of irregularities that puzzled the medieval logicians. In order to simplify the mapping from language to logic, Kamp (1981a,b) developed discourse representation structures (DRSs) with an explicit notation for contexts. In terms of those structures, Kamp defined the rules of discourse representation theory for mapping quantifiers, determiners, and pronouns from language to logic (Kamp & Reyle 1993).
Although Kamp had not been aware of Peirce's existential graphs, his DRSs are structurally equivalent to Peirce's EGs. The diagram on the left of Figure 4 is a DRS for the donkey sentence, If there exist a farmer x and a donkey y and x owns y, then x beats y. The two boxes connected by an arrow represent an implication where the antecedent includes the consequent within its scope.
Figure 4: EG and DRS for "If a farmer owns a donkey, then he beats it."
The DRS and EG notations look quite different, but they are exactly isomorphic: they have the same primitives, the same scoping rules for variables or lines of identity, and the same translation to predicate calculus. Therefore, the EG and DRS notations map to the same formula:
~(∃x)(∃y)(farmer(x) ∧ donkey(y) ∧ owns(x,y) ∧ ~beats(x,y)).

Peirce's motivation for the EG contexts was to simplify the logical structure and rules of inference. Kamp's motivation for the DRS contexts was to simplify the mapping from language to logic. Remarkably, they converged on isomorphic representations. Therefore, Peirce's rules of inference and Kamp's discourse rules apply equally well to contexts in the EG, CG, or DRS notations. For notations with a different structure, such as predicate calculus, those rules cannot be applied without major modifications.
Resolving indexicals. Besides inventing a logical notation for contexts, Peirce coined the term indexical for context-dependent references, such as pronouns and words like here, there, and now. In CGs, the symbol # represents the general indexical, which is usually expressed by the definite article the. More specific indexicals are marked by a qualifier after the # symbol, as in #here, #now, #he, #she, or #it. Figure 5 shows two conceptual graphs for the sentence If a farmer owns a donkey, then he beats it. The CG on the left represents the original pronouns with indexicals, and the one on the right replaces the indexicals with the coreference labels ?x and ?y.
Figure 5: Two conceptual graphs for "If a farmer owns a donkey, then he beats it."
In the concept [Animate: #he], the label Animate indicates the semantic type, and the indexical #he indicates that the referent must be found by a search for some type of Animate entity for which the masculine gender is applicable. In the concept [Entity: #it], the label Entity is synonymous with T, which may represent anything, and the indexical #it indicates that the referent has neuter gender. The search for referents starts in the inner context and proceeds outward to find concepts of an appropriate type and gender. The CG on the right of Figure 5 shows the result of resolving the indexicals: the concept for he has been replaced by [?x] to show a coreference to the farmer, and the concept for it has been replaced by [?y] to show a coreference to the donkey.
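The outward search for referents described above can be sketched in a few lines. The representation of concepts as dictionaries and the toy type hierarchy below are invented for illustration; they are not the CG notation itself.

```python
# A minimal sketch (not Sowa's implementation) of resolving an indexical by
# searching from the innermost context outward for a concept of a suitable
# type and gender.  Types and genders here are illustrative assumptions.

SUPERTYPES = {
    "Farmer": {"Farmer", "Person", "Animate", "Entity"},
    "Donkey": {"Donkey", "Animal", "Animate", "Entity"},
}

def resolve_indexical(indexical_type, gender, contexts):
    """Search contexts from innermost to outermost for a matching concept."""
    for context in reversed(contexts):          # inner context first
        for concept in context:
            if (indexical_type in SUPERTYPES[concept["type"]]
                    and (gender is None or concept["gender"] == gender)):
                return concept
    return None                                 # no referent found

# Outer context of Figure 5 contains [Farmer] and [Donkey]; the inner
# context of the consequent contains the indexicals to be resolved.
outer = [{"type": "Farmer", "label": "?x", "gender": "masc"},
         {"type": "Donkey", "label": "?y", "gender": "neut"}]
inner = []

print(resolve_indexical("Animate", "masc", [outer, inner])["label"])  # ?x
print(resolve_indexical("Entity", "neut", [outer, inner])["label"])   # ?y
```

When no concept matches, the function returns None, which corresponds to the case where a conversational implicature is needed to supply an implicit referent.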
Predicate calculus does not have a notation for indexicals, and its syntax does not show the context structure explicitly. Therefore, the CG on the left of Figure 5 cannot be translated directly to predicate calculus. After the indexicals have been resolved, the CG on the right can be translated to the following formula:
(∀x:Farmer)(∀y:Donkey)(∀z:Own)((expr(z,x) ∧ thme(z,y)) ⊃ (∃w:Beat)(agnt(w,x) ∧ ptnt(w,y))).

Note that this formula and the graph it was derived from are more complex than the CG in Figure 2. In order to compare the EG and CG directly, Figure 2 represented the verbs by relations Owns and Beats, which do not explicitly show the linguistic roles. In Figure 5, the concept Own represents a state with an experiencer (Expr) and a theme (Thme). The concept Beat, however, represents an action with an agent (Agnt) and a patient (Ptnt). In general, the patient of an action is more deeply affected or transformed than a theme. For further discussion of these linguistic relations, see the web page on thematic roles.
In analyzing the donkey sentences, the scholastics defined transformations or conversion rules from one logical form to another. As an example, a sentence with the word every can be converted to an equivalent sentence with an implication. The sentence Every farmer who owns a donkey beats it is equivalent to the one represented in Figures 2, 4, and 5. In CGs, the word every maps to a universal quantifier in the referent of some concept:
[[Farmer: λ]←(Expr)←[Own]→(Thme)→[Donkey]: ∀]←(Agnt)←[Beat]→(Ptnt)→[Entity: #it].

In this graph, the quantifier ∀ does not range over the type Farmer, but over the subtype defined by the nested lambda expression: just those farmers who own a donkey. The quantifier ∀ is an example of a defined quantifier, which is not one of the primitives in the basic CG notation. It is defined by a rule, which generates the following CG in the if-then form:
[If: [Farmer: *x]←(Expr)←[Own]→(Thme)→[Donkey]
 [Then: [?x]←(Agnt)←[Beat]→(Ptnt)→[Entity: #it]]].

This graph, which may be read If a farmer x owns a donkey, then x beats it, is halfway between the two graphs in Figure 5. The indexical that relates the nested agent of beating to the farmer has already been resolved to the coreference pair *x-?x by the macro expansion. The second indexical for the pronoun it remains to be resolved to the donkey. This example shows how two sentences that have different surface structures may be mapped to different semantic forms, which are then related by a separate inference step.
The expansion of a universal quantifier to an implication has been known since medieval times. But the complete catalog of all the rules for resolving indexicals is still an active area of research in linguistics and logic. For the sentence You can lead a horse to water, but you can't make him drink, many more conversions must be performed to generate the equivalent of Peirce's EG in Figure 3. The first step would be the generation of a logical form with indexicals, such as the CG in Figure 6, which may be read literally It is possible (Psbl) for you to lead a horse to water, but it is not possible (¬Psbl) for you to cause him to drink the liquid. The relation ¬Psbl is defined by a lambda expression in terms of ¬ and Psbl:
¬Psbl ≡ ¬[Proposition: (Psbl)→[Possible: λ]].
Figure 6: CG for "You can lead a horse to water, but you can't make him drink."
A parser and semantic interpreter that did a purely local or context-free analysis of the English sentence could generate the four concepts marked as indexicals by # symbols in Figure 6:
Conversational implicatures. Sometimes no suitable referent for an indexical can be found. In such a case, the person who hears or reads the sentence must make some further assumptions about implicit referents. The philosopher Paul Grice (1975) observed that such assumptions, called conversational implicatures, are often necessary to make sense out of the sentences in ordinary language. They are justified by the charitable assumption that the speaker or writer was trying to make a meaningful statement, but for the sake of brevity, happened to leave some background information unspoken. To resolve the indexicals in Figure 6, the listener would have to make the following kinds of assumptions to fill in the missing information:
[If: [Person: *x]---[Person: #you]
 [Then: ...]].

The entire graph in Figure 6 would be inserted in place of the three dots in the then part; then every occurrence of #you could be replaced by ?x. The resulting graph could be read If there exists a person x, then x can lead a horse to water, but x can't make him drink the liquid.
In Figure 6, neither of these conditions holds. To make the second condition true, the antecedent [Horse] can be exported or lifted to some containing context, such as the context of the hypothetical reader x. This assumption has the effect of treating the horse as just as hypothetical as the person x. After a coreference label is assigned to the concept [Horse: *y], the indexical #he could be replaced by ?y.
[If: [Person: *x] [Horse: *y] [Water: *z]
 [Then: [Proposition: (Psbl)→[Proposition:
            [Person: ?x]←(Agnt)←[Lead]-
               (Thme)→[Horse: ?y]
               (Dest)→[Water: ?z] ]]→(But)→
        [Proposition: (¬Psbl)→[Proposition:
            [Person: ?x]←(Agnt)←[Cause]-
               (Rslt)→[Situation: [Animate: ?y]←(Agnt)←[Drink]→(Ptnt)→[Liquid: ?z]] ]]]].

This CG may be read If there exist a person x, a horse y, and water z, then the person x can lead the horse y to water z, but the person x can't make the animate being y drink the liquid z. This graph is more detailed than the EG in Figure 3, because it explicitly shows the conjunction but and the linguistic roles Agnt, Thme, Ptnt, Dest, and Rslt. Before the indexicals are resolved, the type labels are needed to match the indexicals to their antecedents. Afterward, the bound concepts [Person: ?x], [Horse: ?y], [Animate: ?y], [Water: ?z], and [Liquid: ?z] could be simplified to just [?x], [?y], or [?z].
As this example illustrates, indexicals frequently occur in the intermediate stages of translating language to logic, but their correct resolution may require nontrivial assumptions. Many programs in AI and computational linguistics are able to follow the rules of discourse representation theory to resolve indexicals. The problem of making the correct assumptions about conversational implicatures is more difficult. The kinds of assumptions needed to understand ordinary conversation are similar to the assumptions that are made in nonmonotonic reasoning. Both of them depend partly on context-independent rules of logic and partly on context-dependent background knowledge.
Leibniz introduced possible worlds as the foundation for modal semantics: a proposition p is necessarily true in the real world if it is true in every possible world, and p is possible in the real world if there is some accessible world in which it happens to be true. In his algebraic notation for predicate calculus, Peirce followed Leibniz by representing necessity with a universal quantifier Π_w, in which the variable w ranges over all "states of affairs." In his graphic notation for logic, Peirce used a pad of paper instead of a single "sheet of assertion." Graphs that are necessarily true are copied on every sheet; those that are possibly true are drawn on some, but not all sheets. The top sheet contains assertions about the actual state of affairs, and the other sheets, which may be potentially infinite, describe related states of affairs that are possible relative to the actual state.
Axioms for modal logic. The philosopher Clarence Irving Lewis (1918), who was strongly influenced by Peirce, introduced the diamond symbol ◇ for representing possibility in the algebraic notation. If p is any proposition, then ◇p means p is possibly true. For necessity, the box symbol □ is used: □p means p is necessarily true. Either symbol, ◇ or □, can be taken as a primitive, and the other can be defined in terms of it:
□p ≡ ~◇~p.
◇p ≡ ~□~p.

The basic modal system, called System T, includes the axiom □p ⊃ p and the distribution axiom:

□(p⊃q) ⊃ (□p⊃□q).
System T does not include axioms for iterated modalities, such as ◇□◇p, which says that p is possibly necessarily possible. Such mind-boggling combinations seldom occur in English, but they may arise in the intermediate stages of a proof. To relate iterated modalities to simple modalities and to one another, Lewis and Langford (1932) defined two additional axioms, called S4 and S5, which may be added to System T:
S4. □p ⊃ □□p.
S5. ◇p ⊃ □◇p.

For quantified modal logic, the Barcan formula (BF) relates the universal quantifier to necessity:

BF. (∀x)□P(x) ⊃ □(∀x)P(x).
System T combined with axioms S4, S5, and BF is one of the strongest versions of modal logic, but it is often too strong. In a version of modal logic called deontic logic, □p is interpreted p is obligatory, and ◇p is interpreted p is permissible. In a perfect world, all the axioms of System T would be true. But since people are sinners, some axioms and theorems of System T cannot be assumed. The axiom □p⊃p would be violated by a sin of omission because some obligatory actions are not performed. The theorem p⊃◇p would be violated by a sin of commission because some actions that are performed are not permitted.
Kripke's worlds. Peirce's notation with a universal quantifier Π_w for necessity and an existential quantifier Σ_w for possibility cannot be used in statements that contain one modal operator within the scope of another. The iterated modalities □□ in Axiom S4 and □◇ in Axiom S5 would be represented by a sequence of two quantifiers for the same variable w, which would make one of the quantifiers redundant or irrelevant. To interpret such iterated modal operators, Saul Kripke (1963a,b) discovered an ingenious technique based on model structures having three components:
a set K of possible worlds; an accessibility relation R between worlds; and an evaluation function F, which assigns a truth value F(p,w) to each proposition p in each world w. For any world w in K,

w ⊨ p ≡ F(p,w)=T.
w ⊨ ~p ≡ F(p,w)=F.

In a world u, the modal operators are evaluated by quantifying over the worlds accessible from u:
◇p ≡ (∃v)(R(u,v) ∧ F(p,v)=T).
□p ≡ (∀v)(R(u,v) ⊃ F(p,v)=T).
With quantifiers restricted by the accessibility relation, these definitions may be rewritten

◇p ≡ (∃v:R(u,v)) F(p,v)=T.
□p ≡ (∀v:R(u,v)) F(p,v)=T.

The accessibility relation R(u,v) introduces the extra variables needed to distinguish different ranges for different quantifiers. Therefore, if p is necessarily necessary, the iterated modality □□p can be defined:
□□p ≡ (∀v:R(u,v))(∀w:R(v,w)) F(p,w)=T.

Now the two quantifiers are distinct because one ranges over v and the other ranges over w. Kripke's most important contribution was to show how Lewis's axioms determine constraints on the accessibility relation R:
reflexive(R) ≡ (∀w)R(w,w).
transitive(R) ≡ (∀u,v,w)((R(u,v) ∧ R(v,w)) ⊃ R(u,w)).
symmetric(R) ≡ (∀u,v)(R(u,v) ⊃ R(v,u)).

System T requires R to be reflexive; Axiom S4 requires R to be transitive; and Axiom S5 requires R to be symmetric. With all three axioms, R is an equivalence relation, which partitions the possible worlds into equivalence classes.
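These definitions are directly computable once K, R, and F are given. The sketch below builds a small, invented model structure whose accessibility relation is reflexive and transitive, and checks that the System T axiom □p ⊃ p and the S4 pattern □p ⊃ □□p hold at every world; the worlds and valuation are made up for illustration.

```python
# A toy Kripke model structure (K, R, F): worlds, accessibility, valuation.
K = {"u", "v", "w"}
R = {("u", "u"), ("v", "v"), ("w", "w"),        # reflexive, as System T requires
     ("u", "v"), ("v", "w"), ("u", "w")}        # transitive
F = {("p", "u"): True, ("p", "v"): True, ("p", "w"): False}

def possibly(p, u):
    """◇p at u: p holds in some world accessible from u."""
    return any(F[(p, v)] for v in K if (u, v) in R)

def necessarily(p, u):
    """□p at u: p holds in every world accessible from u."""
    return all(F[(p, v)] for v in K if (u, v) in R)

def nec_nec(p, u):
    """□□p at u, with the two distinct quantified variables v and w."""
    return all(all(F[(p, w2)] for w2 in K if (v, w2) in R)
               for v in K if (u, v) in R)

# Reflexivity of R validates □p ⊃ p at every world:
print(all((not necessarily("p", u)) or F[("p", u)] for u in K))      # True
# Transitivity validates □p ⊃ □□p at every world:
print(all((not necessarily("p", u)) or nec_nec("p", u) for u in K))  # True
```

Dropping the reflexive pairs from R would give a frame suitable for deontic logic, in which □p ⊃ p (the obligatory is actual) can fail.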
The world of Sherlock Holmes stories, for example, is similar enough to the real world w_{0} that it could be in the same equivalence class. The proposition that Sherlock Holmes assisted Scotland Yard would be possible in w_{0} if there were some accessible world w in which it is true:
(∃w:R(w_{0},w)) F("Sherlock Holmes assisted Scotland Yard",w)=T.

A world with cartoon characters like talking mice and ducks, however, is too remote to be accessible from the real world. Therefore, it is not possible for ducks to talk in w_{0}. Business contracts further partition the cartoon worlds into disjoint classes: the world of Disney characters is not accessible from the world of Looney Tunes characters. Therefore, Donald Duck can talk to Mickey Mouse, but he can't talk to Bugs Bunny or Daffy Duck.
Criticisms of possible worlds. Possible worlds are a metaphor for interpreting modality, but their ontological status is dubious. Truth is supposed to be a relationship between a statement and the real world, not an infinite family of fictitious worlds. In Candide, Voltaire satirized Leibniz's notion of possible worlds. In that same tradition, Quine (1948) ridiculed the imaginary inhabitants of possible worlds:
Take, for instance, the possible fat man in that doorway; and, again, the possible bald man in that doorway. Are they the same possible man, or two possible men? How do we decide? How many possible men are there in that doorway? Are there more possible thin ones than fat ones? How many of them are alike? Or would their being alike make them one?

After Kripke developed his model structures for possible worlds, Quine (1972) noted that models prove that the axioms are consistent, but they don't explain what the modalities mean:
The notion of possible world did indeed contribute to the semantics of modal logic, and it behooves us to recognize the nature of its contribution: it led to Kripke's precocious and significant theory of models of modal logic. Models afford consistency proofs; also they have heuristic value; but they do not constitute explication. Models, however clear they be in themselves, may leave us at a loss for the primary, intended interpretation.
Quine was never sympathetic to modal logic and the semantics of possible worlds, but even people who are actively doing research on the subject have difficulty in making a convincing case for the notion of accessibility between worlds. Following is an attempted explanation by two authors of a widely used textbook (Hughes and Cresswell 1968):
This notion of one possible world's being accessible to another has at first sight a certain air of fantasy or science fiction about it, but we might attach quite a sober sense to it in the following way. We can conceive of various worlds which would differ in certain ways from the actual one (a world without telephones, for example). But our ability to do this is at least partly governed by the kind of world we actually live in: the constitution of the human mind and the human body, the languages which exist or do not exist, and many other things, set certain limits to our powers of conceiving. We could then say that a world, w_{2}, is accessible to a world, w_{1}, if w_{2} is conceivable by someone living in w_{1}, and this will make accessibility a relation between worlds... (p. 77)

Hughes and Cresswell explained the dyadic accessibility relation in terms of a triadic relation of "conceivability" by someone living in one imaginary world who tries to imagine another one. Their explanation suggests that the person who determines which worlds are conceivable is at least as significant to the semantics of modality as the worlds themselves. To capture the "primary, intended interpretation," the formalism must show how the accessibility relation can be derived from some agent's conceptions.
By relating the modal axioms to model structures, Kripke showed the interrelationships between the axioms and the possible worlds. But the meaning of those axioms remains hidden in the accessibility relation R and the evaluation function F. The functional notation F(p,w)=T gives the impression that F computes a truth value. But this impression is an illusion: the set of worlds K is an undefined set given a priori; the relation R and function F are merely assumed, not computed. Nonconstructive assumptions cannot be used to compute anything, nor can they explain how Quine's possible fat men and thin men might be "accessible" from the real world with an empty doorway.
Hintikka's model sets. Instead of assuming possible worlds, Jaakko Hintikka (1961, 1963) independently developed an equivalent semantics for modal logic based on collections of propositions, which he called model sets. He also assumed an alternativity relation between model sets, which serves the same purpose as Kripke's accessibility relation between worlds. As collections of propositions, Hintikka's model sets describe Kripke's possible worlds:
(∀w:World)(∃M:ModelSet) M = { p | w ⊨ p }.

This formula defines a mapping from Kripke's worlds to Hintikka's model sets: for any possible world w, there exists a model set M, which consists of all propositions p that are semantically entailed by w. In effect, the model set M describes everything that can be known about w.
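Computationally, the mapping is straightforward: given an evaluation function F, the model set of a world is just the set of propositions true in it. The worlds, propositions, and valuation below are invented for illustration.

```python
# A sketch of Hintikka's mapping from a world w to its model set
# M = { p | w ⊨ p }, using a toy valuation over two propositions.

F = {("p", "u"): True, ("q", "u"): False,
     ("p", "v"): True, ("q", "v"): True}
propositions = {"p", "q"}

def model_set(w):
    """The set of propositions semantically entailed by world w."""
    return frozenset(p for p in propositions if F[(p, w)])

print(sorted(model_set("u")))   # ['p']
print(sorted(model_set("v")))   # ['p', 'q']
```

An alternativity relation between these frozensets would carry exactly the same information as an accessibility relation between the worlds u and v, which is why the mapping by itself does not answer Quine's criticisms.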
By replacing the imaginary worlds with sets of propositions, Hintikka took an important step toward making them more formal. The mapping from possible worlds to model sets enables any theory about real or imaginary worlds to be restated in terms of the propositions that describe those worlds. But that mapping, by itself, does not address Quine's criticisms. Hintikka's alternativity relation between model sets is just as mysterious and undefined as Kripke's accessibility relation between worlds. Sets of formulas with an undefinable relation between them do not explain why one set is considered "accessible" from another.
Barwise and Perry's situations. To avoid infinite worlds with all the complexity of William James's example, Barwise and Perry (1983) proposed situation semantics as a theory that relates the meaning of sentences to smaller, more manageable chunks called situations. Each situation is a configuration of some aspect of a world in a bounded region of space and time. It may be a static configuration that remains unchanged for some period of time, or it may be a process that is causing changes. It may include people and things with their actions and speech; it may be real or imaginary; and its time may be past, present, or future.
In their book, Barwise and Perry identified a situation with a bounded region of space-time. But as William James observed, an arbitrary region of space and time contains "disjointed events" with no "rational bond between them." A meaningful situation is far from arbitrary, as the following examples illustrate:
In discussing the development of situation theory, Keith Devlin (1991a) observed that the definitions were stretched to the point where situations "include, but are not equal to any of simply connected regions of space-time, highly disconnected space-time regions, contexts of utterance (whatever that turns out to mean in precise terms), collections of background conditions for a constraint, and so on." After further discussion, Devlin admitted that they cannot be defined: "Situations are just that: situations. They are abstract objects introduced so that we can handle issues of context, background, and so on."
McCarthy's contexts. John McCarthy is one of the founding fathers of AI, whose collected work (McCarthy 1990) has frequently inspired and sometimes revolutionized the application of logic to knowledge representation. In his "Notes on Formalizing Context," McCarthy (1993) introduced the predicate ist(C,p), which may be read "the proposition p is true in context C." For clarity, it will be spelled out in the form isTrueIn(p,C). As illustrations, McCarthy gave the following examples:
One of McCarthy's reasons for developing a theory of context was his uneasiness with the proliferation of new logics for every kind of modal, temporal, epistemic, and nonmonotonic reasoning. The ever-growing number of modes presented in AI journals and conferences is a throwback to the scholastic logicians who went beyond Aristotle's two modes necessary and possible to permissible, obligatory, doubtful, clear, generally known, heretical, said by the ancients, or written in Holy Scriptures. The medieval logicians spent so much time talking about modes that they were nicknamed the modistae. The modern logicians have axiomatized their modes and developed semantic models to support them, but each theory includes only one or two of the many modes. McCarthy (1977) observed,
For AI purposes, we would need all the above modal operators in the same system. This would make the semantic discussion of the resulting modal logic extremely complex.
Instead of an open-ended number of modes, McCarthy hoped to develop a simple, but universal mechanism that would replace modal logic with first-order logic supplemented with metalanguage about contexts. His student R. V. Guha (1991) implemented contexts in the Cyc system and showed that a first-order object language supplemented with a first-order metalanguage could support versions of modal, temporal, default, and higher-order reasoning. Stuart Shapiro and his colleagues have implemented versions of propositional semantic networks, which support similar structures in a form that maps more directly to logic (Shapiro 1979; Maida & Shapiro 1982; Shapiro & Rappaport 1992). Shapiro's propositional nodes serve the same purpose as Peirce's ovals and McCarthy's contexts.
McCarthy, Shapiro, and their colleagues have shown that contexts are valuable for building knowledge bases, but they have not clearly distinguished the syntax of contexts from the semantics of some subject matter. McCarthy's predicate isTrueIn mixes the syntactic notion of containment (is-in) with the semantic notion of truth (is-true-of). One way to resolve the semantic status of contexts is to derive them from Barwise and Perry's situations:
(∀s:Situation)(∃C:Context){p | isTrueIn(p,C)} = {q | s|=q}.
This formula maps situations to contexts: for every situation s, there exists a context C, whose set of true propositions is the same as the set of propositions entailed by s. Devlin (1991b) coined the term infon for a proposition that is entailed by a situation. With that terminology, the formula could be summarized by the following English statement:
Computability, although desirable, is not sufficient to explain meaning. Truth is more significant than computability, but truth conditions, by themselves, cannot determine relevance. As William James observed, infinitely many true statements could be made about the real world or any model of it, and the overwhelming majority of them are irrelevant to what anyone might want to say or do. The verbs that express the kind of relevance, such as wanting, fearing, or hoping, are called propositional attitudes. But their semantics depends critically on the agent whose intention or attitude toward some situation for some purpose determines the relevance of propositions about it.
Figure 7 shows how different kinds of contexts may be distinguished by their relationship to what is actual and to the intentions of some agent. An actual context represents something that is true. A modal context represents something that is related to what is actual by some modality, such as possibility or necessity. An intentional context is related to what is actual by some agent who determines what is intended.
Figure 7: Three kinds of contexts
In the upper left of Figure 7, the context labeled actual contains a graph that represents something that is true of some aspect of the world. That graph might be an EG or CG that states a proposition, or it might be a Tarski-style model, in which the nodes and arcs represent individuals and relations in the world. For an actual context, Tarski's model theory or Peirce's logically equivalent method of endoporeutic can determine the truth in terms of an actual state of affairs without considering any other possibilities or anyone's intentions about them.
In the upper right, the modal context represents a possibility relative to what is actual. To define the semantics of modality, Kripke extended Tarski's single model to an open-ended, possibly infinite family of models related by a dyadic accessibility relation R(w_{0},w_{1}), which says that the world (or model) w_{1} is accessible by some modification of the actual world w_{0}. Kripke's theory, however, treats the relation R as an undefined primitive: it does not specify the conditions for an accessible modification of w_{0} to form w_{1}.
The diagram at the bottom shows an agent whose intention relates an actual context to an intended context. The simplest way to extend Kripke's theory to handle intentionality is to add an extra argument in the accessibility relation to name the agent. Philip Cohen and Hector Levesque (1990) used that approach to define two new kinds of accessibility relations:
Dunn's laws and facts. If the accessibility relation is assumed as a primitive, modality and intentionality cannot be explained in terms of anything more fundamental. To make accessibility a derived relation, Michael Dunn (1973) replaced Kripke's undefined worlds with a more detailed construction in terms of laws and facts. For every Kripke world w, Dunn assumed a pair <M,L>, where M is a Hintikka-style model set called the facts of w and L is a subset of M called the laws of w. Finally, Dunn showed how the accessibility relation from one world to another can be derived from constraints on which propositions are chosen as laws. As a result, the accessibility relation is no longer primitive, and the modal semantics does not depend on imaginary worlds. Instead, modality depends on the choice of laws, which could be laws of nature or merely human rules and regulations.
Philosophers since Aristotle have recognized that modality is related to laws; Dunn's innovation lay in making the relationships explicit. Let <M_{1},L_{1}> be a pair of facts and laws that describe a possible world w_{1}, and let the pair <M_{2},L_{2}> describe a world w_{2}. Dunn defined accessibility from the world w_{1} to the world w_{2} to mean that the laws L_{1} are a subset of the facts in M_{2}:
R(w_{1},w_{2}) ≡ L_{1} ⊂ M_{2}.
According to this definition, the laws of the first world w_{1} remain true in the second world w_{2}, but they may be demoted from the status of laws to just ordinary facts. Dunn then restated the definitions of possibility and necessity in terms of laws and facts. In Kripke's version, possibility ◊p means that p is true of some world w accessible from the real world w_{0}:
◊p ≡ (∃w:World)(R(w_{0},w) ∧ w|=p).
By substituting the laws and facts for the possible worlds, Dunn derived an equivalent definition:
◊p ≡ (∃M:ModelSet)(laws(M) ⊂ M_{0} ∧ p ∈ M).
Now possibility ◊p means that there exists a model set M whose laws are a subset of the facts of the real world M_{0} and p is a fact in M. By the same substitutions, the definition of necessity becomes
□p ≡ (∀M:ModelSet)(laws(M) ⊂ M_{0} ⊃ p ∈ M).
Necessity □p means that in every model set M whose laws are a subset of the facts of the real world M_{0}, p is also a fact in M.
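Dunn's definitions lend themselves to a direct computational reading. The following is a minimal sketch in Python, not code from the paper: each world is represented as a pair (facts, laws) of sets of propositions, and the function names are assumptions for the example:

```python
# Minimal sketch of Dunn's semantics: each world is a pair (facts, laws),
# where the laws are a subset of the facts.  Names are illustrative.

def accessible(w1, w2):
    """R(w1, w2): the laws of w1 are among the facts of w2."""
    (facts1, laws1), (facts2, laws2) = w1, w2
    return laws1 <= facts2          # set inclusion

def possible(p, worlds, w0):
    """Possibly p: some world whose laws hold among the facts of w0 has p as a fact."""
    facts0, _ = w0
    return any(laws <= facts0 and p in facts for facts, laws in worlds)

def necessary(p, worlds, w0):
    """Necessarily p: every world whose laws hold among the facts of w0 has p as a fact."""
    facts0, _ = w0
    return all(p in facts for facts, laws in worlds if laws <= facts0)
```

For example, with w0 = ({'q', 'p'}, {'q'}) and w1 = ({'q', 'r'}, {'q'}), the world w1 is accessible from w0, the proposition 'r' is possible, and 'q' is necessary.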
Dunn performed the same substitutions in Kripke's constraints on the accessibility relation. The result is a restatement of the constraints in terms of the laws and facts:
Dunn's theory is a compatible refinement of Kripke's theory, since any Kripke model structure (K,R,F) can be converted to one of Dunn's model structures in two steps:
Databases and knowledge bases. For computational purposes, Kripke's possible worlds must be converted to symbolic representations that can be stored and manipulated in a database or knowledge-based system. The mapping from Kripke's models to Dunn's models is essential for replacing the physical world or some imaginary world with a computable symbolic representation. Following are the correspondences between Dunn's semantics and the common terminology of databases and knowledge bases:
As an example, a law or DB constraint might state that every person has two parents, one male and one female; and each person's age must be less than the age of either parent. Even if the facts stored in the database are incomplete, there must be room for adding the names and birthdates of the parents when they become known.
The ground-level facts, for example, might state the names and birthdates for everyone known to the system. The laws could be used to verify that nobody has more than two parents and to deduce family relationships such as siblings, grandparents, and cousins. During normal operations, the laws would not change, but the ground-level facts could be updated to record births, deaths, and marriages.
If some person already has two parents named in the database, no update that named a third parent for that person would be permitted.
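As a sketch of how such a law could be enforced as an integrity constraint, the following Python fragment rejects any update that would give a person a third parent or two parents of the same sex. The schema and function names are hypothetical, chosen only for this illustration:

```python
# Hypothetical integrity check for the two-parent law.
parents = {}   # child -> list of (parent_name, sex) pairs

def add_parent(child, parent, sex):
    """Permit the update only if it preserves the two-parent law."""
    current = parents.setdefault(child, [])
    if len(current) >= 2:
        return False               # a third parent is never permitted
    if any(s == sex for _, s in current):
        return False               # the two parents must be one male, one female
    current.append((parent, sex))
    return True
```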
The various axioms for modality correspond to different options for permissible changes to a database or knowledge base. In effect, they specify policies for how a knowledge engineer or database administrator could modify the laws to accommodate changes in the known scientific principles or changes in the business or social structures. Following are the policies that correspond to the modal systems T, S4, and S5:
Mapping Possible Worlds to Contexts. The primary difference between model sets and contexts is size: Hintikka defined model sets as maximally consistent sets of propositions that could describe everything in the real world or any possible world. But as William James observed, a collection of information about the entire world is far too large and disjointed to be comprehended and manipulated in any meaningful way. A context is an excerpt from a model set in the same sense that a situation is an excerpt from a possible world. It could contain a finite set of propositions that describe some situation, even though the deductive closure of that set might be infinite.
Figure 8: Ways of mapping worlds to contexts
Figure 8 shows mappings from a Kripke possible world w to a description of w as a Hintikka model set M or a finite excerpt from w as a Barwise and Perry situation s. Then M and s may be mapped to a Peirce-McCarthy context C. This is an example of a commutative diagram, which shows a family of mappings that lead to the same result by multiple routes. Those routes result from the different ways of combining two kinds of mappings:
Completing the pushout. In the commutative diagram of Figure 8, the downward arrow on the left corresponds to Dunn's mapping of possible worlds to laws and facts, and the rightward arrow at the top corresponds to Barwise and Perry's mapping of possible worlds to situations. The branch of mathematics called category theory has methods of completing such diagrams by deriving the other mappings. Given the two arrows at the left and the top, the technique called a pushout defines the two arrows on the bottom and the right:
Situations as pullbacks. The inverse of a pushout, called a pullback, is an operation of category theory that "pulls" some structure or family of structures backward along an arrow of a commutative diagram. For the diagram in Figure 8, the model set M and the context C are symbolic structures that have been studied in logic for many years. The situation s, as Devlin observed, is not as clearly defined. One way to define a situation is to assume the notion of context as more basic and to say that a situation s is whatever is described by a context C. In terms of the diagram of Figure 8, the pullback would start with the two mappings from w to M and from M to C. Then the situation s in the upper right and the two arrows w→s and s→C would be derived by a pullback from the starting arrows w→M and M→C.
The definition of situations in terms of contexts may be congenial to logicians for whom abstract propositions are familiar notions. For people who prefer to think about physical objects, the notion of a situation as a chunk of the real world may seem more familiar. The commutative diagram provides a way of reconciling the two views: starting with a situation, the pushout determines the propositions in the context; starting with a context, the pullback defines the situation. The two complementary views are useful for different purposes: for a mapmaker, the context is derived as a description of some part of the world; for an architect, the concrete situation is derived by some builder who follows an abstract description.
Legislating the laws. Although Dunn's semantics explains the accessibility relation in terms of laws, the source of the laws themselves is never explained. In the semantics for intentionality, however, the laws are explicitly chosen by some agent who may be called the lawgiver. The entailment operator s|=p relates an entity s to a proposition p that is entailed by s. A triadic relation legislate(a,p,s) could be used to relate an agent a who legislates a proposition p as a law, rule, or regulation for some entity s. The following formula says that Tom legislates some proposition as a rule for a lottery game:
(∃p:Proposition)(∃s:LotteryGame)(person(Tom) ∧ legislate(Tom,p,s)).
By Dunn's convention, the laws L of any entity s must be a subset of the facts entailed by s. That condition may be stated as an axiom:
(∀a:Agent)(∀p:Proposition)(∀s:Entity)(legislate(a,p,s) ⊃ s|=p).
This formula says that for every agent a, proposition p, and entity s, if a legislates p as a law of s, then s entails p. Together with Dunn's semantics, the triadic legislation relation formalizes the informal suggestion by Hughes and Cresswell: "a world, w_{2}, is accessible to a world, w_{1}, if w_{2} is conceivable by someone living in w_{1}." Some agent's conceptions become the laws that determine what is necessary and possible in the imaginary worlds. The next step of formalization is to classify the kinds of conceptions that determine the various kinds of modality and intentionality.
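A sketch under stated assumptions: the legislate relation can be modeled as a set of triples, with the axiom enforced as a precondition that every legislated law must be a fact entailed by the entity. The facts table and function names below are illustrative, not from the paper:

```python
# Illustrative model of the triadic legislate relation.  The axiom
# legislate(a,p,s) implies s |= p is enforced as a precondition.

facts = {'lottery': {'tickets cost one dollar', 'the drawing is weekly'}}

def entails(s, p):
    """s |= p: the proposition p is among the facts entailed by s."""
    return p in facts.get(s, set())

laws = set()   # triples (agent, proposition, entity)

def legislate(agent, p, s):
    """Record p as a law that agent legislates for s, provided that s entails p."""
    if not entails(s, p):
        raise ValueError("a law of s must be a fact entailed by s")
    laws.add((agent, p, s))
```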
In 1906, Peirce introduced colors into his existential graphs to distinguish various kinds of modality and intentionality. Figure 3, for example, used red to represent possibility in the EG for the sentence You can lead a horse to water, but you can't make him drink. To distinguish the actual, modal, and intentional contexts illustrated in Figure 7, three kinds of colors would be needed. Conveniently, the heraldic tinctures, which were used to paint coats of arms in the Middle Ages, were grouped in three classes: metal, color, and fur. Peirce adopted them for his three kinds of contexts, each of which corresponds to one of his three categories: Firstness (independent conception), Secondness (relative conception), and Thirdness (mediating conception).
Throughout his analyses, Peirce distinguished the logical operators, such as ∧, ~, and ∃, from the tinctures, which, he said, do not represent
...differences of the predicates, or significations of the graphs, but of the predetermined objects to which the graphs are intended to refer. Consequently, the Iconic idea of the System requires that they should be represented, not by differentiations of the Graphs themselves but by appropriate visible characters of the surfaces upon which the Graphs are marked.
In effect, Peirce did not consider the tinctures to be part of logic itself, but of the metalanguage for describing how logic applies to the universe of discourse:
The nature of the universe or universes of discourse (for several may be referred to in a single assertion) in the rather unusual cases in which such precision is required, is denoted either by using modifications of the heraldic tinctures, marked in something like the usual manner in pale ink upon the surface, or by scribing the graphs in colored inks.
Peirce's later writings are fragmentary, incomplete, and mostly unpublished, but they are no more fragmentary and incomplete than most modern publications about contexts. In fact, Peirce was more consistent in distinguishing the syntax (oval enclosures), the semantics ("the universe or universes of discourse"), and the pragmatics (the tinctures that "denote" the "nature" of those universes).
Classifying contexts. The first step toward a theory of context is a classification of the types of contexts and their relationships to one another. Any of the tinctured contexts may be nested inside or outside the ovals representing negation. When combined with negation in all possible ways, each tincture can represent a family of related modalities:
Multimodal reasoning. As the multiple axioms for modal logic indicate, there is no single version that applies to all problems. The complexities increase when different interpretations of modality are mixed, as in Peirce's five versions of possibility, which could be represented by colors or by subscripts, such as ◊_{1}, ◊_{2}, ..., ◊_{5}. Each of those modalities is derived from a different set of laws, which interact in various ways with the other laws:
□_{3}◊_{1}p ⊃ ◊_{1}p.
By introducing contexts, McCarthy hoped to reduce the proliferation of modalities to a single mechanism of metalevel reasoning about the propositions that are true in a context. By supporting a more detailed representation than the operators ◊ and □, the dyadic entailment relation and the triadic legislation relation support metalevel reasoning about the laws, facts, and their implications. Following are some implications of Peirce's five kinds of possibility:
{} = {p:Proposition | (∀a:Agent)(∀x:Entity)legislate(a,p,x)}.
The empty set is the set of all propositions p that every agent a legislates as a law for every entity x.
SubjectiveLaws(a) = {p:Proposition | know(a,p)}.
That principle of subjective possibility can be stated in the following axiom:
(∀a:Agent)(∀p:Proposition)(∀x:Entity)(legislate(a,p,x) ≡ know(a, x|=p)).
For any agent a, proposition p, and entity x, the agent a legislates p as a law for x if and only if a knows that x entails p.
LawsOfNature = {p:Proposition | (∀x:Entity)legislate(God,p,x)}.
If God is assumed to be omniscient, this set is the same as everything God knows or SubjectiveLaws(God). What is subjective for God is objective for everyone else.
CommonKnowledge(a,b) = SubjectiveLaws(a) ∩ SubjectiveLaws(b).
Obligatory(x) = {p:Proposition | (∃a:Agent)(authority(a,x) ∧ legislate(a,p,x))}.
This interpretation, which defines deontic logic, makes it a weak version of modal logic since consistency is weaker than truth. The usual modal axioms □p ⊃ p and p ⊃ ◊p do not hold for deontic logic, since people can violate the laws.
To prove that a syntactic notation for contexts is consistent, it is necessary to define a model-theoretic semantics for it. But to show that the model captures "the primary intended interpretation," it is necessary to show how it represents the entities of interest in the application domain. For consistency, this section defines model structures called nested graph models (NGMs), which can serve as the denotation of logical expressions that contain nested contexts. Nested graph models are general enough to represent a variety of other model structures, including Tarski-style "flat" models, the possible worlds of Kripke and Montague, and other approaches discussed in this paper. The mapping from those model structures to NGMs shows that NGMs are at least as suitable for capturing the intended interpretation. Dunn's semantics allows NGMs to do more: the option of representing metalevel information in any context enables statements in one context to talk about the laws and facts of nested contexts and about the intentions of agents who may have legislated the laws.
To illustrate the formal definitions, Figure 9 shows an informal example of an NGM. Every box or rectangle in Figure 9 represents an individual entity in the domain of discourse, and every circle represents a property (monadic predicate) or a relation (predicate or relation with two or more arguments) that is true of the individual to which it is linked. The arrows on the arcs are synonyms for the integers used to label the arcs: for dyadic relations, an arrow pointing toward the circle represents the integer 1, and an arrow pointing away from the circle represents 2; relations with more than two arcs must supplement the arrows with integers. Some boxes contain nested graphs: they represent individuals that have parts or aspects, which are individual entities represented by the boxes in the nested graphs. The relations in the nested boxes may be linked to boxes in the same graph or to boxes in some area outside the box in which they are contained. No relations are ever linked to boxes that are more deeply nested than they are.
Figure 9: A nested graph model (NGM)
Formally, an NGM can be defined in equivalent ways with the terminology of either hypergraphs or bipartite graphs. For convenience in relating the formalism to diagrams such as Figure 9, a nested graph model G is defined as a bipartite graph with four components, G=(A,B,C,L):
An NGM may consist of any number of levels of nested NGMs, but no NGM is nested within itself, either directly or indirectly. If infinite nesting depth is permitted, an NGM could be isomorphic to another NGM nested in itself. In a computer implementation, such nesting could be simulated with a pointer from an inner node to an outer node; but in theory, the outer NGM and the nested NGM are considered to be distinct. In any computer implementation, there must be exactly one outermost NGM in which all the others are nested. In theory, however, infinite NGMs with no outermost level could be considered.
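The definition can be sketched as a Python data structure. This is an illustration under the assumptions above (the class names are not from the paper); the depth computation terminates precisely because no NGM is nested within itself:

```python
# Sketch of a nested graph model: boxes are individuals, circles are relations,
# and a box may contain a nested NGM.  Class names are illustrative.

class Box:
    def __init__(self, label, nested=None):
        self.label = label
        self.nested = nested        # an NGM nested inside this box, or None

class Circle:
    def __init__(self, label, args):
        self.label = label
        self.args = args            # the boxes linked by arcs, in argument order

class NGM:
    def __init__(self, boxes, circles):
        self.boxes = boxes
        self.circles = circles

    def depth(self):
        """Number of nesting levels; finite because no NGM contains itself."""
        inner = [b.nested.depth() for b in self.boxes if b.nested]
        return 1 + max(inner, default=0)
```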
Mapping other models to NGMs. Nested graph models are set-theoretical constructions that can serve as models for a wide variety of logical theories. They can be specialized in various ways to represent many other model structures:
If the sets D and R happen to be uncountably infinite, there would not be enough character strings to serve as labels in L. Therefore, the elements of D and R themselves may be used as labels for the boxes and circles.
For finite models, these steps can be translated to a computer program that constructs G from M. For infinite models, they should be considered a specification rather than a construction.
(∀x)□P(x) ⊃ □(∀x)P(x).
This axiom says that if for every x, some predicate P(x) is necessarily true, then it is necessary that for every x, P(x). It implies that all worlds accessible from a given world must have exactly the same individuals.
To allow quantification over the individuals in the possible worlds, a model G=(A,B,C,L) can be specified by starting with the first three steps for constructing an NGM H for a Kripke-style model and continuing with the following steps:
Since the Barcan formula requires the possible worlds to be partitioned in equivalence classes that have the same individuals, any NGM that satisfies it would require the nested NGMs to be partitioned in equivalence classes with the same labels on their boxes. Nested graph models, however, can support more general models than those that satisfy the Barcan formula. They could, for example, support the notion of counterparts: some privileged NGM V_{0} would represent the real world, and its boxes would be linked to boxes in the outer NGM G by circles with the label "Identity"; the boxes in other nested NGMs would be linked to the outer boxes by circles with the label "Counterpart". Two individuals in different possible worlds would be considered identical if their corresponding boxes were linked to the same box in G by circles labeled "Identity"; they would be considered counterparts if one circle was labeled "Counterpart" and the other was labeled either "Identity" or "Counterpart".
As Quine observed, consistency in terms of a model does not ensure that the model captures "the primary intended interpretation." There are six different reasons, organized in three pairs:
Representing Situations and Contexts. The conceptual graph in Figure 10 shows how conceptual graphs can make implicit semantic relationships explicit. At the top is a concept of type Situation, linked by two image relations (Imag) to two different images of that situation: a picture and the associated sound. The description relation (Dscr) links the situation to a proposition that describes some aspect of it. That proposition is linked by three statement relations (Stmt) to statements of the proposition in three different languages: an English sentence, a conceptual graph, and a formula in the Knowledge Interchange Format (KIF).
Figure 10: A CG representing a situation of a plumber carrying a pipe
The Imag relation links an entity to an icon that shows what it looks like or sounds like. The Dscr relation or the corresponding predicate dscr(x,p) links an entity x to a proposition p that describes some aspect of x. In the metatheory about logic, the symbol |=, called the double turnstile, is used to say that some proposition p is entailed by some entity x. Semantic entailment x|=p means that the proposition p makes a true assertion about some entity x; an alternate terminology is to say that the entity x satisfies p. Semantic entailment is equivalent to the description predicate dscr(x,p):
(∀x:Entity)(∀p:Proposition)(dscr(x,p) ≡ x|=p).
Literally, for every entity x and proposition p, x has a description p if and only if x semantically entails p. Informally, the terms semantic entailment, description, and satisfaction have been used by different philosophers with different intuitions, but formally, they are synonymous.
As Figure 10 illustrates, the proposition expressed in any of the three languages represents a tiny fraction of the total information available. Both the sound image and the picture image capture information that is not in the sentence, but even they are only partial representations. A picture may be worth a thousand words, but a situation can be worth a thousand pictures. Yet the less detailed sentences have the advantage of being easier to think about, talk about, and compute.
A Tarski-style model or any of its generalizations by Hintikka, Kripke, Montague and others is a formal basis for determining the truth of a statement in terms of a model of the world.
In his presentation of model-theoretic semantics, Tarski (1935) insisted that his definition of truth applied only to "formalized languages" and that any attempt to apply it to natural language is fraught with "insuperable difficulties." He concluded that "the very possibility of a consistent use of the expression 'true sentence' which is in harmony with the laws of logic and the spirit of everyday language seems to be very questionable, and consequently the same doubt attaches to the possibility of constructing a correct definition of this expression." As many logicians have observed, the most that model theory can do is to demonstrate the consistency of one set of axioms relative to another set that is better known or more widely accepted. For pure mathematics, for which applications are irrelevant, consistency is sufficient for truth: it guarantees that the axioms of a theory are satisfiable in some Platonic realm of ideas.
The requirement for introducing agents and their intentions demotes model theory from its role as the source of all meaning, but a model is still useful as a consistency check. For the theory of contexts presented in this paper, a single example is sufficient to prove consistency:
For applied mathematics, the truth of a theory requires a correspondence with structures that are more tangible than Platonic ideas. For applications, model theory must be supplemented with methods of observation and measurement for determining how well the abstract symbols of the theory match their real-world referents and the predicted relationships between them. Yet as philosophers from Hume to Quine have insisted, such a correspondence falls short of an explanation: a mere correspondence with observations could be accidental. A famous example is Bode's "law" for predicting the distance of planets from the sun; it matched the observed orbits for the planets up to Uranus, but it failed when Neptune and Pluto were discovered.
In Peirce's terms, correspondence is an example of Secondness: a dyadic relationship between symbols and their referents. That relationship is a prerequisite for truth, but not an explanation. Explanation requires Thirdness: a triadic predicate that relates a law-like regularity in the universe to the symbols and their referents. For physical laws, the lawgiver who is responsible for that regularity may be personified as God or Nature. For legal, social, contractual, and habitual regularities, the lawgiver is some mortal agent: human, animal, or robot. Any theory of meaning that goes beyond a simple catalog of observations must be stated in terms of agents and their deliberate or habitual legislations.
Stratified levels. To simplify metalevel reasoning, Tarski advocated a method of separating or stratifying the metalevels and the object level. If the object language L_{0} refers to entities in a universe of discourse D, the metalanguage L_{1} refers to the symbols of L_{0} and their relationships to D. The metalanguage is still first order, but its universe of discourse is enlarged from D to L_{0}∪D. The metametalanguage L_{2} is also first order, but its universe of discourse is L_{1}∪L_{0}∪D. To avoid paradoxes, Tarski insisted that no metalanguage L_{n} could refer to its own symbols, but it could refer to the symbols or the domain of any language L_{i} where 0 ≤ i < n.
In short, metalevel reasoning is first-order reasoning about the way statements may be sorted into contexts. After the sorting has been done, the propositions in a context can be handled by the usual FOL rules. At every level of the Tarski hierarchy of metalanguages, the reasoning process is governed by first-order rules. But first-order reasoning in language L_{n} has the effect of higher-order or modal reasoning for every language below n. At every level n, the model theory that justifies the reasoning in L_{n} is a conventional first-order Tarskian theory, since the nature of the objects in the domain D_{n} is irrelevant to the rules that apply to L_{n}.
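The stratification can be sketched in a few lines of Python: the universe of each metalanguage L_n is the union of the base domain with the symbols of all lower languages, and reference is permitted only downward. The encoding is an illustrative assumption, not from the paper:

```python
# Sketch of Tarski's stratified hierarchy.  languages[i] holds the symbols of L_i.

def universe(n, domain, languages):
    """Universe of discourse of L_n: the base domain plus the symbols of L_0..L_{n-1}."""
    u = set(domain)
    for i in range(n):
        u |= languages[i]
    return u

def may_refer(n, i):
    """L_n may refer to the symbols of L_i only when i < n (no self-reference)."""
    return 0 <= i < n
```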
Example. To illustrate the interplay of the metalevel transformations and the object-level inferences, consider the following statement, which includes direct quotation, indirect quotation, indexical pronouns, and metalanguage about belief:
Joe said [#I don't believe [in astrology] but #they say [[#it works] even if #you don't believe [in #it]]].
Joe said [Joe doesn't believe [astrology works] but every person x believes [[astrology works] even if x doesn't believe [astrology works] ]].
Joe believes [Joe doesn't believe [astrology works] and every person x believes [astrology works] ].
Joe believes [Joe doesn't believe [astrology works] and Joe believes [astrology works] ].
Joe believes [p ∧ ~p].
This transformation exposes the contradiction in the context of Joe's beliefs.
In the process of reasoning about Joe's beliefs, the context [astrology works] is treated as an encapsulated object, whose internal structure is ignored. When the levels interact, however, further axioms are necessary to relate them. Like the iterated modalities ◊◊p and ◊□p, iterated beliefs occur in statements like Joe believes that Joe doesn't believe that astrology works. One reasonable axiom is that if an agent a believes that a believes p, then a believes p:
(∀a:Agent)(∀p:Proposition)(believe(a,believe(a,p)) ⊃ believe(a,p)).
This axiom enables two levels of nested contexts to be collapsed into one. The converse, however, is less likely: many people act as if they believe propositions that they are not willing to admit. Joe, for example, might read the astrology column in the daily newspaper and follow its advice. His actions could be considered evidence that he believes in astrology. Yet when asked, Joe might continue to insist that he doesn't believe in astrology.
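As an illustrative sketch, the collapsing axiom can be applied as a one-directional rewrite rule over propositions encoded as nested tuples. The encoding is an assumption made for this example:

```python
# Rewrite believe(a, believe(a, p)) to believe(a, p), applied recursively.
# Propositions are encoded as nested tuples; the encoding is illustrative.

def collapse(prop):
    if isinstance(prop, tuple) and prop and prop[0] == 'believe':
        _, agent, inner = prop
        inner = collapse(inner)
        if isinstance(inner, tuple) and inner and inner[0] == 'believe' and inner[1] == agent:
            return inner            # same agent: the outer belief is redundant
        return ('believe', agent, inner)
    return prop
```

The converse direction is deliberately not implemented, matching the observation that believe(a,p) does not entail believe(a,believe(a,p)).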
All references have been moved to the combined bibliography.