Logics of Probabilistic Reasoning and Imperfect Agents

Masood Mortazavi
Graduate Group in Logic and Methodology of Science
University of California at Berkeley

Contents

1 Introduction
2 Computational and Contemplative Agents
3 Logical Foundations
4 Probability Logics
  4.1 Lukasiewicz's Logic
    4.1.1 Indefinite Propositions and Truth Values
    4.1.2 Lukasiewicz's Analysis of Implication, and the Deductive Principles of His Logic
    4.1.3 Relative Truth Values, Independence and Bayesian Logic
    4.1.4 Lukasiewicz's Philosophical Investigations on the Relationship of Probability and Logic

* This paper was originally intended to be a review of the current state of affairs in first-order logics for probabilistic reasoning, giving an outline of major works on logics for probabilistic reasoning. However, the concept of the "imperfect agent" was discovered while searching for computational models applicable to intelligent systems based on FOPL (first-order logics of probability). The paper has now taken on a broader, more philosophical tone. It is still inspired by our study of past and present work on FOPL and is particularly indebted to the essay "Logical Foundations of Probability" by J. Lukasiewicz [Luk13].


  4.2 Nilsson's Logic
    4.2.1 Carnap's Views on Reasoning with Probabilities
    4.2.2 Possible Worlds and Probabilities
    4.2.3 Entailment and Reasoning with Uncertain Beliefs
    4.2.4 Nilsson's Cube and the Convex Hull of Possible Worlds
    4.2.5 A Truth Value Range for Entailed Sentences
    4.2.6 Summary of Nilsson's Analysis of Entailment
    4.2.7 Entailment Interpreted
    4.2.8 Interdependence of Degrees of Beliefs in Possible Worlds and (Partial) Truth Values
    4.2.9 Conditional Probabilities
  4.3 Halpern's Logic
    4.3.1 The Logical Language
    4.3.2 Type-1 Probability Structure
    4.3.3 Type-2 Probability Structure
    4.3.4 A Note on Type-2 Probability Terms, Lukasiewicz's Conception of Probability Terms and Fuzzy Terms
    4.3.5 Notes on Justifying the Type-3 Probability Structure and the Problem of Instantiation
    4.3.6 Type-3 Probability Structure
    4.3.7 Some Decidability and Undecidability Results
    4.3.8 Halpern's Axiom System for ("Domain") Probability Logic
5 Other Problems in Probability Logics
6 Probability Logics and Fuzzy Logic
7 The Imperfect Agent

If one supposes, as philosophers traditionally have, that one's beliefs are a set of propositions, and reasoning is inference or deduction from members of the set, one is in for trouble, for it is quite clear (though still controversial) that systems relying only on such processes get swamped by combinatorial explosions in the updating effort. It seems that our entire conception of belief and reasoning must be radically revised if we are to explain the undeniable capacity of human beings to keep their beliefs roughly consonant with the reality they live in.

Daniel C. Dennett, "Artificial Intelligence as Philosophy and as Psychology," Brainstorms: Philosophical Essays on Mind and Psychology (1978)

1 Introduction

Why would one be interested in logics of probability reasoning? Much of our knowledge is based on common-sense truths. We do not jump off towers or cliffs, slap the person sitting next to us at a conference, or pour boiling lead on our skin, because we have a very clear idea about the consequences of these sorts of actions. However, a large part of our knowledge base seems to improve on the basis of repeated experience of a particular form. The more of this form of experience is interpreted and integrated into our knowledge base, the more accurate our predictions about the subject become.

The subjects, in these sorts of circumstances, seem to behave somewhat unpredictably. The famous example of the stocks can be supplemented with the examples of how certain illnesses manifest in people, how cancer recedes, how the weather changes, how sunspots appear on the Sun, how loan defaults occur, and how first-year ventures make it to a second year. The subject in each case is affected by a multitude of factors that come together to shape its development. To predict this development, experienced observers form, within their knowledge bases, repertoires of scenarios and assign likelihoods to each scenario. Scenarios become more or less likely as the observers' experience grows to incorporate a larger and larger body of relevant examples.

Systems that behave according to a set of likely scenarios may be intentionally describable, which means that the system's behavior can be modeled based on a set of beliefs, desires and goals. Commentators are found talking about how

the economy "wants" to make corrections and how the market is "looking for" an equilibrium price. This approach to knowledge engineering can be named the intentional strategy. (See [Denn81] for more on intentions, desires and beliefs.) The physical strategy and the design strategy are other possible approaches to knowledge engineering. All of these basic approaches could be supplemented with "inference engines" using various logics, including ones for probability reasoning.

A logic of probability reasoning gives a description of how observers may go about forming beliefs in uncertain circumstances and with uncertain beliefs. Circumstances are uncertain when they continuously challenge, extend and defeat our earlier beliefs. Beliefs are uncertain when they possess partial truth values or a number of likely scenarios.

This paper begins by introducing some logical concepts, reviews the major trends within probability logics, points to the focal research problem in this field (the problem of instantiation of general logical statements) and expounds the significance of this research problem. The paper concludes with a discussion of the idea of the "imperfect agent". What is meant by the "imperfect agent" applies in a broad sense to artificial, computational agents as well as organic beings. The artificially intelligent agent, when defined within any practical bounds, will obviously be an imperfect computational agent. Such an agent can be proved to be imperfect (for many practical purposes) due to the computational constraints imposed on the evolution and the growth of its learning and understanding. In this sense, all artificially intelligent agents are imperfect. The goal of the concluding sections of this paper is not to reiterate these rather obvious facts.
The obvious facts, however, may require attention to the extent that we are willing to reorient our research activity from seeking domain-specific perfection to devising better agents, where "better" means more realistic and perhaps less perfect in any one particular domain. We shall first review possible reasoning models for such imperfect agents, which will show how such agents would work based on their beliefs about the world. The direct economic value of such "imperfect" agents may be lost to the entrepreneur, since it is best located in a future when one begins to expect[1]

[1] That these constraints are not just a matter of physical constraints on tools such as memory or computational speed is a matter deserving a separate discussion.


not only computational accuracy (at the level of a good calculator) but bold judgment in the absence of perfect information and experience. That the success of an agent should be measured by the value of its judgment in uncertain circumstances has already been attested by the agent market in the finance [Coa89, Cas96], investment [Alu96], stocks, options and insurance industries, as well as in the physical and biological sciences [Raw94, Hun91, Raw89].

2 Computational and Contemplative Agents

Intelligence seems to have something to do with an ability to learn and understand.[2] An intelligent agent is an agent which shows powers of reasoning or understanding. The act of thinking includes, among other activities, the activities of reasoning and understanding. While reasoning, we seem to use, among other means, various types of deduction, e.g. analogical, logical and probabilistic. The act of understanding, among other features, seems to include a certain sort of upright arrival at (and a certain sort of illumination with respect to) an aspect of an existing truth.

In his essay "Was heißt Denken?", Martin Heidegger [Heid54] expounds on two major facets of our intellectual activity, one of a primarily computational nature, the other of a primarily contemplative nature. If we take Heidegger's observations about our thought activity as a starting point for defining what makes a thinking agent perfect, we will conclude that computational agents are imperfect regardless of the existence or the nature of any computational constraints or powers. Accepting a philosophical stance similar to Heidegger's, and beginning with ourselves as agents capable of contemplative as well as computational thought activity, we may then characterize all computational agents as imperfect,[3] even if we assume the existence of a perfect computational environment of infinite speeds and memories. However, it may be more prudent to use Heidegger's bisection of our


[2] It is no accident that one of the most important and productive fields in the study of artificial intelligence has been named "machine learning".

[3] Note that this characterization does in no way imply that agents capable of computational as well as contemplative thought activity are perfect. It simply implies that computational agents seem to lack the ability to perform a certain type of thought activity. Although there are agents, say ourselves, who are capable of a certain level of computational as well as contemplative thought activity, these agents can suffer imperfections in both types of activity.


thought activity in order to characterize some agents as capable of purely computational thought activity while characterizing others as impure in this respect. Insofar as our own thought activity includes a level of contemplation, we are impure computational agents, with the level of impurity determined by the extent to which we engage in contemplation. The goal of this paper is to focus on the former, purely computational type of agent.

3 Logical Foundations

One mode of characterizing and constructing computational agents is to devise suitable computational logics in order to capture the main features of various higher-level models of computational (as opposed to contemplative) reasoning. Any logic is composed of three main components. First, we have the language. Its alphabet is used to form words or expressions. Expressions are either predicates themselves or are put together from logical connectives and other expressions. The second component of any logic is its rules of inference. From classical logic, we are familiar with rules such as modus ponens and generalization. In a logic of probability, for example, we will be more interested in inference rules that mirror the axioms of probability. Last but not least, the third component of any logic is the set of consistent interpretations attached to its language. (For a generally accessible discussion of the fundamental concepts in logic see [Tar94].)

In their writings on the foundations of logic, Ebbinghaus, Flum and Thomas [Ebbi90] and Goldstern and Judah [Gold95] have presented thorough discussions of these three main components of logic. Manin [Mani77] has written a highly readable and incisive book analyzing not only the main components of logics but also discussing more advanced topics in logic. The Handbook of Mathematical Logic is still the quintessential reading in logic [BaKe77]. This paper only assumes that the reader is familiar with the fundamental notions discussed at the beginning of this section.


4 Probability Logics

In this paper, we propose to pay particular attention to a group of computational logics which have come to be known as probabilistic or probability logics, because the last 20 years of AI research have made clear the necessity of coming to grips with models for reasoning in an uncertain environment. The scholarly literature in this area has experienced substantial growth [RRRRRRR], major AI meetings have been devoted to the study of reasoning in the presence of uncertainty [RRRRR], and pioneers such as Lotfi Zadeh have now gained a foothold in the battle to convince the hard-core AI community to pay greater attention to uncertainty and what has come to be known as fuzziness [RRRRRRRR].

4.1 Lukasiewicz's Logic

Lukasiewicz's original work on probability logic [Luk13], published in 1913, provides the first comprehensive, and rather successful, attempt to marry logic and probability.[4] In fact, his interest in exploring the logical foundations of probability theory proved to be rather productive, as we shall examine in this section. His ideas not only prove original but anticipate much of what followed years later [Halp90].

4.1.1 Indefinite Propositions and Truth Values

Lukasiewicz's approach to the logical foundations of probability subsumes probability under logic by a broad definition of certain elementary logical concepts such as truth value and assignment.

[4] It is an odd fact in the history of probability logics that the seminal work in this area has received little attention from researchers in the field. It is surprising that those who have followed in Lukasiewicz's footsteps in at least some of their papers [Halp90] have made little or no reference to Lukasiewicz as the source of their inspiration. One glaring example is Halpern's analysis of first-order logics of probability, where he has developed a very natural extension of Lukasiewicz's notation, in fact making very direct use of Lukasiewicz's very notation and keeping close to Lukasiewicz's original ideas, without making any reference to "Logical Foundations of Probability," Jan Lukasiewicz's original paper. Yet others, in particular N. J. Nilsson, who have put forth views diametrically opposed to those of Lukasiewicz, have made honorable references to him [Nil86].


According to Lukasiewicz's view, one can only assign probabilities to indefinite propositions; when one assigns a probability to an apparently definite proposition, one is really assigning a truth value to some indefinite proposition. For example, when we assign the probability of 1/2 to the proposition "The toss of the coin results in heads," we are in fact assigning a truth value to an indefinite proposition, say "The x-th toss of the coin results in heads," in a world where we have a fixed number of coin tosses with a fixed result attached to each.[5]

Let us first fix before us what Lukasiewicz means by indefinite propositions. "I call indefinite those propositions which contain a variable," says Lukasiewicz.[6] According to this definition of indefinite propositions, all of the following propositions are indefinite:

(1) "x is greater than four"
(2) "x is a flying bird"
(3) "x is greater than six"

Let us first see how we can assign truth values to the following definite propositions:

(4) "For all x, x is greater than four"
(5) "For all x, x is a flying bird"
(6) "For all x, x is greater than six"

[5] In a sense, in Lukasiewicz's view, the resources of the domain remain constant. In the vocabulary of logic, the model's elementary diagram gives a substantial characterization of the logic. The model is fixed. This fixity also explains the use of the present tense in probabilistic propositions. It agrees with the general notion of a predetermined world, with the whole world existing in the present at all times. In this sense, a probability statement is not really a predictive statement about the likelihood of a future, as yet non-existent event, but rather a truth value statement about the existent, predetermined world. The act of assigning a truth value to an indefinite proposition may, in a given finite model (world), be an easy consequence of a finite counting. In any infinite model (world), the assigning of truth values to indefinite propositions may in practice become an act of conjecture in the absence of certainty and omniscience.

[6] In the examples that follow, I believe I have closely followed Lukasiewicz's presentation. The reader may suspect a high level of corruption of Lukasiewicz's language by my more modern logical vocabulary, but in fact all the ideas explored in the following examples are there in Lukasiewicz's paper [Luk13].


To the novice, who will interpret propositions (4) to (6) in the world we live in, all those statements are false, simply due to the fact that not all beings are birds or numbers. To the student of logic, however, it is immediately clear that the truth of such statements depends on the context of discussion,[7] or, to be more precise, on the values permissible for the variable x in each one of the propositions. For example, if permissible values for x are selected from among 5, 6, 7, 8, and 9, proposition (4) will be true, and propositions (5) and (6) false. If instead x is chosen from among wild ducks, pigeons and blue birds, propositions (4) and (6) will be false, and proposition (5) will be true.

What matters in distinguishing propositions (4) to (6) from propositions (1) to (3) is that no matter which set of permissible values is selected (i.e. no matter which interpretation is chosen), propositions (4), (5) and (6) will either have a truth value of 1 (i.e. they will be true, or have a probability value of 1) or a truth value of 0 (i.e. they will be false, or have a probability value of 0).

Assigning truth values to propositions (1) to (3), however, requires a different approach. Either, as is done in classical model theory [ChKe90], we use an assignment function which takes each variable of our language to a unique element of the underlying set, in order to preserve binary truth assignments, or we allow non-binary truth (or probability) assignments and arrive at a partial truth (probability) value for each one of the propositions (1) to (3). Partial truth values are truth values which range between 0 and 1, including the limits. For indefinite propositions with a single free variable, we shall assign truth according to the proportion of domain (i.e. underlying set) members which satisfy the indefinite proposition.[8] This was the path taken by Lukasiewicz in his logical foundations of probability theory. To fix our ideas, consider the following domains:

(d1) 1, 2, 3, 4, 5

[7] To speak in modern terms, the truth value of such statements can only be assessed when one fixes a particular interpretation of the particular language of which they are a proposition.

[8] Such a procedure for computing truth values will sooner or later demand a discussion of proportionality in non-finite domains. Such a discussion often leads to the measure-theoretic approaches to probability. So, to be more precise, we would have to state the definition of truth values given above using measure-theoretic concepts. However, at this point our primary intention is to follow Lukasiewicz's approach to the problem.


(d2) BlueBird_1, Penguin_1, Penguin_2
(d3) 1, 2, Penguin_2

Propositions (4), (5) and (6) are false in all these domains, whereas the indefinite propositions (1), (2) and (3) will have partial, i.e. probabilistic, truth values. The indefinite proposition (1) will have a truth value of 1/5 in the domain (d1) and 0 in domains (d2) and (d3). The indefinite proposition (2) will have the truth values 0, 1/3 and 0 in domains (d1), (d2) and (d3), respectively.
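This proportion-counting procedure is easy to carry out mechanically. The following sketch reproduces the values above for finite domains like (d1) to (d3); the function names, the string encodings of the birds, and the "flying" test are all illustrative assumptions of this example:

```python
# A sketch of Lukasiewicz-style partial truth values: the truth value of
# an indefinite proposition is the proportion of domain members that
# satisfy it.
from fractions import Fraction

def truth_value(domain, proposition):
    """Proportion of domain elements satisfying the indefinite proposition."""
    satisfied = sum(1 for x in domain if proposition(x))
    return Fraction(satisfied, len(domain))

# Illustrative encodings: blue birds fly, penguins do not.
greater_than_four = lambda x: isinstance(x, int) and x > 4
flying_bird       = lambda x: isinstance(x, str) and x.startswith("BlueBird")

d1 = [1, 2, 3, 4, 5]
d2 = ["BlueBird_1", "Penguin_1", "Penguin_2"]
d3 = [1, 2, "Penguin_2"]

print(truth_value(d1, greater_than_four))  # 1/5
print(truth_value(d2, flying_bird))        # 1/3
print(truth_value(d3, flying_bird))        # 0
```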

4.1.2 Lukasiewicz's Analysis of Implication, and the Deductive Principles of His Logic

Based on his analysis of probabilities into partial truth values, Lukasiewicz's considerations of the semantics of logical implications led him to what he called the "Principles of the Calculus" (i.e. the inference axioms) of his logic for probabilistic reasoning. A list of Lukasiewicz's principles follows.

(φ = 0) = [w(φ) = 0]    (7)
(φ = 1) = [w(φ) = 1]    (8)
(φ ⊃ ψ) ⊃ [w(φ) + w(¬φ ∧ ψ) = w(ψ)]    (9)

Note that the statement of these principles implies that Lukasiewicz has a kind of two-sorted logic in mind, although this is a point he never explicitly makes clear in the entirety of his paper. In fact, the confusion arises because Lukasiewicz has assigned multiple meanings to his symbols 0, 1, and =. That these symbols mean different things in the realm of propositions and in that of truth values should be clear to any astute reader. In the realm of propositions, 0 is the proposition which when "multiplied" (i.e. when "∧"-ed) with any other proposition results in a proposition equivalent to itself. In other words, 0·φ = 0, or in a more modern notation, 0 ∧ φ ≡ 0. In the realm of truth values, 0 is equal to the truth value of all propositions equivalent to the 0

proposition. It is this relationship between 0 in the realm of propositions and 0 in the realm of truth values which principle (7) captures, with the caveat that the first two occurrences of the = sign should be replaced with a different symbol, say ≡:

(φ ≡ 0p) ≡ [w(φ) = 0t]    (10)

where we have added subscripts to emphasize the difference between 0 in the realm of propositions, 0p, and 0 in the realm of truth values, 0t. Similarly, the second principle (8) captures the relationship between 1 in the realm of propositions[9] and 1 in the realm of truth values.[10] Rewriting the second principle in a more precise notation results in

(φ ≡ 1p) ≡ [w(φ) = 1t]    (11)

Will the interpretation of the third principle lead to any further modifications of Lukasiewicz's notation? The intention of the third principle is to capture a simple idea which can be expressed more plainly in the language of set theory if we rewrite the proposition as (¬φ ∨ ψ) ⊃ [w(φ) + w(¬φ ∧ ψ) = w(ψ)]. Now, Lukasiewicz makes his crucial contribution. He notes that as we vary the assignment of the variables in (¬φ ∨ ψ) (i.e. the variables in (φ ⊃ ψ)), we can get a definite, true proposition subject to three necessary and sufficient conditions. The first condition obtains when the proposition φ assumes a truth value of 0t, because it states falsities under all assignments to its free variable. The second condition holds when the proposition ψ assumes a truth value of 1t, because it states truths under all assignments to its free variable. Under either of these conditions, the implication (φ ⊃ ψ) will hold. Assuming that the implication holds but that the two conditions just mentioned fail, we arrive at the third possibility. In the third possibility, there are free variable assignments in φ which lead to its truth, and there are free variable assignments in ψ which lead to its falsity. If the free variable occurring in φ were different from that occurring in ψ, one could always select an assignment which would render φ true and ψ false. So, we conclude that for the implication (φ ⊃ ψ) to be true, not only must the free variables in


[9] In the realm of propositions, 1 is the attractor for the ∨ connective.

[10] In the realm of truth values, 1 is equal to the truth value of all propositions which can act as the attractor of the ∨ connective.


φ and ψ be identical, say to x, but also any assignment to this free variable which makes φ a true proposition must also make ψ a true proposition.[11]
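The conclusion just reached, that the domain of validity of the antecedent must sit inside that of the consequent, forces the numerical identity of principle (9), and this can be spot-checked on a small finite domain. The domain, the propositions, and the function names below are illustrative assumptions of this sketch, not part of Lukasiewicz's text:

```python
# A check that when every assignment verifying phi also verifies psi
# (so the implication phi -> psi holds), w(phi) + w(not-phi & psi) = w(psi).
from fractions import Fraction

DOMAIN = range(1, 11)

def w(prop):
    """Truth value as the satisfying proportion of the domain."""
    return Fraction(sum(1 for x in DOMAIN if prop(x)), len(DOMAIN))

phi = lambda x: x > 6   # "x is greater than six"
psi = lambda x: x > 4   # "x is greater than four"

# phi's domain of validity is a subset of psi's, so (phi -> psi) holds.
assert all(psi(x) for x in DOMAIN if phi(x))

lhs = w(phi) + w(lambda x: (not phi(x)) and psi(x))
assert lhs == w(psi)
print(lhs)  # 3/5
```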

To summarize Lukasiewicz's analysis somewhat more formally, we may state:

(Def1) (φ ⊃ ψ) has the truth value of 1p iff

  "φ is falsified by all assignments to its free variable," or
  "ψ is verified by all assignments to its free variable," or
  "φ and ψ have the same variable x, and all assignments to x which verify the reason φ also verify the consequence ψ."

This summary should help make it clear that if (φ ⊃ ψ) has the truth value of 1p, we may then conclude, from the disjunction in (Def1), that the domain of validity of φ will be a subset of the domain of validity of ψ. Given this simple fact, Lukasiewicz's principle (9) follows. Using his three principles (7), (8) and (9), Lukasiewicz can deduce the following theorems:[12]

w(0p) = 0t    (12)

[11] This astounding analysis in 1913 foresees much that was to follow years later in the work of A. Tarski in creating and clarifying the theory of models on the sound basis of structural interpretation of languages.

[12] The numbering of these theorems matches that given by Lukasiewicz, for historical purposes.

[13] Clearly, the implication in theorem (14) is not reversible. For example, given the domain containing 1, 2, 3, 4, 5, 6, the indefinite propositions x = 1 and x = 2 both have the same truth value while they are not equivalent.

[14] Theorems (19) and (21) of Lukasiewicz are each one of the implications stated in theorem (22).

[15] The final theorem completes the first series of theorems derived from axioms (7), (8) and (9). It implies the reversibility of axiom (9).


w(1p) = 1t    (13)
(φ ≡ ψ) ⊃ [w(φ) = w(ψ)]    (14)[13]
w(φ) + w(¬φ) = 1t    (15)
w(φ ∧ ψ) + w(¬φ ∧ ψ) = w(ψ)    (16)
w(φ) + w(¬φ ∧ ψ) = w(φ ∨ ψ)    (17)
w(φ ∨ ψ) = w(φ) + w(ψ) − w(φ ∧ ψ)    (18)
(φ ∧ ψ ≡ 0p) ⊃ [w(φ ∨ ψ) = w(φ) + w(ψ)]    (19)
(∨_{i≠j} (φ_i ∧ φ_j) ≡ 0p) ⊃ [w(∨_i φ_i) = Σ_i w(φ_i)]    (20)
(φ ∧ ψ ≡ 0p) ⊂ [w(φ ∨ ψ) = w(φ) + w(ψ)]    (21)
(φ ∧ ψ ≡ 0p) ≡ [w(φ ∨ ψ) = w(φ) + w(ψ)]    (22)[14]
(φ ⊃ ψ) ⊂ [w(φ) + w(¬φ ∧ ψ) = w(ψ)]    (23)
(φ ⊃ ψ) ≡ [w(φ) + w(¬φ ∧ ψ) = w(ψ)]    (24)[15]

The significance of theorems (21) and (23) is that they imply that, based purely on numerical relations among the truth values of certain propositions, it is possible to determine their logical relationship. This observation, made by Lukasiewicz in his paper, should not surprise the modern logician, who will note the relationship between Lukasiewicz's definition of truth value and model-theoretic considerations, such as the notion of types.
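Several of these theorems can be verified mechanically by treating indefinite propositions as predicates over a finite domain and w as the satisfying proportion. The domain and propositions in the following sketch are illustrative choices, not taken from the paper:

```python
# Numerical check of theorems (15)-(18) over a finite domain.
from fractions import Fraction

DOMAIN = range(1, 11)

def w(prop):
    """Truth value as the satisfying proportion of the domain."""
    return Fraction(sum(1 for x in DOMAIN if prop(x)), len(DOMAIN))

phi = lambda x: x % 2 == 0   # "x is even"
psi = lambda x: x > 6        # "x is greater than six"

neg  = lambda p: (lambda x: not p(x))
conj = lambda p, q: (lambda x: p(x) and q(x))
disj = lambda p, q: (lambda x: p(x) or q(x))

# Theorem (15): w(phi) + w(not phi) = 1
assert w(phi) + w(neg(phi)) == 1
# Theorem (16): w(phi & psi) + w(not phi & psi) = w(psi)
assert w(conj(phi, psi)) + w(conj(neg(phi), psi)) == w(psi)
# Theorem (17): w(phi) + w(not phi & psi) = w(phi or psi)
assert w(phi) + w(conj(neg(phi), psi)) == w(disj(phi, psi))
# Theorem (18): w(phi or psi) = w(phi) + w(psi) - w(phi & psi)
assert w(disj(phi, psi)) == w(phi) + w(psi) - w(conj(phi, psi))
print("theorems (15)-(18) hold on this domain")
```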

4.1.3 Relative Truth Values, Independence and Bayesian Logic

A fundamental concept in probabilistic reasoning is the notion of conditional probabilities. Most axiomatic presentations of probability devote considerable attention to this notion, not only because of the interesting theoretical questions it poses but also because of its independent practical value. In fact, useful techniques such as Bayesian probability modeling in various fields depend first and foremost on a precise definition of the idea of conditional or relative probabilities. Lukasiewicz defines the relative truth value as follows:

wφ(ψ) = w(φ ∧ ψ) / w(φ)    (25)

The string wφ(ψ) is an abbreviated notation for w(φ ∧ ψ)/w(φ), which is a measure of the relative truth value of ψ when we restrict our assignments to only those which make φ true. To quote Lukasiewicz, "it indicates how great is the truth value of ψ, assuming that φ is true." The definition of relative truth values immediately leads to familiar theorems:

w1p(ψ) = w(ψ)    (26)
w(φ) wφ(ψ) = w(φ ∧ ψ)    (27)
w(φ) wφ(ψ) = w(ψ) wψ(φ)    (28)
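Definition (25) is easy to compute on a finite domain, and the symmetry w(φ)·wφ(ψ) = w(ψ)·wψ(φ) can be confirmed by noting that both sides count w(φ ∧ ψ). The domain, propositions, and names in the sketch below are illustrative assumptions:

```python
# A sketch of relative (conditional) truth values per definition (25).
from fractions import Fraction

DOMAIN = range(1, 11)

def w(prop):
    return Fraction(sum(1 for x in DOMAIN if prop(x)), len(DOMAIN))

def w_rel(phi, psi):
    """Truth value of psi restricted to the assignments making phi true."""
    return w(lambda x: phi(x) and psi(x)) / w(phi)

phi = lambda x: x > 4        # "x is greater than four"
psi = lambda x: x % 2 == 0   # "x is even"

print(w_rel(phi, psi))  # 1/2: of {5..10}, the even members are {6, 8, 10}

# Symmetry: both sides equal w(phi & psi).
assert w(phi) * w_rel(phi, psi) == w(psi) * w_rel(psi, phi)
```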

The concept of independence can be defined using the notion of relative probabilities. If the truth value of an indefinite proposition ψ relative to another proposition, say φ, is the same as the truth value of ψ relative to ¬φ, it follows that assignments which make φ true have no greater cause to make ψ true than do assignments which make ¬φ true. In such a case, we say that the truth value of ψ is independent of the truth value of φ. From the definition of the independence relationship I,

φIψ ≡ [wφ(ψ) = w¬φ(ψ)]    (29)

and the earlier theorems, it follows that:[16]

φIψ ≡ [w(φ ∧ ψ)/w(φ) = w(¬φ ∧ ψ)/w(¬φ)]    (30)
φIψ ≡ [w(φ ∧ ψ)/w(φ) = w(¬φ ∧ ψ)/w(¬φ) = w(ψ)]    (31)
φIψ ≡ [wφ(ψ) = w¬φ(ψ) = w(ψ)]    (32)
φIψ ≡ [w(φ ∧ ψ)/w(ψ) = w(φ ∧ ¬ψ)/w(¬ψ)]    (33)
φIψ ≡ ψIφ    (34)

Theorem (34) states that independence is symmetric. The proofs of these theorems are rather simple and can be found in Lukasiewicz's original presentation. Two more theorems are given in that presentation. One provides the law of multiplication for probabilities, the other the law of

[16] Theorem (30) is based on definitions (25) and (29).


probability exhaustion, which is the form most used in Bayesian statistical analysis.

φIψ ⊃ [w(φ ∧ ψ) = w(φ) w(ψ)]    (35)

(∨_i x_i ≡ 1p) ∧ (∨_{i≠j} (x_i ∧ x_j) ≡ 0p) ⊃ [w_a(x_m) = w(x_m) w_{x_m}(a) / Σ_i w(x_i) w_{x_i}(a)]    (36)
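Theorem (36) is the Bayes-style law: given mutually exclusive, jointly exhaustive hypotheses x_i, the truth value of x_m relative to evidence a is w(x_m)·w_{x_m}(a) normalized by the sum over all hypotheses. A small numerical sketch follows; the three hypotheses, their prior truth values, and the likelihoods are invented for illustration:

```python
# A sketch of the exhaustion (Bayes) law of theorem (36) with three
# exclusive, exhaustive hypotheses x1, x2, x3 and evidence a.
from fractions import Fraction

# Prior truth values w(x_i); they sum to 1 (exhaustiveness).
prior = {"x1": Fraction(1, 2), "x2": Fraction(1, 3), "x3": Fraction(1, 6)}
# Relative truth values w_{x_i}(a) of the evidence under each hypothesis.
likelihood = {"x1": Fraction(1, 4), "x2": Fraction(1, 2), "x3": Fraction(1, 1)}

def posterior(m):
    """w_a(x_m) = w(x_m) * w_{x_m}(a) / sum_i w(x_i) * w_{x_i}(a)."""
    total = sum(prior[i] * likelihood[i] for i in prior)
    return prior[m] * likelihood[m] / total

print(posterior("x1"))                    # 3/11
print(sum(posterior(i) for i in prior))   # 1
```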

4.1.4 Lukasiewicz's Philosophical Investigations on the Relationship of Probability and Logic

4.2 Nilsson's Logic

Nilsson first published his proposed logic for probabilistic reasoning in his 1986 paper "Probabilistic Logic" [Nil86], some 70 years after Lukasiewicz's publication of his original proposal. Unlike Lukasiewicz's work, Nilsson's approach has already become integrated into AI theories as a logic for probabilistic reasoning. It has been featured in handbooks of logic in artificial intelligence [Gab94] and in AI textbooks such as [Tor95]. In a retrospective paper [Nil93], Nilsson notes that his ideas in "Probabilistic Logic" took shape while participating in the PROSPECTOR project [Dud80], against the backdrop of the Dempster/Shafer formalism [Sha79] and Zadeh's fuzzy logic [Zad75]. Nilsson's main goal was "to present an intuitively reasonable but foundational account of the problem of uncertain reasoning." (Nilsson was apparently unaware of earlier, similar work done by de Finetti [DeF74], Good [Good50] and Smith [Smi61].) Nilsson discovered that when he assumed probabilistic truth values for propositions and tried to generalize modus ponens, the consequent's probability would remain under-determined. (See below for a discussion of this feature of Nilsson's generalization.)

In his paper "Probabilistic Logic" [Nil86], Nilsson takes a fresh approach to the semantics of probability statements. In the Nilssonian world, probabilities are assigned to sentences. The probability evaluation is made by reference to possible worlds and to the probability that the actual world is some one of those possible worlds. While Lukasiewicz's probability assignment and his semantics of probability statements depend on the central notion of indefinite propositions, in

Nilsson's semantics, the central notion is that of possible worlds. Lukasiewicz arrives at partial truth (i.e. probability) values of indefinite propositions by considering the fraction of assignments to the free variable (in the indefinite proposition) which lead to a true definite proposition. Nilsson arrives at partial truth values by assuming the existence of a multitude of possible worlds, in each of which a definite proposition (a sentence) may be either true or false. The actual world belongs either to the set of worlds where the definite proposition is true or to the set of worlds where the definite proposition is false. Of course, if the state of the actual world with respect to this bisection of the possible worlds is known, we shall be certain about the truth or falsity of the definite proposition. Our uncertainty, if any, arises from the fact that we do not know for certain whether the actual world falls into the set of worlds where the definite proposition is true or into the set of worlds where it is false. This uncertainty can be modeled by assigning a probability Pt to the state of the actual world belonging to the set of possible worlds where the definite proposition is true.[17] So, we arrive at the following rather awkward hypothesis. When we assign a truth value of Pt to a definite proposition Φ, we are really interpreting the state of affairs such that, with probability Pt, our actual world falls into the set of possible worlds where Φ is true and, with probability Pf = 1 − Pt, our actual world falls within the set of those possible worlds where Φ is false.

What distinguishes Nilsson from Lukasiewicz is their stark philosophical difference in understanding and coming to grips with the notion of probabilities.
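The possible-worlds reading just described can be sketched computationally: a sentence's probability is the total probability of the worlds in which it is true. The three worlds, their probabilities, and the atomic sentences below are invented for illustration and are not an example from Nilsson's paper:

```python
# A sketch of possible-worlds semantics for sentence probabilities.
from fractions import Fraction

# Each possible world fixes the truth of every atomic sentence; the
# probability attached to a world is the probability that it is actual.
worlds = [
    ({"P": True,  "Q": True},  Fraction(1, 4)),
    ({"P": True,  "Q": False}, Fraction(1, 4)),
    ({"P": False, "Q": True},  Fraction(1, 2)),
]

def probability(sentence):
    """Sum the probabilities of the worlds where the sentence holds."""
    return sum(pr for valuation, pr in worlds if sentence(valuation))

P = lambda v: v["P"]
P_implies_Q = lambda v: (not v["P"]) or v["Q"]

print(probability(P))            # 1/2
print(probability(P_implies_Q))  # 3/4
```

Note how the uncertainty lives entirely in which world is actual: within any single world, every sentence is simply true or false.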
We will have more to say on this, but here we shall simply note that for Lukasiewicz the world is a given quantity of fixed facts,[17] and probabilities or partial truths arise when we seek to assign truth values to propositions which are, when properly stated, equivalent to some indefinite proposition. In other words, the truth of any given definite proposition is a matter of inspection. It may very well be the case that we shall not be well-equipped to carry out the second inspection, because it may involve examining an unmanageably large set of "definitizations".[18] Our practical inability to carry out such inspections does not indicate any inherent uncertainty, but rather a kind of inherent imperfection in our abilities. The world remains fixed and certain; we only fail to inspect it completely and perfectly. When we are faced with a question requiring a complete and perfect inspection, we may be led, by practical constraints, to make partial truth value judgements (i.e. to make statements of probability) based on an extrapolation of an imperfect inspection into a hypothetical, perfect inspection.

In Nilsson's world, partial truth becomes an inherent feature of definite propositions. Our imperfection is at the level of the first kind of inspection noted above. No matter how long we inspect the world, we may still not be able to determine the truth or falsity of a given definite proposition, because the proposition can be true in some possible worlds and false in other possible worlds, and the assumption of the existence of possible worlds inherently arises from the assumption that our world could possibly be any one of these possible worlds, with some particular probability in each case. In fact, we cannot state, definitely, which one of the possible worlds represents our own world.

This sort of analysis has an inherent logical flaw in it. If a definite proposition can bisect the set of possible worlds, and if the actual world is one of these possible worlds, then the actual world must fall into one of these two sets of worlds, i.e. the proposition must either be true or false in the actual world. But, according to Nilsson's approach, we are uncertain as to which subset of the set of possible worlds the actual world belongs to. This uncertainty is expressed in our assigning probabilities to each of the two states of affairs. Here, we have a subjective theory of probabilities. In Lukasiewicz, we have an objective theory of probabilities. This dichotomy has been noted by other philosophers of probabilistic reasoning.

17 This assumption is taken to hold from the perspective of the intelligent agent.
18 A "definitization" of an indefinite proposition is the definite proposition which is obtained by some assignment to the free variable in the indefinite proposition.

4.2.1 Carnap's Views on Reasoning with Probabilities

Here, we shall give a brief discussion of Carnap's views on logics of probability. Although Nilsson makes no significant use of Rudolf Carnap's work on the logic of probabilities, probabilistic reasoning and decision theory, Carnap's general influence on modern writings cannot be denied. Carnap stated, modified and refined his ideas on the logical foundations of probabilistic reasoning in a large body of work, the most seminal of which is his Logical Foundations of Probability, published in 1950. In his writings, Carnap sought to give precise meanings to various notions central to his theories: notions such as the subjective value of a possible act, rational decision theory, normative decision theory, statistical probability, personal probability, actual degree of belief, rational degree of belief, actual decisions, and rational decisions. For a brief exposition of Carnap's views on these concepts, see [Car71]. Here, we shall give a quick review of these concepts as they bear on drawing distinctions between Nilsson's and Lukasiewicz's proposals.

In [Car71], Carnap begins by pointing to the pragmatic goal of the scholarly thinking on probabilities: decision theory. According to Carnap, decision theory involves the concepts of "utility" and "probability." He argues that, in the context of decision theory, we must think of the notion of "probability" not as "relative frequency" but rather as "degree of belief." Note that Lukasiewicz's views can easily be misconstrued as being based on what Carnap sees to be a "relative frequency" foundation of probabilities. However, as we saw earlier, Lukasiewicz had already distanced his views from both "relative frequencies" and "degree of belief".[19] Lukasiewicz believed that he had solved the foundational enigma of probability theory not just by showing concepts such as "relative frequency" versus "degree of belief" to be inadequate, but also by demonstrating that the notion of probability simply arises because we seek to associate truth values with indefinite propositions. It is Lukasiewicz's solution for assigning such truth values that may make his position appear to be one that bases probabilities on "relative frequency". But his partial truth assignments require not simply relative frequencies but rather a finitistic semantics of truth. Nilsson's proposal seeks to solve decision-theoretic problems, and, as with Carnap, he sees the concept of "degree of belief" as more constructive for this theory.
Carnap notes that "degree of belief" is a "psychological concept, referring to actual beliefs of actual human beings." Carnap's view here finds reverberations in modern works, as we shall see in our analysis of how Nilsson has formulated his logic of probabilistic reasoning. In the Nilssonian logic of probability, probabilities come into play to model our uncertainty about the actual world. (With some probability Pi, the agent believes that its actual world belongs to some subset of L-consistent worlds, Wci.)

19 Lukasiewicz does not even use terms such as "relative frequency" or "degree of belief".

In decision theory, a person/agent[20] X at a certain time T has to make a choice among possible acts A_1, A_2, .... Here, X knows that the possible states of nature[21] at time T are W_1, W_2, ....[22] But X does not know which of these is the actual world. In this simple presentation of Carnap's ideas, we take the number of possible states and the number of possible acts to be finite.[23] According to Carnap's assumptions, X has the foreknowledge that action A_m in the state W_n will lead to the unique outcome O_{m,n}. We can also assume that X has a utility function U and that the agent X knows its utility function, so that it can use the function to compute the value of an action at time T. The utility value of an action defines the desirability of that action:

v_{X,T}(A_m) = \sum_n \left[ U(O_{m,n})\, P(W_n) \right] \qquad (37)

Here, P(W_n) is the probability that our world will be in state W_n at time T. Carnap takes this valuation to be the subjective value, or desirability, for X, of taking action A_m at time T. This value is simply the expected utility of the outcome of A_m.[24]

20 The terms "person" and "agent" have often been given quite different meanings. So, the question arises as to why they are both used here, somewhat interchangeably. The simple answer is that in most decision theories, as well as in most everything else that uses utility theory, the person and the agent are treated as one, with the agent's unique utility function masquerading as the agent's personality. See [Sma87] for more on how close utilitarianism comes to equating persons with (utilitarian) agents.
21 These are what we shall call the set of "L-consistent" possible worlds for our set of L sentences.
22 "Worlds" in our vocabulary.
23 As we shall see later, this finitistic assumption (also followed by Nilsson) will save us the extra trouble of forming consistent subsets of possible worlds. The assumption of infinity will also lead to infinite truth value matrices in the Nilssonian formulation and will require more advanced mathematical tools in order to deal with infinite matrix multiplications, infinite semantic trees, etc.
24 Coming from an intellectual background infatuated with quantum mechanics, and perhaps motivated by Heisenberg's uncertainty principle, Carnap makes some parenthetical remarks here: if the situation is such that the probability of W_n could possibly be influenced by the assumption that act A_m were carried out, we should take the conditional probability P(W_n | A_m) instead of P(W_n).

The Bayesian rule of decision making says that

Choose an act among A_1, A_2, ... so as to maximize the value of v_{X,T}(A_m).

(Except for a minor correction, this quote is directly from Carnap [Car71].) In other words, for person X,

A_{opt;X,T} = \max_m \{ v_{X,T}(A_m) \} \qquad (38)

Carnap notes that the optimality condition (38) is interpreted in two distinct ways: either decisions made are some A_{opt;X,T}, or a rational decision consists in the choice of an act of the type A_{opt;X,T}. While the first interpretation belongs to a descriptive decision theory, the second belongs to a normative decision theory. A descriptive decision theory can be part of a psychological theory which seeks to give a description of agent-acts. A normative decision theory simply states conditions of rationality for decisions. In keeping with the main goals of this paper, we shall focus not on Carnap's discussion of the normative/descriptive distinction, but on his analysis of the probability assignments P(W_n). He states that, within the context of decision theory, we must see which conception of probability is more appropriate and adequate for valuations such as the one in equation (37). Carnap notes that the main conceptions of probability can be divided into the statistical (objective) and the personal (subjective).[25] According to Carnap, the concept of statistical probability is closely connected to "relative frequencies in mass phenomena," useful in the investigations of statistical mathematics and physics as well as in the empirical sciences. Personal probability, however, is assigned to a proposition or event e by a person or agent X, based on the degree of belief which X has in e. As statistical probabilities are not known to an agent X, it consequently becomes rather difficult to use such probabilities in either a normative or a descriptive decision theory, both of which make certain assumptions regarding X's cognizance of P(W_n) in equation (37). One may then draw a useful distinction between actual and rational degrees of belief to support theories of actual and rational decisions, as Carnap suggests. However, allowing such a distinction (as Carnap seems to advocate) may lead to a non-refutable, descriptive decision theory, turning psychological theories which make use of it into closed systems, far from Carnap's requirements for a scientific theory. We leave this short analysis of Carnap's views and return to a discussion of Nilsson's proposal for a logical foundation of probabilistic reasoning, simply noting that Carnap's views have relevance to a theory of probability logic based on degrees of belief.

25 Carnap writes: "As I see it, these are not two incompatible doctrines, but rather two theories concerning two different probability concepts, both of them legitimate and useful." This quote is from [Car71].
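As a numerical aside, Carnap's valuation (37) and the Bayesian optimality rule (38) can be sketched in a few lines; the acts, utilities and state probabilities below are invented for the example:

```python
# Hypothetical numerical sketch of Carnap's valuation (37) and the
# Bayesian optimality rule (38); all figures are invented.
utilities = [          # utilities[m][n] = U(O_{m,n})
    [10.0, 0.0, 5.0],  # act A_1: high payoff only in some states
    [4.0, 4.0, 4.0],   # act A_2: a safe constant payoff
]
p_states = [0.2, 0.5, 0.3]   # P(W_n): degrees of belief in the states W_n

def subjective_value(m):
    """v_{X,T}(A_m) = sum_n U(O_{m,n}) P(W_n)  -- equation (37)."""
    return sum(u * p for u, p in zip(utilities[m], p_states))

values = [subjective_value(m) for m in range(len(utilities))]
optimal_act = max(range(len(values)), key=values.__getitem__)  # rule (38)
print(values, optimal_act)   # here the safe act A_2 maximizes (37)
```

With these invented figures the expected utilities are 3.5 and 4.0, so rule (38) selects the second act.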

4.2.2 Possible Worlds and Probabilities

Here, we present Nilsson's logic of probabilities. A definite proposition, or sentence, S can be either true or false. We can imagine two sets of possible worlds: Wt(S), in which S is true, and Wf(S), in which S is false. Our actual world, Wa, must be in one of the two sets, but we may not know which. We model this uncertainty by assigning the truth value Pt to Wa ∈ Wt(S). In Lukasiewicz's language, we may interpret the above statement as

w\big( (W_t(S) \subseteq W) \wedge (\forall \omega \in W_t(S):\ \omega \models S) \wedge (W_a \in W_t(S)) \big) = P_t \qquad (39)

With L sentences, we may have up to 2^L sets of possible worlds. However, there may be fewer sets of possible worlds, because some combinations of truth assignments to our L sentences may be inconsistent, and no possible world could satisfy such combinations of truth assignments. For example, let us consider the sentences P, P ⇒ Q and Q. There are only four combinations of consistent truth assignments to these sentences, namely ⟨1,1,1⟩, ⟨1,0,0⟩, ⟨0,1,1⟩ and ⟨0,1,0⟩. The other four truth assignments, which are combinatorially available, are ruled out. Assignments such as ⟨0,0,x⟩ are ruled out because when P is false, P ⇒ Q cannot be false. Assignments of the form ⟨x,0,1⟩ are ruled out because when Q is true, P ⇒ Q cannot be false.

To determine the set of all valid assignments for a given set of sentences (i.e. definite propositions, in Lukasiewicz's terminology), Nilsson presents the method of developing a binary semantic tree. This method is discussed thoroughly in [Nil86]. The significant point is that the method produces an algorithm for determining all valid combinations of truth assignments. The general algorithm can be summarized as follows. Because at the n-th level of the tree the binary branching occurs according to the two available truth assignments to the n-th sentence, each path in the semantic tree corresponds to a unique assignment of truth values to all our sentences. Having arrived at a certain level, we close all branches along paths that have generated an inconsistent set of assignments.

Let us assume that, with some algorithm such as the one given by Nilsson and mentioned above, all consistent truth assignments have been identified. For each consistent combination of truth assignments c_i = ⟨a_{1i}, ..., a_{Li}⟩, where a_{1i} to a_{Li} are either 0 or 1, we have a corresponding set of possible worlds, W_{c_i}(S_1, ..., S_L), in whose members the combination c_i represents a valid truth assignment. The sets W_{c_i}(S_1, ..., S_L), for all consistent combinations c_i of truth assignments to the L sentences, give a sample space of worlds over which a probability distribution can be defined for our actual world. In other words, our actual world is a random variable which can take on any of the values in our sample space of possible worlds, in conformity with all consistent truth assignments to our L sentences. The probability distribution (over the sample space consisting of all possible worlds) specifies the probability P_i that the actual world belongs to the set of possible worlds W_{c_i}(S_1, ..., S_L).
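The consistency filtering described above can be sketched in a few lines; this is a brute-force stand-in for Nilsson's semantic-tree procedure (not the tree algorithm itself), applied to the example sentences P, P ⇒ Q, Q:

```python
# Enumerate the truth assignments to the three *sentences* P, P => Q, Q
# that are realizable in some possible world, by walking the valuations
# of the underlying atoms P and Q.
from itertools import product

def consistent_assignments():
    realizable = set()
    for p, q in product([True, False], repeat=2):   # atom valuations
        implies = (not p) or q                      # truth of P => Q
        realizable.add((int(p), int(implies), int(q)))
    return sorted(realizable, reverse=True)

print(consistent_assignments())
# -> [(1, 1, 1), (1, 0, 0), (0, 1, 1), (0, 1, 0)]: the four consistent vectors
```

The four vectors produced are exactly the consistent combinations listed in the text; the other four of the 2^3 combinations never appear.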
With this scheme of extracting the set of all possible worlds from the set of all consistent combinations of truth assignments to a given set of sentences, actual truth evaluation for that set of sentences reduces to assigning a belief measure (a probability distribution) to our actual world against the sample space of all possible worlds. In other words, the truth evaluation for the set of sentences is simply extracted from the belief measure about the state of the actual world. The truth values for the sentences are extracted from a matrix multiplication of all consistent sets of possible assignments by the probabilities of our world belonging to the set of possible worlds satisfying each of those consistent sets of assignments. The belief measure providing those probabilities becomes the missing link between combinatorics and matrix multiplication.

Let us turn from this verbiage to symbolism that may be suitable for the more algebraically minded. If we perform the following matrix multiplication,

\begin{pmatrix} c_1 & c_2 & \cdots & c_k \end{pmatrix}
\begin{pmatrix} P_1 \\ P_2 \\ \vdots \\ P_k \end{pmatrix}
=
\begin{pmatrix}
a_{11} & a_{12} & \cdots & a_{1k} \\
a_{21} & a_{22} & \cdots & a_{2k} \\
\vdots & \vdots & \ddots & \vdots \\
a_{L1} & a_{L2} & \cdots & a_{Lk}
\end{pmatrix}
\begin{pmatrix} P_1 \\ P_2 \\ \vdots \\ P_k \end{pmatrix}
=
\begin{pmatrix} v_1 \\ v_2 \\ \vdots \\ v_L \end{pmatrix}
\qquad (40)

we shall obtain the truth value assignments for each of the sentences S_1, ..., S_L. The j-th row (a_{j1}, ..., a_{jk}) of the combination matrix gives the truth value vector for the j-th sentence in each of the k (≤ 2^L) sets of possible (L-consistent[26]) worlds. Equation (40), written below in index notation,[27]

v_j = a_{ji} P_i \qquad (41)

states that the truth value v_j of a sentence S_j is determined by the weighted average of the truth values of that sentence over the sets of possible (L-consistent) worlds, with the probability weight P_i that our actual world belongs to the i-th set of possible worlds.
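Valuation (40)/(41) can be carried out directly for the running example; the rows of A below are the sentence rows a_{ji}, the columns are the four consistent worlds, and the belief distribution P is hypothetical:

```python
# v_j = a_{ji} P_i : sentence truth values as belief-weighted averages
# over the consistent sets of possible worlds.
A = [              # rows: sentences; columns: consistent worlds c_1..c_4
    [1, 1, 0, 0],  # P
    [1, 0, 1, 1],  # P => Q
    [1, 0, 1, 0],  # Q
]
P = [0.4, 0.1, 0.3, 0.2]   # hypothetical degrees of belief, summing to 1

v = [sum(a * p for a, p in zip(row, P)) for row in A]
print(v)   # approximately [0.5, 0.9, 0.7]
```

Under this (invented) belief measure, the sentences P, P ⇒ Q and Q receive probabilistic truth values of about 0.5, 0.9 and 0.7 respectively.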

4.2.3 Entailment and Reasoning with Uncertain Beliefs

Exposition of any logic must include a section on entailment, implication, or reasoning with that logic. While discussing Lukasiewicz's logic, we presented his treatment of logical implication in section (4.1.2). We will now turn to Nilsson's treatment of reasoning with his logic of uncertain beliefs. Nilsson's analysis of entailment will then be compared with Lukasiewicz's analysis.

26 By an L-consistent set of worlds, I mean the set of all worlds which assign the same truth values to our L sentences.
27 Einstein's summation notation has been used, i.e. summation is assumed over repeated indices in each term. Here, i is the repeated index in the term on the right-hand side, so summation is assumed over the range of i. Note that in our expression, 1 ≤ j ≤ L and 1 ≤ i ≤ k ≤ 2^L.

In his retrospective paper [Nil93], Nilsson notes: "The key intellectual contribution of 'Probabilistic Logic' was a formal procedure for calculating the bounds on the probability of a sentence in the predicate calculus given the probabilities (or the bounds on the probabilities) of other sentences." This is a process which Nilsson calls "probabilistic entailment." Given that a sentence S_{L+1} is derived from some subset of sentences in our collection, we may add it to our collection and get the following matrix equation, which gives the first constraint on v_{L+1}, the probability value for S_{L+1}:

\begin{pmatrix} c_1 & c_2 & \cdots & c_{k'} \end{pmatrix}
\begin{pmatrix} P_1 \\ P_2 \\ \vdots \\ P_{k'} \end{pmatrix}
=
\begin{pmatrix}
a_{11} & a_{12} & \cdots & a_{1k'} \\
a_{21} & a_{22} & \cdots & a_{2k'} \\
\vdots & \vdots & \ddots & \vdots \\
a_{L+1,1} & a_{L+1,2} & \cdots & a_{L+1,k'}
\end{pmatrix}
\begin{pmatrix} P_1 \\ P_2 \\ \vdots \\ P_{k'} \end{pmatrix}
=
\begin{pmatrix} v_1 \\ v_2 \\ \vdots \\ v_{L+1} \end{pmatrix}
\qquad (42)

where k' is the new number of possible sets of (L+1)-consistent worlds W_{c_i}(S_1, ..., S_L, S_{L+1}), and the column vector c_i gives the truth values for S_1, ..., S_{L+1} in all the "possible" worlds in the set W_{c_i}(S_1, ..., S_L, S_{L+1}). Note that W = ∪_{i=1}^{k'} W_{c_i}(S_1, ..., S_L, S_{L+1}) will produce the set of all possible worlds, because the union is taken over all possible, consistent assignments to the sentences S_1, ..., S_{L+1} offered by the assignment vectors c_i. The following equation, which states plainly that our world must belong to W = ∪_{i=1}^{k'} W_{c_i}(S_1, ..., S_L, S_{L+1}) with a probability of 1, imposes a second constraint on entailment:

\sum_{i=1}^{k'} P_i = 1 \qquad (43)

In extreme cases, P can take on the value of a unit vector e_i in the k'-dimensional space; i.e., our world can be believed to belong to W_{c_i}(S_1, ..., S_L, S_{L+1}) with a degree of belief equal to certainty (with probability 1), but not to W_{c_j}(S_1, ..., S_L, S_{L+1}) for any j ≠ i.
4.2.4 Nilsson's Cube and the Convex Hull of Possible Worlds

The analysis in the previous paragraph makes a simple use of constraint (43) and motivates the following more concrete analysis, Note that we have k0 consistent assignments of the form ci. These assignments form vectors pointing at some of the 2L vertices of a unit cube, C, in the (L +1)-dimensional space of our L +1 sentences. The totality of the 2L vertices of this (L + 1)-dimensional cube, C, give all truth assignments to the sentences S ; : : :; SL , both consistent and inconsistent assignments. Only k0 of these assignments (vertices) are consistent truth assignments realizable in some possible worlds. These consistent truth assignments (vertices) are denoted by ci with i = 1; : : : ; k0. The analysis above provides us with the needed concepts in assigning a \truth value," probability value, or degree of belief to the SL sentence. Note that since this sentence is composed of the previous Si , i = 1; : : :; L, sentences, its truth value can be deduced (\composed") from the truth values of the previous L sentences. However, we will show, following Nilsson that only a range of possible truth values can be deduced for sentences such as SL which are composed of the previous L sentences. Now, as was said earlier, some of the vertices of C coincide with the end points of the vectors ci, i = 1; : : : ; L + 1. We can call this subset of the 2L vertices of the unit cube C the \possible world" subset of vertices and denote them by the symbol V. The convex hull of V forms a region within the (L + 1)-dimensional unit vector. Call this region the \possible world" region, and denote it by R. All truth value (probability, degree of belief) assignments to the sentences S ; : : :; SL must lie within R  C. Note that the  relationship here is strict because once SL is composed of the earlier L sentences, its truth assignment is constrained by the other truth assignments. +1

+1

1

+1

+1

+1

+1

1

+1

+1

4.2.5 A Truth Value Range for Entailed Sentences

Having found the probabilistic truth value of the the rst L sentences by means of solving equations such as valuation (40) for vi, i = 1; : : :; k, we can 25

use geometry to compute the range of probabilistic truth value that can be assigned to SL by going to the point v = (v ; v ; : : :; vn; 0) on one of the faces of the C. We can then consider a line whose parametric equation, in terms of parameter t[0; 1], is given by +1

1

2

x = v x = v ... ... ... xL = vL xL = vL 1

1

2

2

+1

(44) (45) (46) (47) (48)

+1

Note that v ; : : : ; vL are already known values. Equation (48) will have to be solved subject to the constraint that the solution points should lie within the \possible world" region R. Since this region is a convex hull of many of the vertices of C it is highly unlikely that it will be a plane. Thus, only a range of solutions will be possible for the truth value of SL . 1

+1

4.2.6 Summary of Nilsson's Analysis of Entailment

To summarize our analysis above, in Nilsson's proposed logical foundation of probability reasoning, given L probability assignments v ; v ; : : :vL to L sentences S ; S ; : : :SL, we can only expect to nd a bound (not a precise value) for teh (L + 1)-th sentence SL whose truth value may be deduced from some subset of S ; S ; : : :SL . 1

1

2

2

1

2

28

+1

4.2.7 Entailment Interpreted

Probabilistic entailment receives such wildly di erent interpretations in NIlsson's and Lukasiewicz's works becuase the two authors have wildly di erent for probabilistic reasoning. Nilsson begins with the \probability" truth values for L em sentences. These values are based (through valuation (40)) on our degree of belief regarding the probability of our world Wa belonging to 28For example, SL+1 may be composed of some subset of the earlier sentences by means

of logical connectives.

26

some set of L-consistent \possible worlds" Wc (S ; : : : ; SL). He then derives a bound on an \entailed" sentence SL . Lukasiewicz, in his analysis [Luk13], derives the following theorem 29

i

1

+1

(' ) )  [w(') + w(:' ^ ) = w( )] (Theorem 13 ) The inde nite proposition ' implies the inde nite proposition , iff , the probability (truth) value w(') assigned to ' is equal to the probability (truth) value w( ) assigned to the inde nite proposition , minus the probability (truth) value w(:' ^ ) assigned to the inde nite proposition :' ^ . The emphasis here is to point to the completelydi erent approach Lukasiewicz takes. He only assigns probability (truth) values to inde nite propositions, where as Nilsson assignes them to sentences . For Lukasiewicz, speaking about probabilities arises because we wish to assign truth values to indefinite propositions. For Nilsson, we speak in uncertain terms due to the uncertainty in our belief about the precise state of our world. We are uncertain as to which one of the subsets of the set of all \possible worlds" our world belongs . For Lukasiewicz, probabilities are partial truths. For Nilsson, probabilities are due to the uncertain nature of the actual world. This may be a simplistic comparison of the two prespectives, but it underlines the fundamentally di erent results obtained in the two systems and emphasizes the importance of work carried out to fuse the two models in works such as Halpern's [Halp90]. 30

31

4.2.8 Interdependence of Degrees of Beliefs in Possible Worlds and (Partial) Truth Values Nilsson shows that once probabilities P (' ) ) and P (') are assigned to

propositions ' and , we could obtain the following range on the probability P ( ) of proposition :

P (') + P (' ) ) ? 1  P ( )  P (' ) )

(49)

29It may be more proper to think of this \entailed" sentence as one whose truth value is constained by all others due to the fact that it is composed (by logical connectives) of the others. 31How each of these subsets are de ned has already been discussed earlier. \Possible worlds" are collected into L-consistent subsets due to a consistent truth assignment to a set of L sentences.

27

Lukasiewicz's comparable result is 24: (' ) )  [w(') + w(:' ^ ) = w( )]

(50)

Note that semantically P (:) is a very di erent mapping from w(:). However, for the purposes of this discussion, ignoring the technical di erences, we see that if ' ) is taken to be true, expression 49 will reduce to

P (')  P ( )  1

(51)

which is a weaker constraint than the right hand side of 50, which is equivalent to the assumption of truth for ' ) . We see that Lukasiewicz's theorem, in case ' ) is true gives a stronger result than Nilsson's approach. The di erence has to do with the fact that for Lukasiewicz, the propositions receiving partial truths are inde nite, and therefore, he can say more about their relationship. (See Def1 above.) Expression

4.2.9 Conditional Probabilities 4.3

Halpern's Logic

Joseph Halpern's program to desctibe and characterize logics of probabilities springs from what one may call the instantiation problem. We face this problem when we seek to move from probabilities of general statements such as \The probability that all birds can y is more than 0.9" to particular statement such as \The probability that Tweety can y is greater than 0.9." First, let me describe the useful application of such probabilistic reasoning. In insurance and credit evaluations, one often appears to be using such reasoning to estimate, from probabilities of general propositions, the death and default probabilities for a particular individual. In such cases, there are certain propositions such as \(all ) men are mortal" and \(all) wealth is 32

32The \all" in this and the next proposition have been put in paranthesis becuase

the same propositions can be stated in the inde nite Lukasiewicz form with the same probabilistic semantics. The inde nite proposition \if x is a (man or woman) then x is mortal" has a probability of 1, in the sense of Lukasiewicz.

28

nite" which hold with a probability of 1. In the bird example given below, there's a subtle problem lurking in the fact that the inde nite proposition \if x is a bird, x can y" has a probability which is less than one. Therefore, from such a statement one could not deduce any reasonable probability statement about birds. To emphasize the issues here, note that simply becuase a particular man or woman is mortal with a probability 1, then the probability that that man or woman will die after a certain number of years can (and after some threshold of time, must) be non-zero. If an animal is a bird, on the other hand, there is no guarantee that it will y. So, there is always a non-zero probability that a particular bird cannot y. Halpern's instantiation problem, as its solution, demands an answer to following type of question : From P ('(x)) = p, how can we deduce P ('( )) = q where x is a variable, P the probability (truth value) of propositions, a particular individual, p and q particular probabilities. 33

4.3.1 The Logical Language

There is a great deal of similarity between Halpern's notation and that of Lukasiewicz, as noted above. The extensions made by Halpern to Lukasiewicz's notation are natural ones. Consider an inde nite formula such as Son(x; y), which is the \Love" predicate|begin true i x is son of y and false otherwise. Using the w truth value operator introduced by Lukasiewicz we can bind one or two of the variable and extract probability truth values . Therefore, wx(Son(x; y)) gives the probabilistic truth value that a randomly chosen x the son of y. Halpern's notational contribution here is that he realizes that the index x is needed in wx. He also extends this simple idea to incorporate ordered indexes. So, w x;y >(Son(x; y)) is the complete probabilistic truth value for the inde nite proposition Son(x; y); it gives the probablility that a randomly chosen pair hx; yi satis es the relationship \x is the son of y." Given this notational choice, Halpern de nes a two sorted language com34

35

h

i

33The question has been cast in terms of Lukasiewicz's inde nite propositions. 34Hereafter, we shall call this operator the Lukasiewicz operator. 35Lukasiewicz uses singular binding, implicitly, in his paper. See [Luk13]. He also calls

the result of applying this operator on inde nite propositions, the \truth value." I have chosen to call this result the \probabilistic truth value" at times and \truth value" at other times.

29

posed of  THE FIRST SORT: 1. A set  composed of predicate and function symbols for various arities, with function symbols of 0 arity referring to constants. 2. A countable set of object variables , xo : : :yo, which can appear in terms that are \predicated" by the predicates in  or are made arguments to functions in .  THE SECOND SORT: 1. Two eld constant symbols: 0 and 1 representing corresponding real numbers. 2. Two binary relationships: = and