Lecture 21: Probability Theory - Jonathan Livengood

Probability Theory Interpretations of Probability

Business Homework #9 is due. Homework #10 will be posted by the end of the day and is due on Monday, November 18.

Interpretations Probability is a measure.

But what is it a measure of?

Interpretations Lots of things satisfy the axioms of probability: mass, length, surface area, volume.

But these things don’t do the conceptual work that probability does. They are not good models of the axioms.

Interpretations Interpretations of probability answer the question of what it is that probability measures. Here are three popular views: relative frequency, evidential value, and degree of belief.

Interpretations Frequentist Interpretation

The probability of an event is the frequency with which the event occurs in a sequence of experiments.

Interpretations Evidentialist Interpretation

The probability of a sentence is the degree to which the sentence is supported by some evidence.

Interpretations Personalist Interpretation

The probability of a sentence is the degree of belief one has that the sentence is true.

Interpretations Let’s look at the interpretations of probability one at a time.

Frequentism

The probability of an event is the frequency with which the event occurs in a sequence of experiments.

Frequentism The frequentist interpretation of probability was first developed in the 19th century, but it really came into its own with the work of Fisher and von Mises in the 20th century.

Frequentism Finite frequentism begins with the idea that probability is an objective, observable property in the world. Probability just is the frequency with which some event occurs in a sequence of experiments.

Frequentism The probability of rolling a seven on a pair of dice is just the frequency with which seven comes up in repeated rolls.

Frequentism Suppose I have thrown the dice five times, and I’ve seen 4, 3, 8, 12, and 5. Seven has come up zero times in five throws, so the finite frequentist says the probability that the dice come up 7 on my next throw is zero.

Frequentism Suppose instead I throw the dice a thousand times, and 7 comes up 165 times. Then the probability of 7 on my next throw is 165/1000 = 33/200 ≈ 0.17.
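The finite frequentist calculation can be sketched in a few lines of Python. The function name `relative_frequency` is an illustrative choice. The 1000-throw record is a stand-in built to contain 165 sevens, since the slides give only the count and not the individual throws.

```python
def relative_frequency(outcomes, target):
    """Fraction of the observed outcomes that equal target."""
    return outcomes.count(target) / len(outcomes)

# The five throws from the example: no sevens were observed.
five_throws = [4, 3, 8, 12, 5]
print(relative_frequency(five_throws, 7))  # 0.0

# Stand-in record of 1000 throws with 165 sevens. The non-seven values
# are placeholders; only the count of sevens matters here.
thousand_throws = [7] * 165 + [2] * 835
print(relative_frequency(thousand_throws, 7))  # 0.165
```

On this view the probability is read directly off the record, which is why five throws without a seven yield a probability of zero.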

Frequentism Many people find this finite frequentism unsatisfying for one reason or another. Common objections include relativity to the universe of discourse, local determinism, probability in empty universes, explanatory weakness, and spurious values and relations.

Frequentism Some such people turn to hypothetical frequentism, instead. The probability of an event is the limit of the relative frequency with which it would occur in an infinite sequence of experiments.

Frequentism Hypothetical frequentism has its own problems.

Perhaps the most important is that it gives up on the close connection to the outcomes of actual, observable experiments.
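A simulation can illustrate the idea of a limiting relative frequency, though only loosely: any run a computer can finish is finite, so it suggests a limit rather than establishing one. The sketch below assumes two fair, independent dice modeled with Python's pseudo-random generator. The function name and the seed are illustrative choices.

```python
import random

def frequency_of_seven(num_throws, seed=0):
    """Relative frequency of a total of 7 in num_throws simulated throws."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    sevens = 0
    for _ in range(num_throws):
        # Each throw is the sum of two simulated six-sided dice.
        if rng.randint(1, 6) + rng.randint(1, 6) == 7:
            sevens += 1
    return sevens / num_throws

# Longer runs tend to settle near 6/36 = 1/6 under the fair-dice assumption.
for n in (100, 10_000, 200_000):
    print(n, frequency_of_seven(n))
```

Note that the value 1/6 comes from the model built into the simulation, not from any actual dice, which echoes the worry about losing touch with observable experiments.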

Evidentialism

The probability of a sentence is the degree to which the sentence is supported by some evidence.

Evidentialism Laplace may have been the first evidentialist, though his writings leave room for debate about how he interpreted probability.

Evidentialism In the 20th century, evidentialism has had many defenders.

Evidentialism The key idea for evidentialists is that probability theory is the logic of partial entailment.

Evidentialism In deductive logic, the premisses of an argument either entail the conclusion or they do not. In inductive logic, the premisses of an argument may partially entail the conclusion. That is, the premisses might confirm the conclusion or make it more likely to be true.

Evidentialism Hence, probability becomes the degree of evidential support that some premisses give to some conclusion. Note that in this case, we are thinking about probability as applying to sentences in a language, not as applying to sets.

Evidentialism The evidentialist usually constrains probability assignments to reflect observed relative frequencies – and so evidentialism inherits some of the virtues of finite frequentism. However, there are disagreements.

Evidentialism One major split between the two approaches is over how to handle prior probabilities. Frequentists say that there are no genuine prior probabilities. Evidentialists apply symmetry principles to derive so-called non-informative priors.

Evidentialism Suppose we have a six-sided die that no one has ever thrown before. What is the probability that the first throw turns up six? Frequentists say that this probability is undefined. Evidentialists usually say that the probability is 1/6.

Evidentialism One connection between evidentialism and Laplace is Laplace’s Principle of Insufficient Reason. The name calls up a contrast with Leibniz’s Principle of Sufficient Reason.

Evidentialism Leibniz: Nothing happens without a reason. For every fact, there is an explanation why it is as it is and not otherwise.

Evidentialism Laplace: But when we do not know what is the case, we should say that all possibilities are equally likely.

Evidentialism According to the Principle of Insufficient Reason: If an experiment has n possible outcomes, and I have no prior information, then I should assign the same probability, 1/n, to each possible outcome.

Evidentialism However, symmetry principles are not always so straightforward to apply. Suppose I have a box with a label that says, “Inside this box is a cube, and each edge of that cube has a length between 1 cm and 3 cm.”

What is the probability that the edge length of the cube in the box is less than 2 cm?

Evidentialism By a symmetry argument, the probability should be 1/2: the edge length lies somewhere between 1 cm and 3 cm, and 2 cm is the midpoint.

Evidentialism But now, consider. What is the probability that the area of one of the faces is less than 4 cm²? The face area lies somewhere between 1 cm² and 9 cm², so by the same kind of symmetry argument, the probability should be 3/8. Yet a face has area less than 4 cm² exactly when the edge length is less than 2 cm, so the two answers conflict.
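The conflict can be checked with exact arithmetic. The sketch below applies a uniform distribution first to the edge length and then to the face area. The helper name `uniform_prob_below` is illustrative, and the volume case is an extra illustration not in the slides, added to show that a third parameterization gives a third answer.

```python
from fractions import Fraction

def uniform_prob_below(threshold, low, high):
    """P(X < threshold) when X is uniform on the interval [low, high]."""
    return Fraction(threshold - low, high - low)

# Uniform over edge length in [1, 3] cm: P(edge < 2 cm).
p_edge = uniform_prob_below(2, 1, 3)

# Uniform over face area in [1, 9] cm^2: P(area < 4 cm^2).
p_area = uniform_prob_below(4, 1, 9)

# Uniform over volume in [1, 27] cm^3: P(volume < 8 cm^3).
p_volume = uniform_prob_below(8, 1, 27)

print(p_edge, p_area, p_volume)  # 1/2 3/8 7/26
```

All three events describe the same set of cubes, yet each choice of quantity to treat as uniform yields a different probability.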

Personalism

The probability of a sentence is the degree of belief one has that the sentence is true.

Personalism Personalism is usually associated with Bayes, although careful articulation and defense of the position had to wait for the 20th century work of probabilists like de Finetti and Savage.

Personalism Personalism is also sometimes called subjectivism.

Both personalism and evidentialism are often called Bayesian interpretations of probability.

Personalism The motivating idea behind personalism is that the probability one assigns to a sentence is a measure of the degree of belief (or uncertainty or ignorance) that one has with respect to the sentence.

In short, probability is in the head.

Personalism How does one determine the degree of belief that one has with respect to a sentence? Beliefs motivate actions. Hence, the degree of belief one has that some event will occur is reflected in the betting odds one will accept for that event.
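One common way to make the link between betting odds and degree of belief concrete is the fair-bet reading, sketched below. This is an assumption brought in for illustration, not a formula from the slides: if an agent regards risking a stake to win a payout as exactly fair, the credence p satisfies p · payout − (1 − p) · stake = 0. The function name is hypothetical.

```python
from fractions import Fraction

def credence_from_fair_bet(stake, payout):
    """Credence at which risking `stake` to win `payout` has zero expected gain.

    Solves p * payout - (1 - p) * stake = 0 for p.
    """
    return Fraction(stake, stake + payout)

# Risking 1 to win 1 is fair for an agent with credence 1/2.
print(credence_from_fair_bet(1, 1))  # 1/2

# Risking 1 to win 3 is fair for an agent with credence 1/4.
print(credence_from_fair_bet(1, 3))  # 1/4
```

Read this way, the odds an agent will accept pin down a number between 0 and 1 that can serve as the agent's probability for the event.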

Personalism For personalists, probability is not an objective feature of the world. In order to flag this fact, the following terminology is helpful: The chance that an event occurs is an objective feature of the world. The credence one has that an event occurs is a subjective feature of that agent.

Personalism Personalists are difficult to distinguish from evidentialists because they often adopt constraints on rational credences. For example, in order for some credences to be rational, they have to satisfy the axioms of probability.

Personalism The biggest difference between personalists and evidentialists is that evidentialists think there is always a uniquely correct credence given some evidence. Personalists disagree.

Personalism If probability is subjective, how can it support science?

Given enough evidence, the priors get swamped out.

Personalism Suppose there is a magician, Ricky Jay. Suppose that Ricky always carries two coins in his pocket. One is fair and one is double-headed.

Personalism Ricky is going to pull one coin out of his pocket, flip it several times, and report the outcomes. (Without showing the coin.)

Personalism Let’s label the sentences that we care about as follows:

Let d = the double-headed coin is drawn, and let ¬d = the fair coin is drawn. Let h_n = the first n flips come up heads.

Personalism Now suppose that two people, Millie and Fred, assign personal prior probabilities as follows:

Pm(d) = Pm(¬d) = 1/2

Pf(d) = 1/4

Pf(¬d) = 3/4

Personalism Ricky flips the coin five times and reports five heads. What posterior probabilities will Millie and Fred assign to the event d?

Personalism Suppose that for both people, P(hhhhh|d) = 1 and P(hhhhh|¬d) = 1/32 (a fair coin gives five heads in a row with probability (1/2)^5). Then, by Bayes' theorem:

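$$P_m(d \mid hhhhh) = \frac{1 \cdot \frac{1}{2}}{1 \cdot \frac{1}{2} + \frac{1}{32} \cdot \frac{1}{2}} = \frac{1/2}{33/64} = \frac{32}{33} \approx 0.97$$

$$P_f(d \mid hhhhh) = \frac{1 \cdot \frac{1}{4}}{1 \cdot \frac{1}{4} + \frac{1}{32} \cdot \frac{3}{4}} = \frac{1/4}{35/128} = \frac{32}{35} \approx 0.91$$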
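The two posteriors can be reproduced with a short Bayes' theorem calculation. The sketch below assumes, as in the example, that the double-headed coin always lands heads and the fair coin lands heads with probability 1/2 on each flip. The function name is an illustrative choice.

```python
from fractions import Fraction

def posterior_double_headed(prior_d, n_heads):
    """P(d | first n_heads flips are heads), by Bayes' theorem."""
    likelihood_d = Fraction(1)                    # double-headed coin: always heads
    likelihood_fair = Fraction(1, 2) ** n_heads   # fair coin: (1/2)^n
    numerator = likelihood_d * prior_d
    return numerator / (numerator + likelihood_fair * (1 - prior_d))

millie = posterior_double_headed(Fraction(1, 2), 5)
fred = posterior_double_headed(Fraction(1, 4), 5)

print(millie, float(millie))  # 32/33, about 0.97
print(fred, float(fred))      # 32/35, about 0.91
```

Using `Fraction` keeps the arithmetic exact, so the results match the hand calculation term by term.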

Personalism Bernstein-von Mises Theorem (first proved by Doob and later extended by Freedman and Diaconis). Under some weak assumptions, the posterior probabilities of two agents will become arbitrarily similar as the amount of data seen increases.
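The coin example gives a small illustration of this kind of convergence. It is not a demonstration of the theorem itself. The sketch below assumes Ricky keeps reporting heads and tracks the gap between the posteriors of an agent with prior 1/2 and an agent with prior 1/4 for the double-headed coin. The function names are illustrative.

```python
from fractions import Fraction

def posterior_d(prior_d, n_heads):
    """P(d | n_heads heads in a row) for a given prior on the double-headed coin."""
    fair_likelihood = Fraction(1, 2) ** n_heads
    return prior_d / (prior_d + fair_likelihood * (1 - prior_d))

def posterior_gap(n_heads):
    """Difference between the posteriors of the prior-1/2 and prior-1/4 agents."""
    return posterior_d(Fraction(1, 2), n_heads) - posterior_d(Fraction(1, 4), n_heads)

# The gap shrinks as more heads are reported.
for n in (5, 10, 20, 40):
    print(n, float(posterior_gap(n)))
```

After five heads the two agents differ by 64/1155, a bit under 0.06, and the difference keeps shrinking with each further head.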

Next Time We will begin thinking about statistics!