Physically Based Rendering (600.657)
Monte Carlo Integration

Probability
Definition: The probability density function (PDF) p(x) of a random variable X on a domain \Omega describes the relative likelihood of the variable taking on a value at a given point in \Omega:
1. p(x) \ge 0
2. \int_\Omega p(x)\,dx = 1
Note: Given any subdomain D \subseteq \Omega, the probability of the variable occurring within D is:
\Pr\{X \in D\} = \int_D p(x)\,dx
Definition: For a (1D) domain \Omega = [a,b], the cumulative distribution function (CDF) P(x) of a random variable X is the probability that a value drawn from the variable's distribution is less than or equal to some value x:
P(x) = \Pr\{X \le x\} = \int_a^x p(x')\,dx'
Note: Given a CDF on the domain \Omega = [a,b], we have:
P(a) = \int_a^a p(x)\,dx = 0
P(b) = \int_a^b p(x)\,dx = 1
And more generally:
\frac{dP}{dx} = p(x)
Definition: Given a PDF p(x_1,x_2) on the domain \Omega_1 \times \Omega_2, the marginal density function of x_1 \in \Omega_1 is obtained by integrating out one of the dimensions:
p(x_1) = \int_{\Omega_2} p(x_1,x_2)\,dx_2
Note: The marginal density function is also a PDF because it is non-negative and:
\int_{\Omega_1} p(x_1)\,dx_1 = \int_{\Omega_1}\int_{\Omega_2} p(x_1,x_2)\,dx_2\,dx_1 = 1
Definition: The marginal densities p_1 and p_2 are independent if the probability distribution on \Omega_1 \times \Omega_2 is the product of the marginal probability distributions on \Omega_1 and \Omega_2:
p(x_1,x_2) = p(x_1)\,p(x_2)
Definition: Given a PDF p(x_1,x_2) on the domain \Omega_1 \times \Omega_2, the conditional density function of x_2 \in \Omega_2 given x_1 \in \Omega_1 is the joint density divided by the marginal density:
p(x_2 \mid x_1) = \frac{p(x_1,x_2)}{p(x_1)}
Note: The conditional density function is a PDF because it is non-negative and:
\int_{\Omega_2} p(x_2 \mid x_1)\,dx_2 = \int_{\Omega_2} \frac{p(x_1,x_2)}{p(x_1)}\,dx_2 = \frac{1}{p(x_1)} \int_{\Omega_2} p(x_1,x_2)\,dx_2 = \frac{p(x_1)}{p(x_1)} = 1
Definition: Given a PDF p on a domain \Omega, the expected value, E_p[f], of a function f is the average value of the function over the distribution of values p(x) on its domain:
E_p[f] = \int_\Omega f(x)\,p(x)\,dx
Note: For functions f, f_1, f_2 and constant c:
E_p[c] = \int c\,p(x)\,dx = c \int p(x)\,dx = c
E_p[cf] = \int c\,f(x)\,p(x)\,dx = c \int f(x)\,p(x)\,dx = c\,E_p[f]
E_p[f_1 + f_2] = \int (f_1(x) + f_2(x))\,p(x)\,dx = \int f_1(x)\,p(x)\,dx + \int f_2(x)\,p(x)\,dx = E_p[f_1] + E_p[f_2]
Note: Given a probability distribution p on \Omega_1 \times \Omega_2, and given functions f_1 on \Omega_1 and f_2 on \Omega_2:
E_p[f_1 + f_2] = \int_{\Omega_1}\int_{\Omega_2} (f_1(x_1) + f_2(x_2))\,p(x_1,x_2)\,dx_2\,dx_1
= \int_{\Omega_1}\int_{\Omega_2} f_1(x_1)\,p(x_1,x_2)\,dx_2\,dx_1 + \int_{\Omega_2}\int_{\Omega_1} f_2(x_2)\,p(x_1,x_2)\,dx_1\,dx_2
= \int_{\Omega_1} f_1(x_1)\,p_1(x_1)\,dx_1 + \int_{\Omega_2} f_2(x_2)\,p_2(x_2)\,dx_2
= E_{p_1}[f_1] + E_{p_2}[f_2]
where p_1 and p_2 are the marginal distributions.
Note: Given independent PDFs p_1 on \Omega_1 and p_2 on \Omega_2, and given functions f_1 on \Omega_1 and f_2 on \Omega_2:
E_p[f_1(x_1)\,f_2(x_2)] = \int_{\Omega_1}\int_{\Omega_2} f_1(x_1)\,f_2(x_2)\,p(x_1,x_2)\,dx_2\,dx_1
= \left(\int_{\Omega_1} f_1(x_1)\,p_1(x_1)\,dx_1\right) \left(\int_{\Omega_2} f_2(x_2)\,p_2(x_2)\,dx_2\right)
= E_{p_1}[f_1]\,E_{p_2}[f_2]
Definition: The variance, V_p[f], of a function f is the expected squared deviation of the function from its expected value:
V_p[f] = E_p[(f - E_p[f])^2]
= E_p[f^2 - 2\,E_p[f]\,f + E_p[f]^2]
= E_p[f^2] - 2\,E_p[f]\,E_p[f] + E_p[f]^2
= E_p[f^2] - E_p[f]^2
Note: For a function f and constant c:
V_p[cf] = E_p[(cf)^2] - E_p[cf]^2 = c^2\,E_p[f^2] - c^2\,E_p[f]^2 = c^2\,V_p[f]
Note: Given independent PDFs p_1 on \Omega_1 and p_2 on \Omega_2, and given functions f_1 on \Omega_1 and f_2 on \Omega_2:
V_p[f_1 + f_2] = E_p[(f_1 + f_2)^2] - E_p[f_1 + f_2]^2
= E_p[f_1^2 + 2 f_1 f_2 + f_2^2] - (E_{p_1}[f_1] + E_{p_2}[f_2])^2
= E_{p_1}[f_1^2] + 2\,E_{p_1}[f_1]\,E_{p_2}[f_2] + E_{p_2}[f_2^2] - E_{p_1}[f_1]^2 - 2\,E_{p_1}[f_1]\,E_{p_2}[f_2] - E_{p_2}[f_2]^2
= V_{p_1}[f_1] + V_{p_2}[f_2]
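This identity can be checked numerically. A minimal sketch, assuming independent uniform marginals on [0,1] with f_1(x_1) = x_1 and f_2(x_2) = x_2 (an illustrative example, not part of the slides):

```python
import random

random.seed(7)
n = 200_000

# x1, x2 ~ Uniform[0,1], independent; f1(x1) = x1, f2(x2) = x2.
# V[f1] = V[f2] = 1/12, so V[f1 + f2] should be about 1/6.
sums = [random.random() + random.random() for _ in range(n)]
mean = sum(sums) / n
var = sum((s - mean) ** 2 for s in sums) / n
print(var)  # close to 1/6
```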
Definition: Given a PDF p(x_1,x_2) on the domain \Omega_1 \times \Omega_2 and given a function f on \Omega_1 \times \Omega_2, the conditional expectation over \Omega_2 given x_1 \in \Omega_1 is:
E_2[f \mid x_1] = \int_{\Omega_2} f(x_1,x_2)\,p(x_2 \mid x_1)\,dx_2
Similarly, the conditional variance over \Omega_2 given x_1 \in \Omega_1 is:
V_2[f \mid x_1] = \int_{\Omega_2} (f(x_1,x_2) - E_2[f \mid x_1])^2\,p(x_2 \mid x_1)\,dx_2
Note: With
E_2[f \mid x_1] = \int_{\Omega_2} f(x_1,x_2)\,p(x_2 \mid x_1)\,dx_2
V_2[f \mid x_1] = \int_{\Omega_2} (f(x_1,x_2) - E_2[f \mid x_1])^2\,p(x_2 \mid x_1)\,dx_2
the total variance decomposes as:
V_{\Omega_1 \times \Omega_2}[f] = E_{\Omega_1 \times \Omega_2}[f^2] - E_{\Omega_1 \times \Omega_2}[f]^2
= E_1[E_2[f^2 \mid x_1]] - E_1[E_2[f \mid x_1]]^2
= E_1[E_2[f^2 \mid x_1] - E_2[f \mid x_1]^2] + E_1[E_2[f \mid x_1]^2] - E_1[E_2[f \mid x_1]]^2
= E_1[V_2[f \mid x_1]] + V_1[E_2[f \mid x_1]]
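The decomposition can be verified numerically. A sketch, assuming a hypothetical example where x_1 ~ Uniform[0,1] and, conditionally, x_2 ~ Uniform[0, x_1]:

```python
import random

random.seed(8)
n = 300_000

# x1 ~ Uniform[0,1]; given x1, x2 ~ Uniform[0, x1]; f(x1, x2) = x2.
# E2[f|x1] = x1/2 and V2[f|x1] = x1^2/12, so
# E1[V2[f|x1]] + V1[E2[f|x1]] = 1/36 + 1/48 = 7/144, the total variance.
xs = [random.uniform(0.0, random.random()) for _ in range(n)]
mean = sum(xs) / n
total_var = sum((x - mean) ** 2 for x in xs) / n
print(total_var)  # close to 7/144 (about 0.0486)
```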
Monte Carlo Estimator
Definition: Given a PDF p on domain \Omega corresponding to a random variable X, and given a function f on \Omega, the (n-sample) Monte Carlo estimate of the integral of f over \Omega is:
I_n(X) = \frac{1}{n} \sum_{i=1}^n \frac{f(X_i)}{p(X_i)}
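A minimal sketch of this estimator in Python (the names `mc_estimate`, `pdf`, and `sample` are illustrative, not part of the course code):

```python
import random

def mc_estimate(f, pdf, sample, n):
    """Monte Carlo estimate of an integral: the average of f(X_i)/p(X_i)
    over n samples X_i drawn from the distribution with density pdf."""
    return sum(f(x) / pdf(x) for x in (sample() for _ in range(n))) / n

# Example: estimate the integral of x^2 over [0,1] (= 1/3) with
# uniform samples, for which p(x) = 1.
random.seed(0)
estimate = mc_estimate(lambda x: x * x, lambda x: 1.0, random.random, 100_000)
print(estimate)  # close to 1/3
```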
Note: The expected value of the estimate is:
E_p[I_n] = E_p\left[\frac{1}{n}\sum_{i=1}^n \frac{f}{p}\right] = \frac{1}{n}\sum_{i=1}^n E_p\left[\frac{f}{p}\right] = \frac{1}{n}\sum_{i=1}^n \int \frac{f(x)}{p(x)}\,p(x)\,dx = \int f(x)\,dx
The expected value of the estimate is the integral of the function, regardless of the PDF (so long as p(x) > 0 wherever f(x) \ne 0).
The variance of the estimate is:
V_p[I_n] = V_p\left[\frac{1}{n}\sum_{i=1}^n \frac{f}{p}\right] = \frac{1}{n^2}\sum_{i=1}^n V_p\left[\frac{f}{p}\right] = \frac{1}{n}\,V_p\left[\frac{f}{p}\right]
Variance decreases as:
1. We use more samples
2. The variance of f/p decreases (e.g. if p \propto f)
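The effect of choosing p proportional to f can be seen in a small experiment. A sketch, assuming the illustrative integrand f(x) = 2x on [0,1] (exact integral 1): with a uniform PDF the per-sample values vary, while with p(x) = 2x every sample of f/p equals 1, so the variance vanishes.

```python
import math
import random

random.seed(1)
f = lambda x: 2.0 * x  # integrand on [0,1]; exact integral is 1.0
n = 50_000

# Uniform sampling: p(x) = 1, so each per-sample value is f(x)/1.
uniform_vals = [f(random.random()) for _ in range(n)]

# Importance sampling with p(x) = 2x (drawn by inversion: x = sqrt(xi)),
# so f(x)/p(x) = 1 for every sample -- zero variance.
def importance_val():
    x = math.sqrt(random.random())
    return f(x) / (2.0 * x)

importance_vals = [importance_val() for _ in range(n)]

def mean(v):
    return sum(v) / len(v)

def var(v):
    m = mean(v)
    return sum((x - m) ** 2 for x in v) / len(v)

print(mean(uniform_vals), var(uniform_vals))        # near 1.0 and 1/3
print(mean(importance_vals), var(importance_vals))  # 1.0, 0.0
```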
Sampling
Challenge: Given a random variable X on domain \Omega with PDF p, and given a bijective map F: \Omega \to F(\Omega), what is the PDF q of the random variable F(X)?
Note: For any subdomain D \subseteq F(\Omega), the probability that F(X) is in D is equal to the probability that X is in F^{-1}(D). Thus, the differential probability of choosing the point F(x) satisfies:
p(x) = q(F(x))\,|J_F(x)|
where |J_F(x)| is the absolute value of the Jacobian determinant of F at x.
Definition: A canonical uniform random variable \xi is a random variable that takes on all values in the domain \Omega = [0,1] with equal probability:
p(x) = 1 if x \in [0,1], and 0 otherwise
Challenge: Given a canonical uniform random variable \xi, how do we generate a uniform random variable in the range [a,b]?
Solution: We would like a function F: [0,1] \to [a,b] such that the probability density of choosing F(\xi) is 1/(b-a). From p(x) = q(F(x))\,|J_F(x)| we get:
1 = \frac{1}{b-a}\,|F'(x)| for x \in [0,1]
or in other words:
F'(x) = \pm(b-a) for x \in [0,1]
F(x) = \pm(b-a)\,x + c
giving:
F(x) = (b-a)\,x + a or F(x) = (a-b)\,x + b for x \in [0,1]
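The first of these maps is a one-liner in code (a sketch; `uniform_ab` is an illustrative name):

```python
import random

def uniform_ab(xi, a, b):
    """Map a canonical uniform xi in [0,1] to [a,b]: F(xi) = (b - a)*xi + a."""
    return (b - a) * xi + a

random.seed(2)
samples = [uniform_ab(random.random(), 3.0, 7.0) for _ in range(10_000)]
print(min(samples), max(samples))  # all samples lie in [3, 7)
```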
Challenge: How about if we would like to generate a random variable in the range [a,b] with PDF q?
Recall: Given F, the derivative of the inverse of F is:
(F^{-1})' = \frac{1}{F' \circ F^{-1}}
This follows by the chain rule: since F \circ F^{-1} = \mathrm{Id},
(F' \circ F^{-1}) \cdot (F^{-1})' = 1 \quad\Rightarrow\quad (F^{-1})' = \frac{1}{F' \circ F^{-1}}
Solution: In this case p(x) = 1 on [0,1], so q(F(x))\,|F'(x)| = 1, i.e.:
F' = \frac{1}{q \circ F}
Integrating, with Q the CDF of q (so Q'(x) = q(x)):
(Q' \circ F)\,F' = 1 \quad\Rightarrow\quad (Q \circ F)' = 1 \quad\Rightarrow\quad Q \circ F = x + c \quad\Rightarrow\quad F = Q^{-1}(x + c)
Choosing c = 0 gives:
F(\xi) = Q^{-1}(\xi)
(or equivalently F(\xi) = Q^{-1}(1 - \xi), since 1 - \xi is also canonically uniform).
Examples: If q(x) = c\,x^n on the domain [a,b]:
1. We normalize:
\int_a^b c\,x^n\,dx = \frac{c\,(b^{n+1} - a^{n+1})}{n+1} = 1 \quad\Rightarrow\quad c = \frac{n+1}{b^{n+1} - a^{n+1}}, \quad q(x) = \frac{(n+1)\,x^n}{b^{n+1} - a^{n+1}}
2. Integrate to get the CDF:
Q(x) = \int_a^x \frac{(n+1)\,y^n}{b^{n+1} - a^{n+1}}\,dy = \frac{x^{n+1} - a^{n+1}}{b^{n+1} - a^{n+1}}
3. Invert the CDF:
F(\xi) = Q^{-1}(\xi) = \left(\xi\,b^{n+1} + (1-\xi)\,a^{n+1}\right)^{1/(n+1)}
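The three-step recipe translates directly into code. A sketch for q(x) proportional to x^n (the helper name `sample_power` is illustrative):

```python
import random

def sample_power(xi, n, a, b):
    """Inversion-method sample for q(x) proportional to x^n on [a,b]:
    F(xi) = (xi*b^(n+1) + (1 - xi)*a^(n+1))^(1/(n+1))."""
    return (xi * b ** (n + 1) + (1.0 - xi) * a ** (n + 1)) ** (1.0 / (n + 1))

random.seed(3)
# q(x) = 3x^2 on [0,1]: the expected value of a sample is
# the integral of x * 3x^2 over [0,1], which is 3/4.
xs = [sample_power(random.random(), 2, 0.0, 1.0) for _ in range(100_000)]
print(sum(xs) / len(xs))  # close to 0.75
```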
Examples: Uniform random samples on the hemisphere H^2:
q(\omega) = \frac{1}{2\pi}
In terms of the spherical parameterization
\omega(\theta,\phi) = (\sin\theta\cos\phi,\ \sin\theta\sin\phi,\ \cos\theta)
this gives:
p(\theta,\phi) = q(\omega(\theta,\phi))\,|J_\omega(\theta,\phi)| = \frac{\sin\theta}{2\pi}
Recall: The probability p(\theta,\phi) is the product of the marginal and conditional probabilities:
p(\theta,\phi) = p(\theta)\,p(\phi \mid \theta)
The marginal probability of choosing \theta is:
p(\theta) = \int_0^{2\pi} p(\theta,\phi)\,d\phi = \int_0^{2\pi} \frac{\sin\theta}{2\pi}\,d\phi = \sin\theta
The conditional probability of choosing \phi is:
p(\phi \mid \theta) = \frac{p(\theta,\phi)}{p(\theta)} = \frac{1}{2\pi}
Integrating, we get the CDFs:
P(\theta) = \int_0^\theta \sin\theta'\,d\theta' = 1 - \cos\theta
P(\phi \mid \theta) = Q(\phi) = \frac{\phi}{2\pi}
Inverting, we get:
\theta = P^{-1}(\xi_1) = \cos^{-1}(1 - \xi_1) (equivalently \cos^{-1}\xi_1, since 1 - \xi_1 is also canonically uniform)
\phi = Q^{-1}(\xi_2) = 2\pi\,\xi_2
Note: Even though this transforms uniform random variables in [0,1]^2 to uniform random variables on H^2, the map does not preserve areas, so stratification in [0,1]^2 does not guarantee stratification on H^2.
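The inverted CDFs translate directly into code (a sketch; `sample_hemisphere` is an illustrative name). On a uniform hemisphere the expected z-coordinate is E[\cos\theta] = 1/2, which gives a quick check:

```python
import math
import random

def sample_hemisphere(xi1, xi2):
    """Uniform direction on the hemisphere H^2:
    theta = acos(1 - xi1), phi = 2*pi*xi2."""
    theta = math.acos(1.0 - xi1)
    phi = 2.0 * math.pi * xi2
    return (math.sin(theta) * math.cos(phi),
            math.sin(theta) * math.sin(phi),
            math.cos(theta))

random.seed(4)
dirs = [sample_hemisphere(random.random(), random.random())
        for _ in range(100_000)]
mean_z = sum(d[2] for d in dirs) / len(dirs)
print(mean_z)  # close to 1/2
```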
Rejection Sampling
Challenge: What happens when \Omega is complicated?
Solution: Generate samples in a simpler enclosing domain \Omega' \supseteq \Omega and discard the samples that are not in \Omega.
Note: The efficiency of the sampling is Area(\Omega)/Area(\Omega'), the fraction of proposed samples that are kept.
Challenge: How do we sample according to a function f that is complicated (e.g. an unknown or unnormalized PDF)?
Solution: Find a simple PDF p and a constant c such that f \le c\,p:
– Generate a sample x with probability p
– Generate a uniform random value \xi' \in [0, c\,p(x)]
– Keep x if \xi' \le f(x)
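The rejection loop can be sketched as follows (a minimal example, assuming the illustrative target f(x) = sin(\pi x) on [0,1] with a uniform proposal p(x) = 1 and c = 1, so that f \le c\,p):

```python
import math
import random

def rejection_sample(f, c, n):
    """Draw n samples on [0,1] with density proportional to f, using the
    uniform proposal p(x) = 1 and a constant c with f(x) <= c*p(x)."""
    out = []
    while len(out) < n:
        x = random.random()             # generate x with probability p
        xi = random.uniform(0.0, c)     # uniform value in [0, c*p(x)]
        if xi <= f(x):                  # keep x if xi <= f(x)
            out.append(x)
    return out

random.seed(5)
xs = rejection_sample(lambda x: math.sin(math.pi * x), 1.0, 50_000)
print(sum(xs) / len(xs))  # the density is symmetric about 1/2, so mean ~0.5
```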
Metropolis Sampling
Solution: Assuming f > 0 and a tentative transition function T, we can get uniqueness and convergence by setting the acceptance probability:
a(x_1 \to x_2) = \min\left\{1,\ \frac{f(x_2)\,T(x_2 \to x_1)}{f(x_1)\,T(x_1 \to x_2)}\right\}
Solution: More generally, if the state space is discrete, the probability of choosing state x_j after one mutation is:
f_j = f_j\,T_{jj} + \sum_{i \ne j} f_j\,T_{ji}\,(1 - a_{ji}) + \sum_{i \ne j} f_i\,T_{ij}\,a_{ij}
(the probability of starting at x_j and proposing x_j, plus the probability of starting at x_j and not mutating out, plus the probability of starting elsewhere and mutating in). Since T_{jj} = 1 - \sum_{i \ne j} T_{ji}, this reduces to:
\sum_{i \ne j} f_j\,T_{ji}\,a_{ji} = \sum_{i \ne j} f_i\,T_{ij}\,a_{ij}
which is satisfied if we ensure detailed balance:
f_j\,T_{ji}\,a_{ji} = f_i\,T_{ij}\,a_{ij} for all i \ne j
This not only ensures that mutating f we get back f, but also that repeatedly mutating any PDF q we converge to a multiple of f.
Challenge 2: How should we choose the initial state x?
Solution (Burn-In): Run the chain for many iterations and discard the initial samples.
Solution: More generally, we can think of Metropolis sampling as taking some PDF p and mutating it into a new PDF q:
q_i = \sum_j p_j\,T_{ji}\,a_{ji}
Setting M to be the matrix with M_{ij} = T_{ji}\,a_{ji}, and setting p = (p_1,\dots,p_n) and q = (q_1,\dots,q_n), this becomes:
q = Mp
Conditions on M (q = Mp):
1. Stationarity: Applying M to f we should get back f: Mf = f. (f is an eigenvector of M with eigenvalue 1.)
2. Convergence: Repeatedly applying M to a PDF p we should converge to something. (All eigenvalues of M have magnitude at most 1.)
3. Uniqueness: That "something" should be a scalar multiple of f. (All other eigenvalues of M have magnitude strictly less than 1.)
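The scheme can be sketched in a few lines, assuming a symmetric Gaussian proposal T (so the T terms cancel in the acceptance ratio) and an illustrative unnormalized Gaussian target; the names here are illustrative, not from the course code:

```python
import math
import random

def metropolis(f, x0, n_samples, step=1.0, burn_in=1000):
    """Metropolis sampling of an unnormalized density f > 0 with a symmetric
    Gaussian proposal; the first burn_in samples are discarded (burn-in)."""
    x, out = x0, []
    for i in range(n_samples + burn_in):
        x_new = x + random.gauss(0.0, step)   # symmetric proposal T
        a = min(1.0, f(x_new) / f(x))         # acceptance probability
        if random.random() < a:
            x = x_new
        if i >= burn_in:
            out.append(x)
    return out

random.seed(6)
# Target: unnormalized Gaussian exp(-x^2/2); the chain's samples should
# have mean near 0 and variance near 1.
xs = metropolis(lambda x: math.exp(-0.5 * x * x), 0.0, 200_000)
m = sum(xs) / len(xs)
v = sum((x - m) ** 2 for x in xs) / len(xs)
print(m, v)
```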