Physically Based Rendering (600.657)
Monte Carlo Integration

Probability
Definition: The probability density function (PDF) p(x) of a random variable X on a domain \Omega describes the relative likelihood of the variable taking on a value at a given point in \Omega:
1. p(x) \ge 0
2. \int_\Omega p(x)\,dx = 1
Note: Given any subdomain D \subseteq \Omega, the probability of the variable occurring within D is:
\Pr\{X \in D\} = \int_D p(x)\,dx
Definition: For a (1D) domain \Omega = [a,b], the cumulative distribution function (CDF) P(x) of a random variable X is the probability that a value drawn from the variable's distribution is less than or equal to some value x:
P(x) = \Pr\{X \le x\} = \int_a^x p(x')\,dx'
Note: Given a CDF on the domain \Omega = [a,b], we have:
P(a) = \int_a^a p(x)\,dx = 0
P(b) = \int_a^b p(x)\,dx = 1
And more generally:
\frac{dP}{dx} = p(x)
Definition: Given a PDF p(x_1,x_2) on the domain \Omega_1 \times \Omega_2, the marginal density function of x_1 \in \Omega_1 is obtained by integrating out one of the dimensions:
p(x_1) = \int_{\Omega_2} p(x_1,x_2)\,dx_2
Note: The marginal density function is also a PDF because it is non-negative and:
\int_{\Omega_1} p(x_1)\,dx_1 = \int_{\Omega_1}\int_{\Omega_2} p(x_1,x_2)\,dx_2\,dx_1 = 1
Definition: The marginal densities p_1 and p_2 are independent if the probability distribution on \Omega_1 \times \Omega_2 is the product of the marginal probability distributions on \Omega_1 and \Omega_2:
p(x_1,x_2) = p(x_1)\,p(x_2)
Definition: Given a PDF p(x_1,x_2) on the domain \Omega_1 \times \Omega_2, the conditional density function of x_2 \in \Omega_2 given x_1 \in \Omega_1 is the joint density divided by the marginal density:
p(x_2 \mid x_1) = \frac{p(x_1,x_2)}{p(x_1)}
Note: The conditional density function is a PDF because it is non-negative and:
\int_{\Omega_2} p(x_2 \mid x_1)\,dx_2 = \int_{\Omega_2} \frac{p(x_1,x_2)}{p(x_1)}\,dx_2 = \frac{1}{p(x_1)} \int_{\Omega_2} p(x_1,x_2)\,dx_2 = \frac{p(x_1)}{p(x_1)} = 1
Definition: Given a PDF p on a domain \Omega, the expected value, E_p[f], of a function f is the average value of the function over the distribution of values p(x) on its domain:
E_p[f] = \int_\Omega f(x)\,p(x)\,dx
Note: For functions f, f_1, f_2 and constant c:
E_p[c] = \int c\,p(x)\,dx = c \int p(x)\,dx = c
E_p[cf] = \int c\,f(x)\,p(x)\,dx = c \int f(x)\,p(x)\,dx = c\,E_p[f]
E_p[f_1 + f_2] = \int (f_1(x) + f_2(x))\,p(x)\,dx = \int f_1(x)\,p(x)\,dx + \int f_2(x)\,p(x)\,dx = E_p[f_1] + E_p[f_2]
Note: Given a probability distribution p on \Omega_1 \times \Omega_2, and given functions f_1 on \Omega_1 and f_2 on \Omega_2:
E_p[f_1 + f_2] = \int_{\Omega_1}\int_{\Omega_2} (f_1(x_1) + f_2(x_2))\,p(x_1,x_2)\,dx_2\,dx_1
= \int_{\Omega_1}\int_{\Omega_2} f_1(x_1)\,p(x_1,x_2)\,dx_2\,dx_1 + \int_{\Omega_2}\int_{\Omega_1} f_2(x_2)\,p(x_1,x_2)\,dx_1\,dx_2
= \int_{\Omega_1} f_1(x_1)\,p_1(x_1)\,dx_1 + \int_{\Omega_2} f_2(x_2)\,p_2(x_2)\,dx_2
= E_{p_1}[f_1] + E_{p_2}[f_2]
where p_1 and p_2 are the marginal distributions.
Note: Given independent PDFs p_1 on \Omega_1 and p_2 on \Omega_2, and given functions f_1 on \Omega_1 and f_2 on \Omega_2:
E_p[f_1(x_1)\,f_2(x_2)] = \int_{\Omega_1}\int_{\Omega_2} f_1(x_1)\,f_2(x_2)\,p(x_1,x_2)\,dx_2\,dx_1
= \left(\int_{\Omega_1} f_1(x_1)\,p_1(x_1)\,dx_1\right) \left(\int_{\Omega_2} f_2(x_2)\,p_2(x_2)\,dx_2\right)
= E_{p_1}[f_1]\,E_{p_2}[f_2]
Definition: The variance, V_p[f], of a function f is the expected squared deviation of the function from its expected value:
V_p[f] = E_p[(f - E_p[f])^2]
= E_p[f^2 - 2\,E_p[f]\,f + E_p[f]^2]
= E_p[f^2] - 2\,E_p[f]\,E_p[f] + E_p[f]^2
= E_p[f^2] - E_p[f]^2
Note: For a function f and constant c:
V_p[cf] = E_p[(cf)^2] - E_p[cf]^2 = c^2\,E_p[f^2] - c^2\,E_p[f]^2 = c^2\,V_p[f]
Note: Given independent PDFs p_1 on \Omega_1 and p_2 on \Omega_2, and given functions f_1 on \Omega_1 and f_2 on \Omega_2:
V_p[f_1 + f_2] = E_p[(f_1 + f_2)^2] - E_p[f_1 + f_2]^2
= E_p[f_1^2 + 2 f_1 f_2 + f_2^2] - (E_{p_1}[f_1] + E_{p_2}[f_2])^2
= E_{p_1}[f_1^2] + 2\,E_{p_1}[f_1]\,E_{p_2}[f_2] + E_{p_2}[f_2^2] - E_{p_1}[f_1]^2 - 2\,E_{p_1}[f_1]\,E_{p_2}[f_2] - E_{p_2}[f_2]^2
= V_{p_1}[f_1] + V_{p_2}[f_2]
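This identity can be checked numerically. A minimal sketch, assuming independent uniform marginals on [0,1] with f_1(x_1) = x_1 and f_2(x_2) = x_2 (an illustrative example, not part of the slides):

```python
import random

random.seed(7)
n = 200_000

# x1, x2 ~ Uniform[0,1], independent; f1(x1) = x1, f2(x2) = x2.
# V[f1] = V[f2] = 1/12, so V[f1 + f2] should be about 1/6.
sums = [random.random() + random.random() for _ in range(n)]
mean = sum(sums) / n
var = sum((s - mean) ** 2 for s in sums) / n
print(var)  # close to 1/6
```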
Definition: Given a PDF p(x_1,x_2) on the domain \Omega_1 \times \Omega_2 and given a function f on \Omega_1 \times \Omega_2, the conditional expectation over \Omega_2 given x_1 \in \Omega_1 is:
E_2[f \mid x_1] = \int_{\Omega_2} f(x_1,x_2)\,p(x_2 \mid x_1)\,dx_2
Similarly, the conditional variance over \Omega_2 given x_1 \in \Omega_1 is:
V_2[f \mid x_1] = \int_{\Omega_2} (f(x_1,x_2) - E_2[f \mid x_1])^2\,p(x_2 \mid x_1)\,dx_2
Note: With
E_2[f \mid x_1] = \int_{\Omega_2} f(x_1,x_2)\,p(x_2 \mid x_1)\,dx_2
V_2[f \mid x_1] = \int_{\Omega_2} (f(x_1,x_2) - E_2[f \mid x_1])^2\,p(x_2 \mid x_1)\,dx_2
the total variance decomposes as:
V_{\Omega_1 \times \Omega_2}[f] = E_{\Omega_1 \times \Omega_2}[f^2] - E_{\Omega_1 \times \Omega_2}[f]^2
= E_1[E_2[f^2 \mid x_1]] - E_1[E_2[f \mid x_1]]^2
= E_1[E_2[f^2 \mid x_1] - E_2[f \mid x_1]^2] + E_1[E_2[f \mid x_1]^2] - E_1[E_2[f \mid x_1]]^2
= E_1[V_2[f \mid x_1]] + V_1[E_2[f \mid x_1]]
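The decomposition can be verified numerically. A sketch, assuming a hypothetical example where x_1 ~ Uniform[0,1] and, conditionally, x_2 ~ Uniform[0, x_1]:

```python
import random

random.seed(8)
n = 300_000

# x1 ~ Uniform[0,1]; given x1, x2 ~ Uniform[0, x1]; f(x1, x2) = x2.
# E2[f|x1] = x1/2 and V2[f|x1] = x1^2/12, so
# E1[V2[f|x1]] + V1[E2[f|x1]] = 1/36 + 1/48 = 7/144, the total variance.
xs = [random.uniform(0.0, random.random()) for _ in range(n)]
mean = sum(xs) / n
total_var = sum((x - mean) ** 2 for x in xs) / n
print(total_var)  # close to 7/144 (about 0.0486)
```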
Monte Carlo Estimator
Definition: Given a PDF p on domain \Omega corresponding to a random variable X, and given a function f on \Omega, the (n-sample) Monte Carlo estimate of the integral of f over \Omega is:
I_n(X) = \frac{1}{n} \sum_{i=1}^n \frac{f(X_i)}{p(X_i)}
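A minimal sketch of this estimator in Python (the names `mc_estimate`, `pdf`, and `sample` are illustrative, not part of the course code):

```python
import random

def mc_estimate(f, pdf, sample, n):
    """Monte Carlo estimate of an integral: the average of f(X_i)/p(X_i)
    over n samples X_i drawn from the distribution with density pdf."""
    return sum(f(x) / pdf(x) for x in (sample() for _ in range(n))) / n

# Example: estimate the integral of x^2 over [0,1] (= 1/3) with
# uniform samples, for which p(x) = 1.
random.seed(0)
estimate = mc_estimate(lambda x: x * x, lambda x: 1.0, random.random, 100_000)
print(estimate)  # close to 1/3
```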
Note: The expected value of the estimate is:
E_p[I_n] = E_p\left[\frac{1}{n}\sum_{i=1}^n \frac{f}{p}\right] = \frac{1}{n}\sum_{i=1}^n E_p\left[\frac{f}{p}\right] = \frac{1}{n}\sum_{i=1}^n \int \frac{f(x)}{p(x)}\,p(x)\,dx = \int f(x)\,dx
The expected value of the estimate is the integral of the function, regardless of the PDF (so long as p(x) > 0 wherever f(x) \ne 0).
The variance of the estimate is:
V_p[I_n] = V_p\left[\frac{1}{n}\sum_{i=1}^n \frac{f}{p}\right] = \frac{1}{n^2}\sum_{i=1}^n V_p\left[\frac{f}{p}\right] = \frac{1}{n}\,V_p\left[\frac{f}{p}\right]
Variance decreases as:
1. We use more samples
2. The variance of f/p decreases (e.g. if p \propto f)
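The effect of choosing p proportional to f can be seen in a small experiment. A sketch, assuming the illustrative integrand f(x) = 2x on [0,1] (exact integral 1): with a uniform PDF the per-sample values vary, while with p(x) = 2x every sample of f/p equals 1, so the variance vanishes.

```python
import math
import random

random.seed(1)
f = lambda x: 2.0 * x  # integrand on [0,1]; exact integral is 1.0
n = 50_000

# Uniform sampling: p(x) = 1, so each per-sample value is f(x)/1.
uniform_vals = [f(random.random()) for _ in range(n)]

# Importance sampling with p(x) = 2x (drawn by inversion: x = sqrt(xi)),
# so f(x)/p(x) = 1 for every sample -- zero variance.
def importance_val():
    x = math.sqrt(random.random())
    return f(x) / (2.0 * x)

importance_vals = [importance_val() for _ in range(n)]

def mean(v):
    return sum(v) / len(v)

def var(v):
    m = mean(v)
    return sum((x - m) ** 2 for x in v) / len(v)

print(mean(uniform_vals), var(uniform_vals))        # near 1.0 and 1/3
print(mean(importance_vals), var(importance_vals))  # 1.0, 0.0
```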
Sampling
Challenge: Given a random variable X on domain \Omega with PDF p, and given a bijective map F: \Omega \to F(\Omega), what is the PDF q of the random variable F(X)?
Note: For any subdomain D \subseteq F(\Omega), the probability that F(X) is in D is equal to the probability that X is in F^{-1}(D). Thus, the differential probability of choosing the point F(x) satisfies:
p(x) = q(F(x))\,|J_F(x)|
where |J_F(x)| is the absolute value of the Jacobian determinant of F at x.
Definition: A canonical uniform random variable \xi is a random variable that takes on all values in the domain \Omega = [0,1] with equal probability:
p(x) = 1 if x \in [0,1], and 0 otherwise
Challenge: Given a canonical uniform random variable \xi, how do we generate a uniform random variable in the range [a,b]?
Solution: We would like a function F: [0,1] \to [a,b] such that the probability density of choosing F(\xi) is 1/(b-a). From p(x) = q(F(x))\,|J_F(x)| we get:
1 = \frac{1}{b-a}\,|F'(x)| for x \in [0,1]
or in other words:
F'(x) = \pm(b-a) for x \in [0,1]
F(x) = \pm(b-a)\,x + c
giving:
F(x) = (b-a)\,x + a or F(x) = (a-b)\,x + b for x \in [0,1]
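The first of these maps is a one-liner in code (a sketch; `uniform_ab` is an illustrative name):

```python
import random

def uniform_ab(xi, a, b):
    """Map a canonical uniform xi in [0,1] to [a,b]: F(xi) = (b - a)*xi + a."""
    return (b - a) * xi + a

random.seed(2)
samples = [uniform_ab(random.random(), 3.0, 7.0) for _ in range(10_000)]
print(min(samples), max(samples))  # all samples lie in [3, 7)
```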
Challenge: How about if we would like to generate a random variable in the range [a,b] with PDF q?
Recall: Given F, the derivative of the inverse of F is:
(F^{-1})' = \frac{1}{F' \circ F^{-1}}
This follows by the chain rule: since F \circ F^{-1} = \mathrm{Id},
(F' \circ F^{-1}) \cdot (F^{-1})' = 1 \quad\Rightarrow\quad (F^{-1})' = \frac{1}{F' \circ F^{-1}}
Solution: In this case p(x) = 1 on [0,1], so q(F(x))\,|F'(x)| = 1, i.e.:
F' = \frac{1}{q \circ F}
Integrating, with Q the CDF of q (so Q'(x) = q(x)):
(Q' \circ F)\,F' = 1 \quad\Rightarrow\quad (Q \circ F)' = 1 \quad\Rightarrow\quad Q \circ F = x + c \quad\Rightarrow\quad F = Q^{-1}(x + c)
Choosing c = 0 gives:
F(\xi) = Q^{-1}(\xi)
(or equivalently F(\xi) = Q^{-1}(1 - \xi), since 1 - \xi is also canonically uniform).
Examples: If q(x) = c\,x^n on the domain [a,b]:
1. We normalize:
\int_a^b c\,x^n\,dx = \frac{c\,(b^{n+1} - a^{n+1})}{n+1} = 1 \quad\Rightarrow\quad c = \frac{n+1}{b^{n+1} - a^{n+1}}, \quad q(x) = \frac{(n+1)\,x^n}{b^{n+1} - a^{n+1}}
2. Integrate to get the CDF:
Q(x) = \int_a^x \frac{(n+1)\,y^n}{b^{n+1} - a^{n+1}}\,dy = \frac{x^{n+1} - a^{n+1}}{b^{n+1} - a^{n+1}}
3. Invert the CDF:
F(\xi) = Q^{-1}(\xi) = \left(\xi\,b^{n+1} + (1-\xi)\,a^{n+1}\right)^{1/(n+1)}
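The three-step recipe translates directly into code. A sketch for q(x) proportional to x^n (the helper name `sample_power` is illustrative):

```python
import random

def sample_power(xi, n, a, b):
    """Inversion-method sample for q(x) proportional to x^n on [a,b]:
    F(xi) = (xi*b^(n+1) + (1 - xi)*a^(n+1))^(1/(n+1))."""
    return (xi * b ** (n + 1) + (1.0 - xi) * a ** (n + 1)) ** (1.0 / (n + 1))

random.seed(3)
# q(x) = 3x^2 on [0,1]: the expected value of a sample is
# the integral of x * 3x^2 over [0,1], which is 3/4.
xs = [sample_power(random.random(), 2, 0.0, 1.0) for _ in range(100_000)]
print(sum(xs) / len(xs))  # close to 0.75
```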
Examples: Uniform random samples on the hemisphere H^2:
q(\omega) = \frac{1}{2\pi}
In terms of the spherical parameterization
\omega(\theta,\phi) = (\sin\theta\cos\phi,\ \sin\theta\sin\phi,\ \cos\theta)
this gives:
p(\theta,\phi) = q(\omega(\theta,\phi))\,|J_\omega(\theta,\phi)| = \frac{\sin\theta}{2\pi}
Recall: The probability p(\theta,\phi) is the product of the marginal and conditional probabilities:
p(\theta,\phi) = p(\theta)\,p(\phi \mid \theta)
The marginal probability of choosing \theta is:
p(\theta) = \int_0^{2\pi} p(\theta,\phi)\,d\phi = \int_0^{2\pi} \frac{\sin\theta}{2\pi}\,d\phi = \sin\theta
The conditional probability of choosing \phi is:
p(\phi \mid \theta) = \frac{p(\theta,\phi)}{p(\theta)} = \frac{1}{2\pi}
Integrating, we get the CDFs:
P(\theta) = \int_0^\theta \sin\theta'\,d\theta' = 1 - \cos\theta
P(\phi \mid \theta) = Q(\phi) = \frac{\phi}{2\pi}
Inverting, we get:
\theta = P^{-1}(\xi_1) = \cos^{-1}(1 - \xi_1) (equivalently \cos^{-1}\xi_1, since 1 - \xi_1 is also canonically uniform)
\phi = Q^{-1}(\xi_2) = 2\pi\,\xi_2
Note: Even though this transforms uniform random variables in [0,1]^2 to uniform random variables on H^2, the map does not preserve areas, so stratification in [0,1]^2 does not guarantee stratification on H^2.
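The inverted CDFs translate directly into code (a sketch; `sample_hemisphere` is an illustrative name). On a uniform hemisphere the expected z-coordinate is E[\cos\theta] = 1/2, which gives a quick check:

```python
import math
import random

def sample_hemisphere(xi1, xi2):
    """Uniform direction on the hemisphere H^2:
    theta = acos(1 - xi1), phi = 2*pi*xi2."""
    theta = math.acos(1.0 - xi1)
    phi = 2.0 * math.pi * xi2
    return (math.sin(theta) * math.cos(phi),
            math.sin(theta) * math.sin(phi),
            math.cos(theta))

random.seed(4)
dirs = [sample_hemisphere(random.random(), random.random())
        for _ in range(100_000)]
mean_z = sum(d[2] for d in dirs) / len(dirs)
print(mean_z)  # close to 1/2
```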
Rejection Sampling
Challenge: What happens when \Omega is complicated?
Solution: Generate samples in a simpler enclosing domain \Omega' \supseteq \Omega and discard the samples that are not in \Omega.
Note: The efficiency of the sampling is Area(\Omega)/Area(\Omega'), the fraction of proposed samples that are kept.
Challenge: How do we sample according to a function f that is complicated (e.g. an unknown or unnormalized PDF)?
Solution: Find a simple PDF p and a constant c such that f \le c\,p:
– Generate a sample x with probability p
– Generate a uniform random value \xi' \in [0, c\,p(x)]
– Keep x if \xi' \le f(x)
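The rejection loop can be sketched as follows (a minimal example, assuming the illustrative target f(x) = sin(\pi x) on [0,1] with a uniform proposal p(x) = 1 and c = 1, so that f \le c\,p):

```python
import math
import random

def rejection_sample(f, c, n):
    """Draw n samples on [0,1] with density proportional to f, using the
    uniform proposal p(x) = 1 and a constant c with f(x) <= c*p(x)."""
    out = []
    while len(out) < n:
        x = random.random()             # generate x with probability p
        xi = random.uniform(0.0, c)     # uniform value in [0, c*p(x)]
        if xi <= f(x):                  # keep x if xi <= f(x)
            out.append(x)
    return out

random.seed(5)
xs = rejection_sample(lambda x: math.sin(math.pi * x), 1.0, 50_000)
print(sum(xs) / len(xs))  # the density is symmetric about 1/2, so mean ~0.5
```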
Metropolis Sampling
Solution: Assuming f > 0 and a tentative transition function T, we can get uniqueness and convergence by setting the acceptance probability:
a(x_1 \to x_2) = \min\left\{1,\ \frac{f(x_2)\,T(x_2 \to x_1)}{f(x_1)\,T(x_1 \to x_2)}\right\}
Solution: More generally, if the state space is discrete, the probability of choosing state x_j after one mutation is:
f_j = f_j\,T_{jj} + \sum_{i \ne j} f_j\,T_{ji}\,(1 - a_{ji}) + \sum_{i \ne j} f_i\,T_{ij}\,a_{ij}
(the probability of starting at x_j and proposing x_j, plus the probability of starting at x_j and not mutating out, plus the probability of starting elsewhere and mutating in). Since T_{jj} = 1 - \sum_{i \ne j} T_{ji}, this reduces to:
\sum_{i \ne j} f_j\,T_{ji}\,a_{ji} = \sum_{i \ne j} f_i\,T_{ij}\,a_{ij}
which is satisfied if we ensure detailed balance:
f_j\,T_{ji}\,a_{ji} = f_i\,T_{ij}\,a_{ij} for all i \ne j
This not only ensures that mutating f we get back f, but also that repeatedly mutating any PDF q we converge to a multiple of f.
Challenge 2: How should we choose the initial state x?
Solution (Burn-In): Run the chain for many iterations and discard the initial samples.
Solution: More generally, we can think of Metropolis sampling as taking some PDF p and mutating it into a new PDF q:
q_i = \sum_j p_j\,T_{ji}\,a_{ji}
Setting M to be the matrix with M_{ij} = T_{ji}\,a_{ji}, and setting p = (p_1,\dots,p_n) and q = (q_1,\dots,q_n), this becomes:
q = Mp
Conditions on M (q = Mp):
1. Stationarity: Applying M to f we should get back f: Mf = f. (f is an eigenvector of M with eigenvalue 1.)
2. Convergence: Repeatedly applying M to a PDF p we should converge to something. (All eigenvalues of M have magnitude at most 1.)
3. Uniqueness: That "something" should be a scalar multiple of f. (All other eigenvalues of M have magnitude strictly less than 1.)
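The scheme can be sketched in a few lines, assuming a symmetric Gaussian proposal T (so the T terms cancel in the acceptance ratio) and an illustrative unnormalized Gaussian target; the names here are illustrative, not from the course code:

```python
import math
import random

def metropolis(f, x0, n_samples, step=1.0, burn_in=1000):
    """Metropolis sampling of an unnormalized density f > 0 with a symmetric
    Gaussian proposal; the first burn_in samples are discarded (burn-in)."""
    x, out = x0, []
    for i in range(n_samples + burn_in):
        x_new = x + random.gauss(0.0, step)   # symmetric proposal T
        a = min(1.0, f(x_new) / f(x))         # acceptance probability
        if random.random() < a:
            x = x_new
        if i >= burn_in:
            out.append(x)
    return out

random.seed(6)
# Target: unnormalized Gaussian exp(-x^2/2); the chain's samples should
# have mean near 0 and variance near 1.
xs = metropolis(lambda x: math.exp(-0.5 * x * x), 0.0, 200_000)
m = sum(xs) / len(xs)
v = sum((x - m) ** 2 for x in xs) / len(xs)
print(m, v)
```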