Frequentist Accuracy of Bayesian Estimates - Stanford University

10 downloads 134 Views 374KB Size Report
E{θ|x} = ∫. Ω t(µ)fµ(x)g(µ) dµ. /∫. Ω. fµ(x)g(µ) dµ. What if we don't know g? Bradley Efron (Stanford University). Frequentist Accuracy of Bayesian Estimates.
Frequentist Accuracy of Bayesian Estimates Bradley Efron Stanford University

Bayesian Inference Parameter: µ ∈ Ω Observed data: x Prior:

π(µ) n o fµ (x), µ ∈ Ω

Probability distributions: Parameter of interest:

θ = t(µ) ,Z t(µ)fµ (x)π(µ) dµ fµ (x)π(µ) dµ

Z E{θ|x} = Ω



What if we don’t know g? Bradley Efron (Stanford University)

Frequentist Accuracy of Bayesian Estimates

2 / 30

Jeffreysonian Bayes Inference “Uninformative Priors”

Jeffreys:

n o 1/2 π(µ) = I(µ) where I(µ) = cov ∇µ log fµ (x)

(the Fisher information matrix) Can still use Bayes theorem but how accurate are the estimates?  Today: frequentist variability of E t(µ)|x

Bradley Efron (Stanford University)

Frequentist Accuracy of Bayesian Estimates

3 / 30

General Accuracy Formula µ and x ∈ Rp x ∼ (µ, Vµ )  T ∂ log fµ (x) αx (µ) = ∇x log fµ (x) = . . . , ∂xi , . . . Lemma ˆ = E t(µ)|x has gradient ∇x E = cov t(µ), αx (µ)|x . E Theorem The delta-method standard deviation of E is   h i ˆ = cov t(µ), αx (µ)|x T Vx cov t(µ), αx (µ)|x 1/2 . sd E Bradley Efron (Stanford University)

Frequentist Accuracy of Bayesian Estimates

4 / 30

Implementation

Posterior Sample µ∗1 , µ∗2 , . . . , µ∗B θˆ∗i = t(µ∗i ) = ti∗

and

(MCMC)

α∗i = αx (µ∗i )

cd ov =

B  X

 . ¯ ti∗ − ¯t B α∗i − α

i=1

1/2    T b E ˆ = cd ov sd ov Vx cd

Bradley Efron (Stanford University)

Frequentist Accuracy of Bayesian Estimates

5 / 30

Diabetes Data Efron et al. (2004), “LARS”

n = 442 subjects p = 10 predictors: age, sex, bmi, glu,. . . Response: Model:

y = disease progression at one year

y = X n×1

Bradley Efron (Stanford University)

α + e

n×p p×1

n×1

[e ∼ Nn (0, I)]

Frequentist Accuracy of Bayesian Estimates

6 / 30

15

Diabetes data. Lasso: min{RSS/2 + gamma*L1(beta)} Cp minimized at Step 7, with gamma = .37

10

ltg

map

0

5





tch hdl glu



age

● ● ●

sex

−15

−10

−5

beta vals

bmi ldl

● ●

gamma: 16.44

8.37

0

2

5.84

2.41 4

1.64

1.27 6

Step 7 0.37

0.1 8

0.09

0.04 10

0.02

tc 0 12

step

Bradley Efron (Stanford University)

Frequentist Accuracy of Bayesian Estimates

7 / 30

Bayesian Lasso Park and Casella (2008)

Model:

y ∼ Nn (X α, I)

Prior:

π(α) = e −γL1 (α)

“µ”= α and “x”= y [γ = 0.37]

ˆγ Then posterior mode at Lasso α Subject 125:

θ125 = x T125 α

How accurate are Bayes posterior inferences for θ125 ?

Bradley Efron (Stanford University)

Frequentist Accuracy of Bayesian Estimates

8 / 30

Bayesian Analysis

n o MCMC: posterior sample α∗i for i = 1, 2, . . . , 10, 000 o n Gives θ∗125,i = x T125 α∗i , i = 1, 2, . . . , 10, 000 θ∗125,i ∼˙ 0.248 ± 0.072 General accuracy formula: Other subjects

Bradley Efron (Stanford University)

ˆ = 0.248 frequentist sd 0.071 for E

freq sd < Bayes sd

Frequentist Accuracy of Bayesian Estimates

9 / 30

400 300 0

100

200

Frequency

500

600

700

Figure 2. 10000 MCMC values for Theta =Subject 125 estimate; mean=.248, stdev=.0721; Frequentist Sd for E{theta|data} is .0708, giving Coefficient of Variation 29%

^

[ 0.0

0.1

0.2

^

]



.248 0.3

0.4

0.5

MCMC mu125 values

Bradley Efron (Stanford University)

Frequentist Accuracy of Bayesian Estimates

10 / 30

Posterior CDF of θ125

n o Apply GAC to si∗ = I ti∗ < c so E{s|data} = Pr{θ125 ≤ c|data} c = 0.3 :

Bradley Efron (Stanford University)

ˆ = 0.762 ± 0.304 E % % Bayes GAC sd estimate

Frequentist Accuracy of Bayesian Estimates

11 / 30

0.6 0.4 0.0

0.2

Prob{mu125 < c | data}

0.8

1.0

Figure 3. MCMC posterior cdf of mu125, Diabetes data, Prior exp{−.37*L1(alpha)}; verts are +− One Frequentist Standard Dev

[



]

.336

0.0

0.1

0.2

0.3

0.4

c value Upper 95% credible limit is .336 +− .071

Bradley Efron (Stanford University)

Frequentist Accuracy of Bayesian Estimates

12 / 30

Exponential Families

    Tˆ fα βˆ = e α β−ψ(α) f0 βˆ

“x”= βˆ and “µ”= α

General Accuracy Formula n o ˆ = E t(α) βˆ For E

α = αx (µ)

    1/2 b = cov t, α βˆ T Vαˆ cov t, α βˆ sd n o ¨ α). with Vαˆ = covα=ˆα βˆ = ψ(ˆ

Bradley Efron (Stanford University)

Frequentist Accuracy of Bayesian Estimates

13 / 30

Bayesian Estimation Using the Parametric Bootstrap Efron (2012)

n o Parametric bootstrap: fαˆ (·) → βˆ∗1 , βˆ∗2 , . . . , βˆ∗B n o ˆ = E t(α) βˆ for prior π(α) Want to calculate: E Importance sampling: ˆ E

B X

ti πi Ri

1

    ˆ∗i , πi = π α ˆ∗i , ti = t α

and

,X B

πi Ri

1

Ri = “conversion factor”

    Ri = fαˆ∗i βˆ fαˆ βˆ∗i (Easy in exponential families.) Bradley Efron (Stanford University)

Frequentist Accuracy of Bayesian Estimates

14 / 30

Bootstrap for GAC

.P B pi = πi Ri 1 πj Rj is weight on ith Bootstrap Replication ˆ = PB pi t ∗ E i=1 i     PB ˆ estimates cov α, t βˆ ˆ∗i ti∗ − E cd ov = i=1 pi α  1/2 T b d d sd = cov Vαˆ cov

Bradley Efron (Stanford University)

Frequentist Accuracy of Bayesian Estimates

15 / 30

Prostate Cancer Study Singh et al. (2002)

Microarray study: 102 men — 52 prostate cancer, 50 healthy controls 6033 genes zi test statistic for H0i : “no difference” H0i : zi ∼ N(0, 1) Goal:

identify genes involved in prostate cancer

Bradley Efron (Stanford University)

Frequentist Accuracy of Bayesian Estimates

16 / 30

200 100

Frequency

300

400

Figure 4. Prostate study: 6033 z−values and matching N(0,1) density

0



3 −4

−2

0

2

4

z value

Bradley Efron (Stanford University)

Frequentist Accuracy of Bayesian Estimates

17 / 30

Poisson GLM

Histogram:

49 bins, cj midpoint of bin j

yj = #{zi in bin j} Poisson GLM:

ind

yj ∼ Poi(µj ) log(µ) = poly(c, degree=8)

[MLE:

glm(y ∼ poly(c,8), Poisson)]

Bradley Efron (Stanford University)

Frequentist Accuracy of Bayesian Estimates

18 / 30

Bayesian Estimation for the Poisson Model

Model: Prior: cdf:

y ∼ Poi(µ),

µ = eX α

[X = poly(c,8)]

Jeffreys prior for α α → µ → cdf: Fα (z) =

X

µi

,X

µi

ci ≤z

Fdr parameter: t(α) =

1 − Φ(3) = Fdr(3) 1 − F(3)

[MLE: Fdrαˆ (3) = 0.183]

Bradley Efron (Stanford University)

Frequentist Accuracy of Bayesian Estimates

19 / 30

5

density

10

15

Figure 5. Prostate study: posterior density for Fdr(3) based on 4000 parametric bootstraps from Poisson poly(8) gave posterior (m,sd) =(.183,.025); Frequentist Sd for .183 equalled .026; CV=14%

0



E{t|x}=.183 0.15

0.20

0.25

0.30

Fdr(3) values Internal coefficient of variation = .0023

Bradley Efron (Stanford University)

Frequentist Accuracy of Bayesian Estimates

20 / 30

Model Selection Calculations

Full model:

y ∼ Poi(µ),

Submodel Mm :

µ = eX α

i h α ∈ R9 , µ ∈ R49

{µ : only 1st m + 1 coordinates of β , 0}

Bayesian model selection:

prior probabilities on

and within each Mm Poor man’s model estimates:

partition R49 into

“preference regions“ Rm = {µ closest to Mm } Calculate posterior probabilities Pr{Rm |y}

Bradley Efron (Stanford University)

Frequentist Accuracy of Bayesian Estimates

21 / 30

Posterior Model Probabilities

Distance:

minimum AIC from µ to point in Rm

Preferred model:

is one with smallest distance

Rm

4

5

6

7

8

posterior prob

.36

.12

.05

.02

.45

sd

.32

.16

.08

.03

.40

Bradley Efron (Stanford University)

Frequentist Accuracy of Bayesian Estimates

22 / 30

Empirical Bayes Accuracy zk = observed value for gene k θk = “true effect size” zk ∼ N(θk , 1) Unknown prior g(·) → θ1 , θ2 , . . . , θN

[N = 6033]

From observations z1 , z2 , . . . , zN we wish to estimate ,Z Z  Ez = E t(θ) z = t(θ)fθ (z)g(θ) dθ fθ (z)g(θ) dθ [if t(θ) = δ0 (θ) then Ez = Pr{θ = 0|z}]. Bradley Efron (Stanford University)

Frequentist Accuracy of Bayesian Estimates

23 / 30

Parametric Families of Priors p-parameter exponential family: log gα (θ) = q(θ)T α + constant 1×p

p×1

α the unknown parameter R Marginal: fα (z) = fθ (z)gα (θ) dθ  T e.g., q(θ) = δ0 (θ), θ, θ2 , θ3 , θ4 , θ5

[“spike and slab”] marginal MLE

ˆ α → gα → fα → (z1 , z2 , . . . , zN ) −→ α .R R ˆ z = t(θ)fθ (z)gαˆ (θ) dθ fθ (z)gαˆ (θ) dθ ±?? E MLE:

Bradley Efron (Stanford University)

Frequentist Accuracy of Bayesian Estimates

24 / 30

ˆz Delta-Method Standard Deviation of E (   )1/2  dEz T −1 dEz ˆ Iα sd Ez = dα dα Fisher information matrix: 

I I

¯α = q

R

q(θ)gα (θ) dθ R ¯ ] dθ hα (z) = fθ (z)gα (θ) [q(θ) − q

Z Iα = N

hα (z)hα (z)T dz fα (z)

R dEz ¯ ] dθ where = Ez w(θ)gα (θ) [q(θ) − q dα .R .R w(θ) = t(θ)fθ (z)gα (θ) tfgα − fθ (z)gα (θ) fgα Bradley Efron (Stanford University)

Frequentist Accuracy of Bayesian Estimates

25 / 30

0.6 0.4



0.0

0.2

prob{|theta| >2|z}

0.8

1.0

Figure 6. Estimated prob{abs(theta)>2 | z value} Prostate study; Bars show +− one freq stdev. Using g−model {0,ns(5)}



3 −4

−2

0

2

4

6

z value Prob{abs(theta)>2 | z=3} = .434 +− .071

Bradley Efron (Stanford University)

Frequentist Accuracy of Bayesian Estimates

26 / 30

Estimated False Discovery Rate

Local false discovery rate:

 fdr(z) = Pr {θ = 0|z} = E t(θ)|z

where t(θ) = δ0 (θ) Next:

applied to prostate study

Bradley Efron (Stanford University)

Frequentist Accuracy of Bayesian Estimates

27 / 30

0.6 0.4

prob{theta=0|z}

0.8

1.0

Figure 7. Estimated prob{theta=0 | z value} Prostate study; Bars show +− one freq stdev. Using g−model {0,ns(5)}

0.0

0.2





3 −4

−2

0

2

4

6

z value Prob{theta=0 | z=3} = .322 +− .068, coeff of var =.21

Bradley Efron (Stanford University)

Frequentist Accuracy of Bayesian Estimates

28 / 30

For z = 3:

b {|θ| > 2|z = 3} = 0.43 ± 0.07 Pr c fdr(3) = 0.32 ± 0.07 “locfdr”:

c exfam modeling of z’s, not θ’s, gave fdr(3) = 0.35 ± 0.03

but doesn’t work for |θ| > 2, etc.

Bradley Efron (Stanford University)

Frequentist Accuracy of Bayesian Estimates

29 / 30

References

Park and Casella (2008) “The Bayesian Lasso” JASA 681–686 Singh et al. (2002) “Gene expression correlates of clinical prostate cancer behavior” Cancer Cell 203–209 Efron, Hastie, Johnstone and Tibshirani (2004) “Least angle regression” Ann. Statist. 407–499 Efron (2012) “Bayesian inference and the parametric bootstrap” Ann. Appl. Statist. 1971–1997 Efron (2010) Large-Scale Inference: Empirical Bayes Methods for Estimation, Testing, and Prediction IMS Monographs, Cambridge University Press

Bradley Efron (Stanford University)

Frequentist Accuracy of Bayesian Estimates

30 / 30