Bayesian Sparsity for Intractable Distributions

arXiv:1602.03807v1 [stat.ML] 11 Feb 2016

John B. Ingraham, Debora S. Marks
Harvard Medical School
ingraham@fas.harvard.edu, debbie@hms.harvard.edu

Abstract

Bayesian approaches for single-variable and group-structured sparsity outperform L1 regularization, but are challenging to apply to large, potentially intractable models. Here we show how noncentered parameterizations, a common trick for improving the efficiency of exact inference in hierarchical models, can similarly improve the accuracy of variational approximations. We develop this with two contributions: First, we introduce Fadeout, an approach for variational inference that uses noncentered parameterizations to capture a posteriori correlations between parameters and hyperparameters. Second, we extend stochastic variational inference to undirected models, enabling efficient hierarchical Bayes without approximations of intractable normalizing constants. We find that this framework substantially improves inferences of undirected graphical models under both sparse and group-sparse priors.

1. Introduction

Hierarchical priors that favor sparsity have been a central development in modern statistics and machine learning. While they have a long history within the confines of statistics, it was perhaps L1-regularization and the LASSO (Tibshirani, 1996) that helped instantiate sparsity-promoting priors as standard tools of applied science and engineering. Nevertheless, L1 is a pragmatic compromise: although it is the closest convex approximation of the idealized L0 norm, it cannot model the hypothesis of sparsity as well as some Bayesian alternatives. Bayesian sparsity-promoting priors can yield considerably better results (Tipping, 2001), but only when Bayesian inference is tractable.

Two Bayesian approaches stand out as more accurate models of sparsity than L1. The first, the spike and slab


(Mitchell & Beauchamp, 1988), introduces discrete latent variables that directly model the presence or absence of each parameter. This discrete approach is typically viewed as the ideal prior for sparsity (Mohamed et al., 2011), but the discrete latent space that it imposes is often too significant a barrier for high-dimensional models. The second approach to Bayesian sparsity uses the scale mixtures of normals (Andrews & Mallows, 1974), a family of distributions that arise from integrating a zero meanGaussian over an unknown variance   Z ∞ θ2 1 √ exp − 2 p(σ)dσ (1) p(θ) = 2σ 2πσ 0 Scale-mixtures can approximate the discrete spike and slab prior by mixing both large and small values of σ. L1 ’s implicit prior, the Laplacian, is a member of this family and results from exponentially distributed variance σ 2 . Thus, mixing densities p(σ 2 ) with subexponential tails and more mass near the origin can better accomodate sparsity than L1 and are the basis for approaches often referred to as “Sparse Bayesian Learning” (Tipping, 2001). The Studentt and Horseshoe (Carvalho et al., 2010) both incorporate these properties, but unfortunately hinder MAP inference with non-convexities and typically require the use of either HMC or augmentation-based approaches (Polson et al., 2013). Applying these favorable, Bayesian approaches to sparsity has been particularly challenging for discrete, undirected models like Boltzmann Machines. Undirected models possess a representational advantage of capturing ‘collective phenomena’ with no clear directions of causality, but their likelihoods require an intractable normalizing constant (Murray & Ghahramani, 2004). For a fully observed Boltzmann Machine with x ∈ {0, 1}D the distribution1 is   X  1 p(x|J) = exp Jij xi xj (2)   Z(J) i