Deep Belief Networks

Intro to Deep Neural Networks, 26th to 27th August 2016
Supervised by Dr. Asifullah
Presented by Muhammad Islam (DCIS, PIEAS)

Pattern Recognition Lab Department of Computer Science & Information Sciences Pakistan Institute of Engineering & Applied Sciences

Motivation: Applications of DBNs • Object Recognition

Deep Belief Network

2

Applications of DBNs (cont.) • Image Retrieval

Applications of DBNs (cont.) • Document Modeling

Applications of DBNs (cont.) • Document Retrieval

Background • Deep neural networks existed well before 2000

Background • However, training deep networks was quite difficult


Background • Hence other, simpler algorithms prevailed


Background • Now the situation has changed


Background
• Deep Belief Networks became popular in 2006
• The most prominent work was done by Geoffrey Hinton
• A great deal of research followed

• And now more powerful tools exist


Introduction
• Deep Belief Networks are basically directed graphs
• Built in the form of stacks using individual units called Restricted Boltzmann Machines

[Figure: example DBN over a 28 x 28 pixel image, with hidden layers of 500, 500, and 2000 units]

Introduction • The keyword “Belief” indicates an important property: the network is a generative model that assigns probabilities (beliefs) to the data it models


Boltzmann Machines
• Stochastic generative model
• Estimates the distribution of observations (say p(image)) instead of their classification p(label|image)
• One input layer and one hidden layer
• An energy of the network and a probability for each unit's state are defined

Restricted Boltzmann Machines
• Feed-forward graph structure with two layers
• A visible layer (binary or Gaussian units) and a hidden layer (usually binary units)
• No intra-layer connections
• Given one layer, the units of the other layer are conditionally independent

BM vs RBM

[Figure: a Boltzmann Machine vs a Restricted Boltzmann Machine, showing hidden layer h and visible layer v]

Restricted Boltzmann Machines
• Two quantities define an RBM:
• The states of all the units: obtained through a probability distribution
• The weights of the network: obtained through training (Contrastive Divergence)

Restricted Boltzmann Machines
• Energy is defined for the RBM as:

$E(v, h) = -\sum_i a_i v_i - \sum_j b_j h_j - \sum_i \sum_j v_i w_{ij} h_j$

where $E$ is the energy of the given RBM, $a_i$ is the bias of visible unit $i$, $b_j$ is the bias of hidden unit $j$, and $w_{ij}$ is the weight between visible unit $i$ and hidden unit $j$.
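As an illustration, the energy above can be computed directly. This is a minimal NumPy sketch with hypothetical layer sizes (4 visible, 3 hidden units); the `energy` function name is illustrative, not from the slides:

```python
import numpy as np

# A toy RBM with hypothetical sizes: 4 visible units, 3 hidden units.
rng = np.random.default_rng(0)
W = rng.normal(0.0, 0.1, size=(4, 3))  # w_ij: weight between v_i and h_j
a = np.zeros(4)                        # visible biases a_i
b = np.zeros(3)                        # hidden biases b_j

def energy(v, h, W, a, b):
    # E(v, h) = -sum_i a_i v_i - sum_j b_j h_j - sum_ij v_i w_ij h_j
    return -(a @ v) - (b @ h) - (v @ W @ h)

v = np.array([1.0, 0.0, 1.0, 1.0])
h = np.array([0.0, 1.0, 1.0])
print(energy(v, h, W, a, b))
```

Low energy corresponds to high probability: configurations {v, h} with smaller E are exponentially more likely under the distribution on the next slide.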

Restricted Boltzmann Machines
• The distribution of the visible layer of the RBM is given by

$P(v) = \frac{1}{Z} \sum_h e^{-E(v, h)}$

where $Z$ is the partition function, defined as the sum of $e^{-E(v, h)}$ over all possible configurations $\{v, h\}$.
• The probability that a hidden unit $j$ is on (binary state 1) is

$P(h_j = 1 \mid v) = \sigma\!\left(b_j + \sum_{i=1}^{m} w_{ij} v_i\right)$

where $\sigma$ is the logistic sigmoid.
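A minimal sketch of this conditional in NumPy, assuming binary hidden units; the names `hidden_probs` and `sample_hidden` are hypothetical:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def hidden_probs(v, W, b):
    # P(h_j = 1 | v) = sigmoid(b_j + sum_i w_ij v_i), for all j at once
    return sigmoid(b + v @ W)

def sample_hidden(v, W, b, rng):
    # Draw binary hidden states from the conditional distribution above
    p = hidden_probs(v, W, b)
    return (rng.random(p.shape) < p).astype(float)

# Toy example: 4 visible units, 3 hidden units, random weights
rng = np.random.default_rng(0)
W = rng.normal(0.0, 0.1, size=(4, 3))
b = np.zeros(3)
v = np.array([1.0, 0.0, 1.0, 1.0])
print(hidden_probs(v, W, b))
```

Because the hidden units are conditionally independent given v, all of P(h_j = 1 | v) can be computed in one matrix-vector product, which is what makes RBM inference cheap.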

Restricted Boltzmann Machines
• For calculating a particular weight between two units:

$\frac{\partial \log p(v)}{\partial w_{ij}} = \langle v_i h_j \rangle_{data} - \langle v_i h_j \rangle_{model}$

and

$\Delta w_{ij} \propto \frac{\partial \log p(v)}{\partial w_{ij}}$

hence

$\Delta w_{ij} = \epsilon \left( \langle v_i h_j \rangle_{data} - \langle v_i h_j \rangle_{model} \right)$

where $\epsilon$ is the learning rate.
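In practice the model expectation above is intractable, and Contrastive Divergence (CD-1) approximates it with a one-step Gibbs reconstruction. A minimal sketch for one binary training vector, assuming NumPy; the function name `cd1_update` is hypothetical:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def cd1_update(v0, W, a, b, rng, lr=0.1):
    # Positive phase: hidden probabilities and a binary sample from the data
    ph0 = sigmoid(b + v0 @ W)
    h0 = (rng.random(ph0.shape) < ph0).astype(float)
    # Negative phase: one Gibbs step gives the "model" statistics
    pv1 = sigmoid(a + h0 @ W.T)   # reconstruction of the visible layer
    ph1 = sigmoid(b + pv1 @ W)
    # Delta w_ij = lr * (<v_i h_j>_data - <v_i h_j>_model)
    W = W + lr * (np.outer(v0, ph0) - np.outer(pv1, ph1))
    a = a + lr * (v0 - pv1)
    b = b + lr * (ph0 - ph1)
    return W, a, b

rng = np.random.default_rng(0)
W, a, b = np.zeros((4, 3)), np.zeros(4), np.zeros(3)
W, a, b = cd1_update(np.array([1.0, 0.0, 1.0, 0.0]), W, a, b, rng)
```

Using probabilities (rather than samples) for the reconstruction statistics is a common variance-reduction choice; the sketch follows that convention.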

Training an RBM


Contrastive Divergence


Training DBNs
• First train a layer of features that receives input directly from the pixels.
• Then treat the activations of the trained features as if they were pixels, and learn features of features in a second hidden layer.
• It can be proved that each time we add another layer of features, we improve a variational lower bound on the log probability of the training data.
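The greedy layer-wise procedure above can be sketched as follows, assuming NumPy, binary data, and the CD-1 update; all function names, layer sizes, and hyperparameters are illustrative:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def train_rbm(data, n_hidden, rng, lr=0.1, epochs=5):
    """Minimal CD-1 loop for one RBM layer (illustrative, not optimized)."""
    n_visible = data.shape[1]
    W = rng.normal(0.0, 0.01, (n_visible, n_hidden))
    a = np.zeros(n_visible)
    b = np.zeros(n_hidden)
    for _ in range(epochs):
        for v0 in data:
            ph0 = sigmoid(b + v0 @ W)
            h0 = (rng.random(n_hidden) < ph0).astype(float)
            pv1 = sigmoid(a + h0 @ W.T)
            ph1 = sigmoid(b + pv1 @ W)
            W += lr * (np.outer(v0, ph0) - np.outer(pv1, ph1))
            a += lr * (v0 - pv1)
            b += lr * (ph0 - ph1)
    return W, b

def train_dbn(data, layer_sizes, seed=0):
    """Greedy layer-wise training: each trained layer's hidden
    activations become the 'pixels' for the next layer."""
    rng = np.random.default_rng(seed)
    layers, x = [], data
    for n_hidden in layer_sizes:
        W, b = train_rbm(x, n_hidden, rng)
        layers.append((W, b))
        x = sigmoid(b + x @ W)  # feature activations feed the next RBM
    return layers

# Tiny hypothetical dataset: 20 binary vectors of length 6
rng = np.random.default_rng(0)
data = (rng.random((20, 6)) > 0.5).astype(float)
layers = train_dbn(data, [4, 3])
```

Each RBM is trained in isolation on the representation produced by the stack below it, which is exactly the step that the variational-bound argument justifies.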

Training DBNs

References
• DBN lecture by Geoffrey Hinton; videos and slides at http://videolectures.net/mlss09uk_hinton_dbn/
• Figures:
  • http://cvn.ecp.fr/personnel/iasonas/course/DL5.pdf
  • https://www.cs.toronto.edu/~hinton/nipstutorial/nipstut3.pdf
  • BM vs RBM: slide 7; Document Modeling: slide 51; Document Retrieval: slide 53; Image Retrieval: slide 56 (the given slide numbers refer to the source documents)
  • Training RBMs: slide 20; Contrastive Divergence: slide 21; digit recognition of 2: slide 24; DBN model in Introduction: slide 55
• Misc:
  • http://www.cs.nyu.edu/~yann/research/norb/training-testing.png
  • http://www.cloudpointtech.com/wp-content/uploads/2015/09/Old-Computers.jpg
  • https://acom.azurecomcdn.net/80C57D/cdn/mediahandler/docarticles/dpsmedia-prod/azure.microsoft.com/enus/documentation/articles/machine-learning-algorithm-choice/20160816064407/image2.png
  • https://acom.azurecomcdn.net/80C57D/cdn/mediahandler/docarticles/dpsmedia-prod/azure.microsoft.com/enus/documentation/articles/machine-learning-algorithm-choice/20160816064407/image7.png
  • http://ogrisel.github.io/scikit-learn.org/sklearn-tutorial/_images/plot_mean_shift_11.png
  • http://docs.nvidia.com/cuda/cuda-c-programming-guide/graphics/floating-point-operations-per-second.png

Thanks
