Deep Belief Networks

Intro to Deep Neural Networks, 26th to 27th August 2016
Supervised by Dr. Asifullah
Presented by Muhammad Islam (DCIS, PIEAS)

Pattern Recognition Lab Department of Computer Science & Information Sciences Pakistan Institute of Engineering & Applied Sciences

Motivation: Applications of DBNs • Object Recognition

Deep Belief Network

2

Applications of DBNs (cont.) • Image Retrieval

Applications of DBNs (cont.) • Document Modeling

Applications of DBNs (cont.) • Document Retrieval

Background • Deep neural networks existed well before 2000

Background • However, training deep networks was quite difficult


Background • Hence other, simpler algorithms prevailed


Background • Now the situation has changed


Background
• Deep Belief Networks became popular in 2006
• The most prominent work was done by Geoffrey Hinton
• A great deal of research followed

• And now more powerful tools exist


Introduction
• Deep Belief Networks are basically directed graphs
• Built in the form of stacks using individual units called Restricted Boltzmann Machines

[Figure: example DBN over a 28 x 28 pixel image, with hidden layers of 500, 500, and 2000 units]

Introduction • The keyword “Belief” indicates an important property: the network is a generative model that assigns probabilities (beliefs) to the data it models


Boltzmann Machines
• Stochastic generative model
• Estimates the distribution of observations (say p(image)) instead of their classification p(label|image)
• One input layer and one hidden layer
• An energy of the network and a probability for each unit's state are defined

Restricted Boltzmann Machines
• Feed-forward graph structure with two layers
• A visible layer (binary or Gaussian units) and a hidden layer (usually binary units)
• No intra-layer connections
• Given one layer, the units of the other layer are conditionally independent

BM vs RBM

[Figure: a Boltzmann Machine vs a Restricted Boltzmann Machine, showing hidden layer h and visible layer v]

Restricted Boltzmann Machines
• Two quantities define an RBM:
• The states of all the units: obtained through a probability distribution
• The weights of the network: obtained through training (Contrastive Divergence)

Restricted Boltzmann Machines
• Energy is defined for the RBM as:

$E(v, h) = -\sum_i a_i v_i - \sum_j b_j h_j - \sum_i \sum_j v_i w_{ij} h_j$

where $E$ is the energy of the given RBM, $a_i$ is the bias of visible unit $i$, $b_j$ is the bias of hidden unit $j$, and $w_{ij}$ is the weight between visible unit $i$ and hidden unit $j$.
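As an illustration, the energy above can be computed directly. This is a minimal NumPy sketch with hypothetical layer sizes (4 visible, 3 hidden units); the `energy` function name is illustrative, not from the slides:

```python
import numpy as np

# A toy RBM with hypothetical sizes: 4 visible units, 3 hidden units.
rng = np.random.default_rng(0)
W = rng.normal(0.0, 0.1, size=(4, 3))  # w_ij: weight between v_i and h_j
a = np.zeros(4)                        # visible biases a_i
b = np.zeros(3)                        # hidden biases b_j

def energy(v, h, W, a, b):
    # E(v, h) = -sum_i a_i v_i - sum_j b_j h_j - sum_ij v_i w_ij h_j
    return -(a @ v) - (b @ h) - (v @ W @ h)

v = np.array([1.0, 0.0, 1.0, 1.0])
h = np.array([0.0, 1.0, 1.0])
print(energy(v, h, W, a, b))
```

Low energy corresponds to high probability: configurations {v, h} with smaller E are exponentially more likely under the distribution on the next slide.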

Restricted Boltzmann Machines
• The distribution of the visible layer of the RBM is given by

$P(v) = \frac{1}{Z} \sum_h e^{-E(v, h)}$

where $Z$ is the partition function, defined as the sum of $e^{-E(v, h)}$ over all possible configurations $\{v, h\}$.
• The probability that a hidden unit $j$ is on (binary state 1) is

$P(h_j = 1 \mid v) = \sigma\!\left(b_j + \sum_{i=1}^{m} w_{ij} v_i\right)$

where $\sigma$ is the logistic sigmoid.
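A minimal sketch of this conditional in NumPy, assuming binary hidden units; the names `hidden_probs` and `sample_hidden` are hypothetical:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def hidden_probs(v, W, b):
    # P(h_j = 1 | v) = sigmoid(b_j + sum_i w_ij v_i), for all j at once
    return sigmoid(b + v @ W)

def sample_hidden(v, W, b, rng):
    # Draw binary hidden states from the conditional distribution above
    p = hidden_probs(v, W, b)
    return (rng.random(p.shape) < p).astype(float)

# Toy example: 4 visible units, 3 hidden units, random weights
rng = np.random.default_rng(0)
W = rng.normal(0.0, 0.1, size=(4, 3))
b = np.zeros(3)
v = np.array([1.0, 0.0, 1.0, 1.0])
print(hidden_probs(v, W, b))
```

Because the hidden units are conditionally independent given v, all of P(h_j = 1 | v) can be computed in one matrix-vector product, which is what makes RBM inference cheap.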

Restricted Boltzmann Machines
• For calculating a particular weight between two units:

$\frac{\partial \log p(v)}{\partial w_{ij}} = \langle v_i h_j \rangle_{data} - \langle v_i h_j \rangle_{model}$

and

$\Delta w_{ij} \propto \frac{\partial \log p(v)}{\partial w_{ij}}$

hence

$\Delta w_{ij} = \epsilon \left( \langle v_i h_j \rangle_{data} - \langle v_i h_j \rangle_{model} \right)$

where $\epsilon$ is the learning rate.
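In practice the model expectation above is intractable, and Contrastive Divergence (CD-1) approximates it with a one-step Gibbs reconstruction. A minimal sketch for one binary training vector, assuming NumPy; the function name `cd1_update` is hypothetical:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def cd1_update(v0, W, a, b, rng, lr=0.1):
    # Positive phase: hidden probabilities and a binary sample from the data
    ph0 = sigmoid(b + v0 @ W)
    h0 = (rng.random(ph0.shape) < ph0).astype(float)
    # Negative phase: one Gibbs step gives the "model" statistics
    pv1 = sigmoid(a + h0 @ W.T)   # reconstruction of the visible layer
    ph1 = sigmoid(b + pv1 @ W)
    # Delta w_ij = lr * (<v_i h_j>_data - <v_i h_j>_model)
    W = W + lr * (np.outer(v0, ph0) - np.outer(pv1, ph1))
    a = a + lr * (v0 - pv1)
    b = b + lr * (ph0 - ph1)
    return W, a, b

rng = np.random.default_rng(0)
W, a, b = np.zeros((4, 3)), np.zeros(4), np.zeros(3)
W, a, b = cd1_update(np.array([1.0, 0.0, 1.0, 0.0]), W, a, b, rng)
```

Using probabilities (rather than samples) for the reconstruction statistics is a common variance-reduction choice; the sketch follows that convention.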

Training an RBM


Contrastive Divergence


Training DBNs
• First train a layer of features that receives input directly from the pixels.
• Then treat the activations of the trained features as if they were pixels, and learn features of features in a second hidden layer.
• It can be proved that each time we add another layer of features, we improve a variational lower bound on the log probability of the training data.
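The greedy layer-wise procedure above can be sketched as follows, assuming NumPy, binary data, and the CD-1 update; all function names, layer sizes, and hyperparameters are illustrative:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def train_rbm(data, n_hidden, rng, lr=0.1, epochs=5):
    """Minimal CD-1 loop for one RBM layer (illustrative, not optimized)."""
    n_visible = data.shape[1]
    W = rng.normal(0.0, 0.01, (n_visible, n_hidden))
    a = np.zeros(n_visible)
    b = np.zeros(n_hidden)
    for _ in range(epochs):
        for v0 in data:
            ph0 = sigmoid(b + v0 @ W)
            h0 = (rng.random(n_hidden) < ph0).astype(float)
            pv1 = sigmoid(a + h0 @ W.T)
            ph1 = sigmoid(b + pv1 @ W)
            W += lr * (np.outer(v0, ph0) - np.outer(pv1, ph1))
            a += lr * (v0 - pv1)
            b += lr * (ph0 - ph1)
    return W, b

def train_dbn(data, layer_sizes, seed=0):
    """Greedy layer-wise training: each trained layer's hidden
    activations become the 'pixels' for the next layer."""
    rng = np.random.default_rng(seed)
    layers, x = [], data
    for n_hidden in layer_sizes:
        W, b = train_rbm(x, n_hidden, rng)
        layers.append((W, b))
        x = sigmoid(b + x @ W)  # feature activations feed the next RBM
    return layers

# Tiny hypothetical dataset: 20 binary vectors of length 6
rng = np.random.default_rng(0)
data = (rng.random((20, 6)) > 0.5).astype(float)
layers = train_dbn(data, [4, 3])
```

Each RBM is trained in isolation on the representation produced by the stack below it, which is exactly the step that the variational-bound argument justifies.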

Training DBNs

References
• DBN lecture by Geoffrey Hinton; videos and slides at http://videolectures.net/mlss09uk_hinton_dbn/
• Figures:
  • http://cvn.ecp.fr/personnel/iasonas/course/DL5.pdf
  • https://www.cs.toronto.edu/~hinton/nipstutorial/nipstut3.pdf
  • BM vs RBM: slide 7; Document Modeling: slide 51; Document Retrieval: slide 53; Image Retrieval: slide 56 (the given slide numbers refer to the source documents)
  • Training RBMs: slide 20; Contrastive Divergence: slide 21; digit recognition of 2: slide 24; DBN model in Introduction: slide 55
• Misc:
  • http://www.cs.nyu.edu/~yann/research/norb/training-testing.png
  • http://www.cloudpointtech.com/wp-content/uploads/2015/09/Old-Computers.jpg
  • https://acom.azurecomcdn.net/80C57D/cdn/mediahandler/docarticles/dpsmedia-prod/azure.microsoft.com/enus/documentation/articles/machine-learning-algorithm-choice/20160816064407/image2.png
  • https://acom.azurecomcdn.net/80C57D/cdn/mediahandler/docarticles/dpsmedia-prod/azure.microsoft.com/enus/documentation/articles/machine-learning-algorithm-choice/20160816064407/image7.png
  • http://ogrisel.github.io/scikit-learn.org/sklearn-tutorial/_images/plot_mean_shift_11.png
  • http://docs.nvidia.com/cuda/cuda-c-programming-guide/graphics/floating-point-operations-per-second.png

Thanks
