Deep Learning Based Digital Signal Modulation Recognition

Junqiang Fu, Chenglin Zhao, Bin Li, Xiao Peng
Key Lab of Universal Wireless Communications, MOE Wireless Network Lab
Beijing University of Posts and Telecommunications, Beijing, China
E-mail: [email protected]

Abstract. In this investigation, we propose a promising digital signal modulation recognition scheme inspired by deep learning. First, the signal discriminations are constructed from the full temporal characteristics of digital signals, their frequency spectrum, and several higher-order spectral characteristics. A deep learning algorithm, with its powerful interpretation and learning ability, is then applied to realize modulation recognition. A major advantage of this new scheme is that it exploits the complete information of the digital signal, rather than only a few extracted features. Experimental simulations verify that the recognition accuracy of the proposed scheme is much higher than that of traditional recognition methods, which makes it an attractive approach to realistic modulation recognition.

Keywords: modulation recognition; spectral characteristics; amplitude characteristics; deep learning

1. INTRODUCTION

In today's complex and diverse communications environment, automatic digital signal modulation recognition has become a key technology of intelligent signal processing and analysis. The applications of modulation recognition now reach almost every domain of commercial and military communications. In general, there are three kinds of traditional methods: decision tree classifiers, cluster analysis, and neural network classifiers. Nandi [1] and Azzouz [2] proposed a decision tree algorithm that makes a hard decision based on several characteristic parameters. It has low computational complexity and is intuitive, but it is susceptible to noise; in practice, it usually has to be combined with other methods. Cluster analysis [3] is a multivariate statistical classification method that blindly classifies unlabeled samples according to the pattern similarity among them, in most cases using the Euclidean distance to characterize signal similarity: the smaller the distance, the higher the similarity. Unfortunately, cluster analysis is noise sensitive, and its identification performance depends strongly on the extraction of characteristic parameters. As the most popular methods, the back propagation (BP) [4] and radial basis function (RBF) [5] neural networks (NN) have self-learning and generalization abilities and are hence suitable for identifying the underlying nonlinear mapping from input signal to output classification. However, they easily fall into local optimal solutions, their convergence rate slows near the optimum, and they show poor generalization and low recognition rates at low signal-to-noise ratio (SNR).

In this investigation, we propose a promising digital signal modulation recognition scheme based on deep learning. In sharp contrast to existing NN schemes, a deep learning algorithm is adopted to further improve the interpretation and classification capability. First, the signal discriminations are constructed from the full temporal characteristics of digital signals, their frequency spectrum, and several higher-order spectral characteristics. The deep learning [6] algorithm, with its powerful interpretation and learning ability, is then applied to realize modulation recognition. A major advantage of this new scheme is that it exploits the complete information of the digital signal, rather than only a few extracted features. Experimental simulations verify that the recognition accuracy of the proposed scheme is much higher than that of traditional recognition methods, which makes it an attractive approach to realistic modulation recognition.

2. SIGNAL CHARACTERISTIC EXTRACTION

A. Characteristics of the signal frequency spectrum

This paper addresses automatic modulation identification of several commonly used schemes, namely 2ASK, 4ASK, 8ASK, 2PSK, 4PSK, 8PSK, 2FSK, 4FSK, 8FSK, 16QAM and 64QAM, by exploiting the differences in the time domain, frequency spectrum and higher-order spectrum among these modulations in the AWGN channel. The sampled signal in the time domain can be represented as

s(n) = \sum_{k=1}^{K} a_k \cos(2\pi f_0 n T_s + \varphi(n)), \quad n = 1, 2, \ldots, N,  (1)

where N is the total number of sampling points, K is the number of symbols, T_s is the sampling period, f_0 is the carrier frequency, \varphi(n) is the carrier phase and a_k is the amplitude of the k-th symbol. The signal frequency spectrum is then given by the Fourier transform

F(w) = \mathrm{FFT}\{s(n)\} = \langle s(n), \exp(-jwn) \rangle.  (2)

For MFSK signals there are M peaks in the signal frequency spectrum, but only one for the other signals. The traditional methods must extract the number of


peaks to achieve modulation recognition first; the extraction of the carrier therefore directly affects the recognition rate of the system. For MASK signals, there is a large impulse at the carrier frequency. For MPSK and MQAM signals, however, the spectral lines near the carrier decay slowly and show no obvious impulse, because of sudden phase anomalies (SPA).

Fig.1. Signal frequency spectrum, F(w) versus f/kHz: (a) 2FSK, (b) 4FSK, (c) 8FSK, (d) MASK, (e) MPSK, (f) MQAM.
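As a concrete illustration of Eqs. (1)–(2), the following sketch generates a 2FSK burst and locates its two dominant spectral lines with an FFT. The sampling rate, symbol length, and tone frequencies are illustrative assumptions, not parameters from the paper.

```python
import numpy as np

fs = 40_000.0                  # sampling rate in Hz (assumed)
m = 40                         # samples per symbol (assumed)
K = 64                         # number of symbols
tones = [8_000.0, 12_000.0]    # hypothetical 2FSK tone frequencies
rng = np.random.default_rng(0)

bits = rng.integers(0, 2, size=K)
t = np.arange(K * m) / fs
f_inst = np.repeat([tones[b] for b in bits], m)    # active tone per sample
s = np.cos(2 * np.pi * f_inst * t)                 # Eq. (1) with a_k = 1

# Eq. (2): normalized magnitude spectrum
F = np.abs(np.fft.rfft(s))
F /= F.max()
freqs = np.fft.rfftfreq(len(s), 1.0 / fs)

# a 2FSK signal shows two dominant spectral lines, one per tone
peak_freqs = np.sort(freqs[np.argsort(F)[-2:]])
```

A 4FSK or 8FSK signal built the same way would show four or eight such lines, which is exactly the property the recognizer exploits.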

B. Time-domain amplitude characteristics

In the time domain, MPSK and MFSK signals have constant envelopes, whereas MASK and MQAM signals exhibit amplitude variation. The normalized instantaneous amplitude is therefore used for modulation recognition. From (1), the absolute sum of the sample amplitudes within each symbol is

M_k = \sum_{n=(k-1)m+1}^{km} \left| a_k \cos(2\pi f_0 n T_s + \varphi(n)) \right|,  (3)

where m = N/K is the number of sampling points per symbol. The energy-normalized amplitude is then given by

\alpha_k = M_k \Big/ \Big( \frac{1}{K} \sum_{k} M_k \Big).  (4)

The \alpha_k, sorted, form a feature vector, which makes it easy and quick to distinguish 2ASK, 4ASK and 8ASK among the MASK signals, and 16QAM and 64QAM among the MQAM signals.

Fig.2. Signal time-domain normalized amplitude, |s(n)| versus n: (a) 2ASK, (b) 4ASK, (c) 8ASK, (d) MPSK, (e) 16QAM, (f) 64QAM.
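To make Eqs. (3)–(4) concrete, the sketch below computes the sorted, energy-normalized amplitude feature for a 4ASK burst. The carrier, symbol length, and level set {1, 2, 3, 4} are illustrative assumptions; because each symbol here spans an integer number of carrier cycles, M_k comes out exactly proportional to a_k.

```python
import numpy as np

fs = 40_000.0        # sampling rate (assumed)
f0 = 10_000.0        # carrier, an integer number of cycles per symbol
m = 40               # samples per symbol
K = 64               # number of symbols
rng = np.random.default_rng(1)

levels = np.array([1.0, 2.0, 3.0, 4.0])     # hypothetical 4ASK amplitudes
a = levels[rng.integers(0, 4, size=K)]
n = np.arange(K * m)
s = np.repeat(a, m) * np.cos(2 * np.pi * f0 * n / fs)

# Eq. (3): absolute amplitude sum within each symbol
M = np.abs(s).reshape(K, m).sum(axis=1)

# Eq. (4): energy-normalized amplitudes, sorted into a feature vector
alpha = np.sort(M / (M.sum() / K))
```

The sorted vector clusters into four plateaus, one per amplitude level, which is what separates 2ASK, 4ASK and 8ASK (and likewise 16QAM from 64QAM).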

C. Higher-order spectral characteristics

The square spectrum of 2PSK shows a noticeable discrete spectral line at twice the carrier frequency. The fourth-order spectra of 4PSK and 8PSK show different characteristic lines at four times the carrier frequency; in addition, 8PSK signals exhibit relatively high lines near zero frequency, whereas 4PSK signals do not. The higher-order spectra are given by

P_2(w) = \mathrm{FFT}\{s^2(n)\},  (5)

P_4(w) = \mathrm{FFT}\{s^4(n)\}.  (6)

Fig.3. Second-order (square) frequency spectrum, P_2(w) versus f/kHz: (a) 2PSK, (b) 4PSK, (c) 8PSK.

Fig.4. Fourth-order frequency spectrum, P_4(w) versus f/kHz: (a) 2PSK, (b) 4PSK, (c) 8PSK.
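The square-spectrum property of Eq. (5) can be checked numerically: squaring a 2PSK signal removes its 0/π phase modulation (since 2φ is 0 or 2π) and leaves a discrete line at exactly twice the carrier. The parameters below are illustrative assumptions.

```python
import numpy as np

fs = 40_000.0        # sampling rate (assumed)
f0 = 4_000.0         # carrier (assumed)
m = 40               # samples per symbol
K = 64               # number of symbols
rng = np.random.default_rng(2)

# 2PSK: the carrier phase is 0 or pi in each symbol
phase = np.pi * rng.integers(0, 2, size=K)
n = np.arange(K * m)
s = np.cos(2 * np.pi * f0 * n / fs + np.repeat(phase, m))

# Eq. (5): square spectrum; s^2 = 1/2 + (1/2)cos(2*pi*(2*f0)*t),
# so all modulation is stripped and a line appears at 2*f0
P2 = np.abs(np.fft.rfft(s ** 2))
freqs = np.fft.rfftfreq(len(s), 1.0 / fs)
line = freqs[1:][np.argmax(P2[1:])]    # strongest non-DC line
```

The same computation on the fourth power, Eq. (6), yields the coherent line at four times the carrier that distinguishes 4PSK and 8PSK.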

3. DEEP LEARNING ALGORITHM

Deep learning uses a neural network that contains many hidden layers between its input layer and its output layer. Pre-training, an unsupervised, undirected Restricted Boltzmann Machine [7] (RBM) procedure, is essential before the subsequent training. Training has two parts: a top-down generative model and a bottom-up recognition model.


Fig.5. First, a GRBM is trained to model the input feature vectors. The generated binary states of its hidden units are then used to train the next RBM. The top-down model has directed connections, except that its top two layers form an undirected RBM. Finally, the bottom-up model has a "Softmax" classifier with one unit for each modulation.
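The greedy layer-wise procedure of Fig. 5 can be sketched as follows: each RBM layer is trained with contrastive divergence and its hidden activations become the training data for the next layer. The layer widths, epochs, and learning rate here are small illustrative assumptions, not the paper's 300/1000-unit architecture, and the first layer is treated as binary rather than as a GRBM for brevity.

```python
import numpy as np

def logistic(x):
    return 1.0 / (1.0 + np.exp(-x))

def train_rbm(data, n_hidden, epochs=5, eps=0.05, rng=None):
    """Minimal CD-1 training of one binary RBM layer (sketch)."""
    if rng is None:
        rng = np.random.default_rng(0)
    n_visible = data.shape[1]
    W = 0.01 * rng.standard_normal((n_visible, n_hidden))
    a = np.zeros(n_visible)
    b = np.zeros(n_hidden)
    for _ in range(epochs):
        for v in data:
            ph = logistic(b + v @ W)                   # hidden probabilities
            h = (rng.random(n_hidden) < ph).astype(float)
            v1 = logistic(a + h @ W.T)                 # mean-field reconstruction
            ph1 = logistic(b + v1 @ W)
            W += eps * (np.outer(v, ph) - np.outer(v1, ph1))
    return W, b

# greedy stacking: activations of one layer feed the next, as in Fig. 5
rng = np.random.default_rng(0)
data = (rng.random((20, 8)) < 0.5).astype(float)   # toy binary feature vectors
sizes = [8, 6, 4]                                  # layer widths (assumed)
weights = []
x = data
for nh in sizes[1:]:
    W, b = train_rbm(x, nh, rng=rng)
    weights.append(W)
    x = logistic(b + x @ W)    # pass activations up to the next layer
```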

D. Pre-training

An RBM consists of a binary visible layer, which represents the binary data, and a binary hidden layer, with no connections within a layer and undirected connections between the two layers. It uses a matrix of weights, W, to define the joint probability of a visible vector, v, and a hidden vector, h, through an energy function, E:

p(v, h; W) = \exp(-E(v, h; W)) / Z, \quad Z = \sum_{v', h'} \exp(-E(v', h'; W)),  (7)

where Z is a normalizing factor. A joint configuration (v, h) of the RBM has the energy

E(v, h) = -\sum_{i \in \mathrm{visible}} a_i v_i - \sum_{j \in \mathrm{hidden}} b_j h_j - \sum_{i,j} v_i h_j w_{ij},  (8)

where a_i and b_j are the biases of visible unit v_i and hidden unit h_j, respectively, and w_{ij} is the connection to unit j from unit i in the lower layer. The probability that the network assigns to a visible vector is

p(v) = \sum_h \exp(-E(v, h)) / Z.  (9)

The change in a weight is proportional to the derivative of the log probability:

\Delta w_{ij} = \varepsilon (\langle v_i h_j \rangle_{\mathrm{data}} - \langle v_i h_j \rangle_{\mathrm{model}}) = \varepsilon \frac{1}{N} \sum_{n=1}^{N} \frac{\partial \log p(v^n)}{\partial w_{ij}},  (10)

where ε is the learning rate. However, \langle v_i h_j \rangle_{\mathrm{model}} is difficult to obtain and is replaced by \langle v_i h_j \rangle_{\mathrm{recon}}. Given a binary visible vector, v, hidden unit h_j is reconstructed from the conditional probability

p(h_j = 1 \mid v) = \mathrm{logistic}(b_j + \sum_i v_i w_{ij}),  (11)

and then, given a hidden vector, h, visible unit v_i is reconstructed from

p(v_i = 1 \mid h) = \mathrm{logistic}(a_i + \sum_j h_j w_{ij}).  (12)
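The probabilities of Eqs. (7)–(9) can be checked exactly on an intentionally tiny RBM, where the partition function Z is enumerable by brute force; the layer sizes and random weights below are illustrative assumptions.

```python
import itertools
import numpy as np

rng = np.random.default_rng(3)
nv, nh = 3, 2                        # tiny layer sizes so Z is enumerable
W = 0.1 * rng.standard_normal((nv, nh))
a = np.zeros(nv)                     # visible biases
b = np.zeros(nh)                     # hidden biases

def energy(v, h):
    # Eq. (8): E(v,h) = -sum a_i v_i - sum b_j h_j - sum v_i h_j w_ij
    return -(a @ v) - (b @ h) - (v @ W @ h)

vstates = [np.array(s, dtype=float) for s in itertools.product([0, 1], repeat=nv)]
hstates = [np.array(s, dtype=float) for s in itertools.product([0, 1], repeat=nh)]

# Eq. (7): partition function over all joint configurations
Z = sum(np.exp(-energy(v, h)) for v in vstates for h in hstates)

def p_visible(v):
    # Eq. (9): marginal probability of a visible vector
    return sum(np.exp(-energy(v, h)) for h in hstates) / Z

total = sum(p_visible(v) for v in vstates)   # must sum to 1
```

For realistic layer sizes Z is intractable, which is exactly why Eq. (13) substitutes the reconstruction statistics for the model statistics.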

The learning rule for a weight is then

\Delta w_{ij} = \varepsilon (\langle v_i h_j \rangle_{\mathrm{data}} - \langle v_i h_j \rangle_{\mathrm{recon}}).  (13)

An RBM whose input vector is real-valued with Gaussian noise is called a Gaussian-Bernoulli RBM (GRBM). The two conditional probabilities of a GRBM are

p(h_j = 1 \mid v) = \mathrm{logistic}(b_j + \sum_i \frac{v_i}{\sigma_i} w_{ij}),  (14)

p(v_i \mid h) = \mathcal{N}(a_i + \sigma_i \sum_j h_j w_{ij}, \; \sigma_i^2).  (15)
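A minimal sketch of the GRBM conditionals (14)–(15), assuming unit noise scales σ_i = 1 and small random weights (all dimensions and values here are illustrative):

```python
import numpy as np

def logistic(x):
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(4)
nv, nh = 4, 3
W = 0.1 * rng.standard_normal((nv, nh))
a = np.zeros(nv)
b = np.zeros(nh)
sigma = np.ones(nv)                  # per-unit noise scales (assumed to be 1)

v = rng.standard_normal(nv)          # real-valued input feature vector

# Eq. (14): hidden activation probabilities given a real-valued v
ph = logistic(b + (v / sigma) @ W)
h = (rng.random(nh) < ph).astype(float)

# Eq. (15): visible units are Gaussian around the top-down prediction
mean_v = a + sigma * (h @ W.T)
v_sample = rng.normal(mean_v, sigma)
```

This is what lets the first layer of Fig. 5 consume the real-valued spectral and amplitude features directly, while the upper layers stay binary.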

E. Top-down generative model

After pre-training, the network is trained with directed connections. Each hidden unit uses the logistic function as its transfer function, producing output y_j from input x_j through the weights w_{ij}:

y_j = \mathrm{logistic}(x_j) = 1 / (1 + \exp(-x_j)),  (16)

x_j = b_j + \sum_i y_i w_{ij}.  (17)

The top-down generative model has top-down directed connections, except that its top two layers form an undirected RBM. The RBM assigns the probability

p(v; W) = \sum_h p(h; W) \, p(v \mid h; W).  (18)

It is required to reverse the roles of the visible and hidden layers while keeping p(v|h;W) fixed; p(h;W) is then replaced by a hidden vector inferred, using (14), from a sampled visible vector v.

F. Bottom-up recognition model

The bottom-up recognition model has bottom-up directed connections and adds a "Softmax" multiclass classifier, with one unit for each modulation, in the top layer. A class probability is given by the softmax function

p_j = \exp(x_j) \Big/ \sum_n \exp(x_n).  (19)

The cost function is then defined as

C = -\sum_j d_j \log p_j,  (20)

where d_j, taking the value 0 or 1, is the target probability. A weight is updated by

w_{ij}(n) = w_{ij}(n-1) - \varepsilon \frac{\partial C}{\partial w_{ij}(n)},  (21)
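Eqs. (19)–(20) and a gradient step in the spirit of Eq. (21) can be exercised on a toy top layer. The five activations, target class, and learning rate below are illustrative assumptions, and for brevity the update is applied to the activations rather than to real network weights.

```python
import numpy as np

def softmax(x):
    # Eq. (19): class probabilities from the top-layer activations
    e = np.exp(x - x.max())          # shifted for numerical stability
    return e / e.sum()

# hypothetical top-layer activations for five modulation classes
x = np.array([1.0, 2.5, 0.3, -1.0, 0.7])
p = softmax(x)

d = np.array([0.0, 1.0, 0.0, 0.0, 0.0])   # one-hot target, as in Eq. (20)
C = -np.sum(d * np.log(p))                # cross-entropy cost

# one gradient step in the spirit of Eq. (21); for the softmax /
# cross-entropy pair, the gradient w.r.t. the activations is p - d
eps = 0.05
x_new = x - eps * (p - d)
C_new = -np.sum(d * np.log(softmax(x_new)))
```

A single step strictly reduces the cost here, which is the behavior the full backpropagation pass over the stacked network relies on.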

where 0