Traffic Prediction and Network Resources Estimation of VBR MPEG-2 Sources Using Adaptively Trained Neural Networks

Anastasios D. Doulamis, Nikolaos D. Doulamis and Stefanos D. Kollias

Abstract—In this paper, a unified non-linear model is proposed that is appropriate both for on-line traffic prediction and for network resource estimation in the case of VBR MPEG-2 coded video sources. A feedforward neural network architecture with tapped delay inputs is adopted to implement the non-linear model structure. For on-line traffic modeling, a weight adaptation algorithm is activated to modify the model parameters when abrupt changes in the traffic statistics alter its local characteristics. For off-line traffic modeling, where the system is used as a network resource estimator, an error correlation mechanism is proposed to relate the rates of I, P and B frames. Experimental results with real-life video sequences of long duration show the good performance of the proposed scheme both as a traffic rate predictor and as a network resource estimator.

Index terms—On-line and off-line traffic modeling, VBR MPEG-2 video sources, non-linear models

The authors are with the National Technical University of Athens (NTUA), Department of Electrical and Computer Engineering, 9 Heroon Polytechniou Str., 15773 Zografou, Athens, Greece. Corresponding author email: [email protected].

I. INTRODUCTION

Two main modes are used for encoding a video source: Constant Bit Rate (CBR) and Variable Bit Rate (VBR) [1], [2]. In VBR mode, the output rate fluctuates according to the video activity so that picture quality stays constant. In CBR mode, a rate control mechanism is activated during periods of high or low activity to guarantee a target output rate, at the cost of varying picture quality. From an engineering point of view, CBR is easier to implement than VBR, because communication lines are of constant capacity and it is therefore straightforward to multiplex several independent CBR sources onto a common line. However, users are frequently interested in constant picture quality regardless of scene complexity and thus usually prefer VBR coding to CBR. To transmit a VBR source over a channel of constant capacity, however, the bandwidth should be selected equal to the source peak rate, resulting in a wasteful allocation scheme. To overcome this difficulty, several VBR sources are multiplexed into a common buffer of constant output rate so that the aggregate rate tends to smooth out around the average (statistical multiplexing) [3]. Losses then occur when the buffer overflows, degrading picture quality [4]. For a given quality, the loss probability should not exceed a certain limit, and thus it is necessary to build some protection against losses so that an acceptable Quality of Service (QoS) level is guaranteed to the users. As a result, efficient network operation requires statistical characterization and modeling of the transmitted information.

In general, two cases can be distinguished. The first concerns the development of statistical models able to (i) capture traffic statistics, (ii) simulate traffic behavior and (iii) estimate network resources with high accuracy. Such models can be used, for example, as video generators to select appropriate network parameters during the network design phase, such as the utilization and/or the number of multiplexed sources that achieve an acceptable video quality and/or delay (off-line traffic modeling). The second case concerns the network operation phase, where the models are applied to predict future rates based on previous actual samples of the traffic (on-line traffic modeling). Traffic management algorithms and congestion control schemes, which protect the network from possible overload or from violation of the negotiated QoS parameters, can benefit from such modeling.

Several VBR video models have been proposed in the literature for both cases. Some typical works on off-line traffic modeling have been reported in [3]-[8]. However, these models cannot be directly applied to VBR MPEG-2 coded video sources, since different coding methods result in different traffic statistics [2].
Some statistical properties and basic characteristics of MPEG coded video streams have recently been analyzed, such as the higher average rate of Intra frames compared with Inter frames and the periodicity of the autocorrelation function of MPEG sequences [9]-[12]. Linear models whose parameters are associated with different video activity or content have been proposed in [13] and [14] for VBR MPEG sources. However, these models cannot be applied to on-line traffic modeling, since they are oriented only to capturing the traffic statistics. Several works in the literature deal with on-line traffic modeling using either linear or non-linear models [2], [15], [16]. However, the algorithms adopted are suited to "smooth" video traffic, such as videoconferencing sequences or video sources consisting of a single shot, and they cannot easily be extended to long-duration sequences where high rate variation is encountered. Furthermore, they are not suitable for off-line traffic modeling.

In this paper, a unified non-linear model is presented that is appropriate for both on-line and off-line traffic modeling. The model is based on a recursive implementation of a non-linear autoregressive (NAR) system. The optimal mean-squared error predictor of the NAR model is implemented using a feedforward neural network with a tapped delay line (TDL) filter as its input. First, the proposed scheme is used for on-line traffic prediction. In this case, an efficient weight adaptation algorithm is proposed to improve the network performance, especially at highly varying frame rates, and at the same time to give the system better tracking capabilities. Then, the same neural network architecture is applied to the off-line traffic modeling case.

Figure 1. A graphical representation of an on-line traffic modeling architecture.

Figure 2. A graphical representation of an off-line traffic modeling architecture. (Diagram labels: Neural Network Model, 1-lag Delay elements, error input $r^c(n)$, summation node.)

In MPEG-2 video traffic, three different coding modes are available: Intraframe (I), Predictive (P) and Bi-directionally predictive (B). Since these frame types are generated using different coding schemes, they are characterized by different statistical properties. For example, the size of I frames is usually much greater than the size of P and B frames, since I frames are compressed only in the spatial direction. P and B frames, in contrast, present the highest fluctuation, because they may be coded in intra mode, i.e., without applying the motion estimation/compensation algorithm, especially during periods of high motion activity. For this reason, three different models are used for both on-line and off-line traffic modeling, each appropriate for a specific type of frame.
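As a small illustration of how a frame-size trace can be separated so that one model is fitted per frame type, the sketch below groups the samples by their position in a repeating GOP pattern. The pattern string `IBBPBBPBBPBB`, the function name `split_by_frame_type` and the toy trace are assumptions made for this example only; the GOP structure of the sequences used in the paper is not stated.

```python
from collections import defaultdict


def split_by_frame_type(frame_sizes, gop_pattern="IBBPBBPBBPBB"):
    """Group frame sizes by frame type, assuming a fixed repeating GOP pattern.

    gop_pattern is a hypothetical example; a real stream may use another one.
    """
    streams = defaultdict(list)
    for index, size in enumerate(frame_sizes):
        # The frame type is given by the position of the frame inside its GOP.
        frame_type = gop_pattern[index % len(gop_pattern)]
        streams[frame_type].append(size)
    return dict(streams)


# Toy trace (arbitrary units), two GOPs long; the values are made up.
trace = [90, 10, 12, 40, 11, 9, 38, 10, 13, 42, 12, 11] * 2
streams = split_by_frame_type(trace)
```

Each of the three resulting lists can then be treated as a separate time series, which is the sense in which a separate model per frame type is used in what follows.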

II. ON-LINE/OFF-LINE TRAFFIC MODELING

Figure 1 illustrates a block diagram of an on-line traffic modeling system. The actual video samples are fed to the model during real-time video transmission, while its output predicts future samples of the traffic. The accuracy of the model is evaluated at the system output, where the prediction error is estimated. An off-line traffic modeling system is depicted in Figure 2. As can be seen, the model operates in a recursive, autonomous mode. In this case, the model accuracy depends on the statistics of the error, which is fed as an input to the system. These statistics, however, are similar to the error statistics of the on-line modeling case. The only difference is that the error is generated independently, since the actual samples are not available. It should be mentioned that the model output in off-line traffic modeling does not correspond to the actual samples of the traffic; it only simulates the MPEG video sequences and captures their statistics, such as the variance and the autocorrelation function.

(Figure 1 diagram labels: inputs $x^c(n-p), \ldots, x^c(n-1)$; Neural Network Model; predicted output $\hat{x}^c(n)$; actual sample $x^c(n)$; difference node; Prediction Error $e^c(n)$.)

In the on-line traffic modeling case, however, it is often more useful to predict the activity of the video traffic than the actual size of each type of frame. This is because, in VBR transmission, periods of high video activity overload the network buffers, while periods of low activity empty them. As an estimate of the video activity of MPEG-2 coded data, the average frame rate over a group of pictures (GOP) period can be used, since it is shown in [11] that this signal behaves in the same way as the actual activity.

III. MPEG-2 NON-LINEAR MODELING

In this paper, a non-linear autoregressive model is used for both off-line and on-line traffic modeling. The non-linear model can be implemented, to any acceptable accuracy, by a feedforward neural network architecture with a tapped delay line filter as its input [17]. The number of delay inputs expresses the order of the model, i.e., how many past samples are taken into account to predict the rate of the current sample of the traffic. Owing to the different statistical properties of the three types of frames composing the MPEG stream, the order for I frames is smaller than the order for P and B frames. This is because I frames are coded without any reference to other frames, while P and B frames present the highest rate fluctuation.
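To make the role of the model order concrete, the following minimal sketch shows how a tapped delay line turns a single frame-rate series into input/target pairs for a one-step predictor. The function name `tapped_delay_pairs` and the toy series are illustrative assumptions, not part of the paper.

```python
def tapped_delay_pairs(series, order):
    """Build (inputs, targets) for a one-step predictor of the given order.

    inputs[k] holds the `order` samples that precede targets[k],
    most recent first: [x(n-1), ..., x(n-order)].
    """
    if order < 1 or order >= len(series):
        raise ValueError("order must satisfy 1 <= order < len(series)")
    inputs, targets = [], []
    for n in range(order, len(series)):
        window = series[n - order:n]           # x(n-order), ..., x(n-1)
        inputs.append(list(reversed(window)))  # x(n-1), ..., x(n-order)
        targets.append(series[n])              # desired output x(n)
    return inputs, targets


# Toy series; with order 3 the first usable target is the fourth sample.
inputs, targets = tapped_delay_pairs([1, 2, 3, 4, 5, 6], order=3)
```

A larger order gives the predictor more history per sample but leaves fewer training pairs for a series of fixed length, which is one practical reason to keep the order small where the signal allows it.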

Based on the previous observations, the following relation is assumed to hold:

$$\hat{x}^c(n) = g^c\big(x^c(n-1), \ldots, x^c(n-p^c)\big) + e^c(n), \qquad (1)$$

where $\hat{x}^c(n)$ represents the estimated frame rate of the frame type $c \in \{I, P, B\}$ at the current time index $n$, and $x^c(n)$ the actual frame rate. Variable $p^c$ represents the order of the model. Function $g^c(\cdot)$ is the non-linear function that relates the previous traffic rates to the current one, and $e^c(n)$ is an i.i.d. stochastic process that represents the prediction error of the model. Function $g^c(\cdot)$ is approximated by the feedforward neural network structure using an appropriate training set. In the following, all the previous samples are collected in a vector $\mathbf{x}(n-1)$ of the form

$$\mathbf{x}(n-1) = [\,x(n-1) \;\cdots\; x(n-p) \;\; 1\,]^T. \qquad (2)$$

The unit element in the vector $\mathbf{x}(n-1)$ has been included to accommodate the bias of the neural network structure. For network training, a set of $K$ samples, $\{\mathbf{x}(n-1), x(n)\}_{n=p+1}^{K+p}$, has been used to approximate the unknown function $g^c(\cdot)$. The actual value $x(n)$ is included in this set as the desired response of the network output, and the previous $p^c$ samples of the signal, i.e., the vector $\mathbf{x}(n-1)$, are used as inputs. The network is trained to minimize the mean squared error over all samples in the training set. For network training, the Marquardt-Levenberg algorithm has been used [17]. This algorithm was chosen for its efficiency and fast convergence, since it is designed to approach second-order training speed without having to compute the Hessian matrix. Furthermore, a cross-validation method has been adopted to reduce the risk that the network memorizes the training samples without being able to generalize to new situations (poor generalization performance) [19].

A. Adaptive Neural Network Training

In on-line traffic prediction, the model accuracy is improved through a weight adaptation algorithm, which updates the model parameters over time. This is necessary because, in real-time MPEG video transmission, the statistics of the frame rates may change locally, degrading the network performance at those time instants. The proposed adaptation scheme modifies the network so that it responds appropriately to new data while causing minimal degradation of the old information. Training the network on new data only, without using the old information, would otherwise result in catastrophic forgetting of previous knowledge [20]. The method adopted in this paper for weight updating is similar to that proposed in [20]. In particular, the non-linear function of equation (1) is linearized using a first-order Taylor series expansion. The effect of the previous network knowledge is then expressed by the error sensitivity over all network weights.

Activation of the weight adaptation mechanism is based on the prediction error evaluated during real-time video transmission. More specifically, when the prediction error is greater than a pre-determined threshold, the weight adaptation mechanism is activated. Otherwise, the same neural network weights are used for predicting future samples. The threshold is selected using the relative prediction error on the test set, which is estimated during the network training phase. In our experiments, weight updating is activated if the prediction error exceeds 10% of the test set error.

B. Off-line Modeling

In off-line traffic modeling, the same neural network architecture is used. The only difference is that the network inputs are the estimated rather than the actual data, since the latter are not available. Thus equation (1) can be written as

$$\hat{x}^c(n) = g^c\big(\hat{x}^c(n-1), \ldots, \hat{x}^c(n-p^c)\big) + r^c(n). \qquad (3)$$

The error $r^c(n)$ is an i.i.d. process with the same statistics as the error $e^c(n)$ of (1), i.e., the same mean value and standard deviation. As can be seen from equation (3), the neural network operates in an autonomous mode or, equivalently, in closed-loop operation. To start the recursion of equation (3), only $p^c$ initial samples are required. One common choice is to select them randomly as a sequence of $p^c$ consecutive samples belonging to the training set. However, appropriate modeling of the error $r^c(n)$ is required so that the network model can accurately estimate the frame rates.
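The closed-loop recursion of equation (3) can be sketched as follows for a single frame type. This is only an illustration under stated assumptions: `toy_predictor` is a hypothetical stand-in for the trained network $g^c(\cdot)$, the seed samples and error parameters are made-up values, and the error $r^c(n)$ is drawn from a Gaussian with a chosen mean and standard deviation.

```python
import random


def generate_closed_loop(predictor, seed_samples, n_steps, err_mean, err_std, rng):
    """Run x_hat(n) = g(x_hat(n-1), ..., x_hat(n-p)) + r(n) for n_steps steps.

    predictor    : callable standing in for the trained network g (assumption)
    seed_samples : the p initial samples, oldest first
    rng          : random.Random instance used to draw the error r(n)
    """
    order = len(seed_samples)
    history = list(seed_samples)
    generated = []
    for _ in range(n_steps):
        # Last `order` values, most recent first, matching the arguments of g.
        past = history[-1:-order - 1:-1]
        value = predictor(past) + rng.gauss(err_mean, err_std)
        history.append(value)  # the output is fed back as a future input
        generated.append(value)
    return generated


def toy_predictor(past):
    # Hypothetical stand-in for the network: a stable pull toward the level 10.
    return 10.0 + 0.5 * (past[0] - 10.0) + 0.2 * (past[1] - 10.0)


rng = random.Random(0)
samples = generate_closed_loop(toy_predictor, [9.0, 11.0], 200, 0.0, 1.0, rng)
```

Because every generated value is fed back as an input, the injected error is the only source of variability in the output, which is why its statistics matter for how well the generated sequence matches the real traffic.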
Experiments on several VBR MPEG-2 coded video sequences show that a Gaussian distribution provides an accurate approximation of the error pdf. Similar results have been obtained for other VBR video streams coded with different compression schemes [3]. The aggregate sequence is then generated by deterministically interleaving I, P and B frames according to the GOP pattern. However, if uncorrelated errors $r^c(n)$ are used as inputs for generating the signals $\hat{x}^c(n)$ [equation (3)], the aggregate MPEG-2 sequence will contain uncorrelated I, P and B components, since the $\hat{x}^c(n)$ are generated independently. The real data, in contrast, contain correlated I, P and B components. Such independent generation of I, P and B frames results in a significant underestimation of the network resources, even though the models follow the statistics of each $c$-stream, since they cannot capture the burstiness of the actual video traffic [11]. One way to correlate the $\hat{x}^c(n)$ for different $c$ is to correlate their respective errors, since these follow the same type of pdf. In particular, a reference error is generated from a Gaussian distribution with zero mean and unit variance. The errors of the I, P and B frames are then generated with respect to this reference error. A simple approach is to take as the reference the error $\varepsilon^B(n)$ of the B frames, since B frames constitute the majority within a GOP.
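The text does not spell out how the errors of the individual frame types are derived from the reference error, so the following is only one plausible sketch and not the authors' formula. It assumes that each error is a weighted mix of the shared reference and an independent Gaussian term, with hypothetical correlation coefficients `rho` and standard deviations `sigma`, and that one error per frame type is drawn at each index.

```python
import math
import random


def correlated_errors(n_samples, sigma, rho, rng):
    """Draw Gaussian errors for I, P and B that share one reference error.

    sigma : per-type standard deviations (hypothetical values)
    rho   : per-type correlation with the reference error (hypothetical values)
    Each error is sigma * (rho * ref + sqrt(1 - rho^2) * own), so it remains
    zero-mean Gaussian with standard deviation sigma.
    """
    errors = {c: [] for c in sigma}
    for _ in range(n_samples):
        ref = rng.gauss(0.0, 1.0)  # zero-mean, unit-variance reference error
        for c in sigma:
            own = rng.gauss(0.0, 1.0)  # independent part for this frame type
            mix = rho[c] * ref + math.sqrt(1.0 - rho[c] ** 2) * own
            errors[c].append(sigma[c] * mix)
    return errors


def sample_correlation(a, b):
    """Pearson correlation coefficient of two equally long sequences."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


rng = random.Random(42)
sigma = {"I": 3.0, "P": 2.0, "B": 1.0}
rho = {"I": 0.8, "P": 0.8, "B": 1.0}  # rho = 1 makes B the reference stream
errors = correlated_errors(20000, sigma, rho, rng)
corr_ib = sample_correlation(errors["I"], errors["B"])
```

With `rho["B"]` set to 1 the B-frame error is a scaled copy of the reference, which mirrors the choice of the B-frame error as reference; the other two coefficients would have to be estimated from real data.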

IV. EXPERIMENTAL RESULTS

To evaluate the performance of the proposed neural network structure for both on-line and off-line traffic modeling, we use data from a real-life video source, coded by an MPEG-2 Optibase card. Figure 3 presents the actual and the predicted average rates over a GOP period for a window of 500 frames. The averaging is performed to indicate the model accuracy as far as the video activity is concerned. It can be seen that the proposed model yields very satisfactory results, since it predicts the frame rates with high accuracy in all cases.

Figure 3. Video activity prediction over a time window of 500 frames. (Curves: Actual Data and Predicted Data; horizontal axis: Frame Number, 2000 to 2500; vertical axis: Bit Rate (Mbits/s).)

For off-line traffic modeling, we examine the performance of the model as an estimator of the network resources. In this framework, a buffer configuration based on a First In First Out (FIFO) policy is examined, as in [3]. Figure 4 depicts the cell loss probability versus buffer size obtained from the proposed traffic model (dotted line), along with that obtained by varying the rate of the real video data by +/-1%, when 20 multiplexed video sources are fed into the buffer. Two different utilization degrees are examined in this figure (U=0.75 and U=0.8). As observed, the cell losses provided by the model lie within the +/-1% uncertainty of the real data.

Figure 4. Cell loss probability using real data and the proposed model. (a) Utilization 0.75, (b) utilization 0.8. (Curves: Data +/-1% and The Proposed Model; horizontal axis: Buffer Size [ms]; vertical axis: Cell Loss Probability, logarithmic scale.)

V. REFERENCES

[1] M. de Prycker, Asynchronous Transfer Mode. Solution for Broadband ISDN. Alcatel Bell, Antwerp Belgium, 1993.
[2] N. Ohta, Packet Video, Modeling and Signal Processing. Artech House Boston-London, 1994.
[3] B. Maglaris, D. Anastassiou, P. Sen, G. Karlsson and J. D. Robbins, "Performance Models of Statistical Multiplexing in Packet Video Communication," IEEE Trans. on Comm., vol. 36, pp. 834-843, 1988.
[4] M. R. Frater, J. F. Arnold, and P. Tan, "A New Statistical Model for Traffic Generated by VBR Coders for Television on the Broadband ISDN," IEEE Trans. on Circuits and Syst. for Video Technol., vol. 4, pp. 521-526, 1994.
[5] D. Heyman, A. Tabatabai and T. V. Lakshman, "Statistical Analysis and Simulation Study of Video Teleconference Traffic in ATM Networks," IEEE Trans. on Circuits and Syst. for Video Technol., vol. 2, pp. 49-59, 1992.
[6] F. Yegenoglu, B. Jabbari and Ya-Qin Zhang, "Motion-Classified Autoregressive Modeling of Variable Bit Rate Video," IEEE Trans. on Circuits and Syst. for Video Technol., vol. 3, pp. 42-53, 1993.
[7] C.J. Hughes, M. Ghanbari, D.E. Pearson, V. Sefridis and J. Xiong, "Modeling and Subjective Assessment of Cell Discard in ATM Networks," IEEE Trans. on Image Processing, vol. 2, pp. 212-222, April 1993.
[8] D. Heyman and T. B. Lakshman, "Source Models for VBR Broadcast Video Traffic," IEEE/ACM Trans. on Networking, vol. 4, pp. 40-48, 1996.
[9] O. Rose, "Statistical Properties of MPEG Video Traffic and their Impact on Traffic Modeling in ATM Systems," Proc. of the 20th Conf. on Local Computer Networks, Minneapolis, Oct. 16-19, 1995.
[10] M. R. Grasse and J. F. Frater, and Arnold, "Traffic Characteristics of MPEG-2 Variable Bit Rate Video," Australian Telecommunication Networks & Application Conference, Dec. 1994.
[11] A. D. Doulamis, N. D. Doulamis, G. E. Konstantoulakis and G. I. Stassinopoulos, "Traffic Characterisation and Modelling of VBR Coded MPEG Sources," IFIP ATM Netw., vol. 3, pp. 60-80, 1997.
[12] D. P. Heyman, A. Tabatabai, and T. V. Lakshman, "Statistical Analysis of MPEG-2 Coded VBR Video Traffic," Packet Video'94.
[13] A.M. Dawood and M. Ghanbari, "Content-Based MPEG Video Traffic Modeling," IEEE Trans. on Multim., vol. 1, pp. 77-87, 1999.
[14] N. Doulamis, A. Doulamis, G. Konstantoulakis and G. Stasinopoulos, "Efficient Modeling of VBR MPEG-1 Video Sources," IEEE Trans. on Circuits and Systems for Video Technology, vol. 10, no. 1, Feb. 2000.
[15] J. Chong, S.Q. Li and J. Ghosh, "Predictive Dynamic Bandwidth Allocation for Efficient Transport of Real Time VBR over ATM," IEEE Select. Areas in Comm., vol. 13, pp. 12-33, Jan. 1995.
[16] P.-R. Chang and J.-T. Hu, "Optimal Nonlinear Adaptive Prediction and Modeling of MPEG Video in ATM Networks Using Pipelined Recurrent Neural Networks," IEEE Journal of Selected Areas in Comm., vol. 15, no. 6, Aug. 1997.
[17] J. Connor, D. Martin and L. Altas, "Recurrent Neural Networks and Robust Time Series Prediction," IEEE Trans. on Neural Networks, vol. 5, no. 2, pp. 240-254.
[18] S. Kollias and D. Anastassiou, "An Adaptive Least Squares Algorithm for the Efficient Training of Artificial Neural Networks," IEEE Trans. on Circuits and Systems, vol. 36, pp. 1092-1101, 1989.
[19] S. Haykin, Neural Networks: A Comprehensive Foundation. New York: Macmillan, 1994.
[20] A. Doulamis, N. Doulamis and S. Kollias, "On Line Retrainable Neural Networks: Improving the Performance of Neural Networks in Image Analysis Problems," IEEE Trans. on Neural Networks, accepted for publication, to appear in early 2000.