
Short-Term (STLF) Stochastic Load Forecasting Using Autoregressive Integrated Moving Average (ARIMA) Models and Hidden Markov Model

Jeffrel P. Hermias

Kardi Teknomo

Jose Claro N. Monje

Electronics, Communications and Computer Engineering Department School of Science and Engineering Ateneo de Manila University Katipunan Ave., Loyola Heights, Quezon City, Philippines [email protected]

Information Systems & Computer Science Department School of Science and Engineering Ateneo de Manila University Katipunan Ave., Loyola Heights, Quezon City, Philippines [email protected]

Electronics, Communications and Computer Engineering Department School of Science and Engineering Ateneo de Manila University Katipunan Ave., Loyola Heights, Quezon City, Philippines [email protected]

Abstract—Load forecasting, particularly short-term load forecasting (STLF), plays a vital role in the economical operation and tracking of power systems. Many stochastic and artificial intelligence techniques have been used to produce accurate (low-error) short-term load forecasts. Here, we introduce a new approach to STLF using the conventional Hidden Markov Model (HMM) and compare it with Autoregressive Integrated Moving Average (ARIMA) models. Three-dimensional continuous multivariate Gaussian emission probabilities are used for the HMM, while different ARIMA parameters are used for each dataset. The forecasts of both models are then compared against the actual load values using MAPE and RMSE.

Keywords—Hidden Markov Model; HMM; Short-term load forecasting; STLF; ARIMA; Autoregressive Integrated Moving Average; Baum-Welch Algorithm; Machine learning; Artificial Intelligence.

I. INTRODUCTION

Short-term load forecasting (STLF) plays a major role in the day-to-day activities of a power system. To prevent utility difficulties, electric companies must have adequate power to sustain their customers' electricity needs. If load is not forecasted properly, power outages may arise, with repercussions such as appliance damage and customer dissatisfaction.

Over the years, various techniques and methods have been used in the hope of producing more accurate and efficient load forecasts. Statistical approaches such as Linear Regression models [1], Autoregressive and Moving Average (ARMA) models [2], and exponential smoothing models [3] have been used to yield better forecasts. Artificial intelligence and machine learning techniques have also emerged recently with the aim of high-precision STLF, such as Artificial Neural Network (ANN) [4], Fuzzy Logic [5], Support Vector Machine [6], and hybrid versions of the aforementioned techniques [7]-[8].

This study investigates a novel approach to short-term load forecasting using the conventional Hidden Markov Model (HMM) with 3-dimensional multivariate Gaussian emission probabilities, and at the same time uses Autoregressive Integrated Moving Average (ARIMA) models for model comparison. The Baum-Welch algorithm is used to train the HMM, the Viterbi algorithm is used to identify the most likely load forecast state, and a Bayesian update is then used to create the forecast. The convergence of Baum-Welch training is discussed by Rabiner in [9].

The rest of this paper is organized as follows. Section II presents the theoretical structure of the Hidden Markov Model (HMM), the HMM-based Short-Term Load Forecasting (STLF) model (the emission probabilities used, the reestimation of the HMM parameters, and the load forecasting calculation), and the ARIMA model used for comparison. Section III presents the results and discussion, and Section IV concludes.

II. STLF-BASED HIDDEN MARKOV MODEL (HMM) AND ARIMA MODEL

To the best of the researchers' knowledge, there are currently no available HMM-based short-term load forecasting models. This study therefore proposes a new approach to STLF by incorporating HMM theory, and at the same time uses an ARIMA model as a point of comparison for the forecasting output. The term emission probability is used interchangeably with observation probability to denote the same HMM parameter.

A. HMM Description

A Hidden Markov Model (HMM) is a doubly embedded stochastic process whose underlying stochastic process is not observable (it is hidden) and can only be observed through another set of stochastic processes that produce the sequence of observations [9], as shown in Fig. 1.

[Figure 1: a chain of hidden states S1, S2, …, ST, each emitting one of the observations O1, O2, …, OT.]

Fig. 1. A doubly embedded stochastic process: Hidden Markov Model.

A Hidden Markov Model comprises the following parameters:

1. A set of observations or outputs denoted as $O = \{O_1, O_2, \ldots, O_T\}$, where $O$ is a vector of all observations and $T$ is the total number of observations.
2. A set of possible hidden states denoted as $S = \{S_1, S_2, \ldots, S_N\}$, where $S$ is a vector of all possible hidden states and $N$ is the total number of states.
3. An initial state probability distribution $\pi = \{\pi_i\}$, where $\pi_i = P(q_1 = S_i)$ for $1 \le i \le N$, which sets the initial probability of state $S_i$ at $t = 1$.
4. The state transition probability distribution matrix $A = \{a_{ij}\}$, where $a_{ij} = P(q_{t+1} = S_j \mid q_t = S_i)$ for $1 \le i, j \le N$. This is the probability of transitioning from the current state $S_i$ to the next state $S_j$.
5. The observation symbol probability distribution in state $j$, $B = \{b_j(k)\}$, where $b_j(k) = P(v_k \text{ at } t \mid q_t = S_j)$ for $1 \le j \le N$ and $1 \le k \le M$, with $v_k$ the $k$-th of $M$ observation symbols. This is the probability distribution of the observation at a particular time $t$ with respect to its current state $S_j$.

Given the form of HMM described above, there are three basic problems of interest that must be solved for the model to be useful in real-world applications, according to Rabiner [9]:

Problem 1: Given the observation sequence $O = (O_1, O_2, \ldots, O_T)$ and a model $\lambda = (A, B, \pi)$, how do we efficiently compute $P(O \mid \lambda)$, the probability of the observation sequence given the model?

Problem 2: Given the observation sequence $O$ and a model $\lambda$, how do we choose a corresponding state sequence $Q = (q_1, q_2, \ldots, q_T)$ that is optimal in some meaningful sense (i.e., best "explains" the observations)?

Problem 3: How do we adjust the model parameters $\lambda = (A, B, \pi)$ to maximize $P(O \mid \lambda)$?

As cited in Rabiner's paper, these problems can be solved. Problem 1 can be solved using the Forward part of the Forward-Backward Algorithm, Problem 2 can be solved using the Viterbi Algorithm, and Problem 3 is solved by the Expectation-Maximization procedure known as the Baum-Welch Algorithm.

B. Forecasting the next observation through HMM

Using an HMM as a predictor is not straightforward. Given the observation sequence up to a certain time $T$, one can compute the forecasted hidden state using the Viterbi algorithm after training the model with the Baum-Welch algorithm. Using a Bayesian update after forecasting the next state, one can forecast the next observation as shown in (1), where $O_{1:T}$ denotes the observations up to time $T$.

$$P(O_{T+1} \mid O_{1:T}) = \sum_{j=1}^{N} P(O_{T+1}, q_{T+1} = S_j \mid O_{1:T}) \tag{1}$$

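To further simplify (1), one can streamline the equation to (2) using the Bayesian update in the context of an event C, $P(A, B \mid C) = P(A \mid B, C)\,P(B \mid C)$, for events A, B, and C, where $O_{1:T}$ denotes the observations up to time $T$:

$$P(O_{T+1} \mid O_{1:T}) = \sum_{j=1}^{N} P(O_{T+1} \mid q_{T+1} = S_j) \sum_{i=1}^{N} P(q_{T+1} = S_j \mid q_T = S_i)\, P(q_T = S_i \mid O_{1:T}) \tag{2}$$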
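The step from (1) to (2) can be spelled out term by term. The derivation below is a sketch that assumes the standard HMM conditional independence properties: the next observation depends only on the next state, and the next state depends only on the current state. The shorthand $O_{1:T}$ for the observations up to time $T$, and the use of $a_{ij}$ and $b_j(\cdot)$ for the transition and emission terms, follow the parameter definitions in Section II-A.

```latex
\begin{align*}
P(O_{T+1} \mid O_{1:T})
  &= \sum_{j=1}^{N} P(O_{T+1},\, q_{T+1}=S_j \mid O_{1:T}) \\
  &= \sum_{j=1}^{N} P(O_{T+1} \mid q_{T+1}=S_j,\, O_{1:T})\; P(q_{T+1}=S_j \mid O_{1:T}) \\
  &= \sum_{j=1}^{N} P(O_{T+1} \mid q_{T+1}=S_j)
     \sum_{i=1}^{N} P(q_{T+1}=S_j \mid q_T=S_i)\; P(q_T=S_i \mid O_{1:T}) \\
  &= \sum_{j=1}^{N} b_j(O_{T+1}) \sum_{i=1}^{N} a_{ij}\; P(q_T=S_i \mid O_{1:T})
\end{align*}
```

The second line is the Bayesian update with $A = O_{T+1}$, $B = \{q_{T+1}=S_j\}$, and $C = O_{1:T}$. The third line drops $O_{1:T}$ from the emission term and expands the next-state probability over the current state.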

Given the three problems of HMM and the daily energy demand observations, the researchers created an HMM-based STLF model. The theoretical and conceptual framework for an STLF-based HMM is shown in Fig. 2.

[Figure 2: block diagram with the blocks Daily Demand Observation, Multivariate Gaussian (3D) observation transformation, HMM-based STLF Model, Training Process, Viterbi Decoding, and Bayesian update Forecasting.]

Fig. 2. STLF-based Hidden Markov Model.

1. Daily Observation and Multivariate Gaussian Transformation

Observations were gathered on a daily basis for the year 2015 from the Philippine Wholesale Electricity Spot Market (WESM) for three main lines: the whole WESM System, Luzon, and Visayas. Electricity demand data were gathered as the main quantity of interest for forecasting. The electricity prices corresponding to each demand value were also considered in order to transform the observations into a 3-dimensional multivariate Gaussian probability density distribution, which turns this study's observations from discrete to continuous.

A general multivariate Gaussian distribution is expressed in (3), where $d$ is the dimension of the density function, $\mu$ is the mean vector, and $\Sigma$ is a symmetric covariance matrix. For this study, the multivariate Gaussian probability density function is three-dimensional, with the daily electricity demand and electricity price as dimensions. The three-dimensional probability density function plot with these dimensions is shown in Fig. 3.

$$\mathcal{N}(x \mid \mu, \Sigma) = \frac{1}{(2\pi)^{d/2}\,|\Sigma|^{1/2}} \exp\left(-\frac{1}{2}(x - \mu)^{\top} \Sigma^{-1} (x - \mu)\right) \tag{3}$$
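To make (3) concrete, the following is a minimal sketch of evaluating a multivariate Gaussian density in plain Python. It works for any dimension $d$ and uses a Cholesky factorization instead of an explicit matrix inverse. The function names and the demand and price numbers in the demo are hypothetical and are not taken from the WESM data.

```python
import math


def cholesky(cov):
    """Return lower-triangular L with L * L^T == cov (cov must be symmetric positive definite)."""
    d = len(cov)
    L = [[0.0] * d for _ in range(d)]
    for i in range(d):
        for j in range(i + 1):
            s = cov[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                if s <= 0.0:
                    raise ValueError("covariance matrix is not positive definite")
                L[i][j] = math.sqrt(s)
            else:
                L[i][j] = s / L[j][j]
    return L


def gaussian_logpdf(x, mean, cov):
    """Log of the multivariate Gaussian density in (3)."""
    d = len(x)
    L = cholesky(cov)
    diff = [xi - mi for xi, mi in zip(x, mean)]
    # Forward substitution solves L z = diff, so z.z equals the
    # quadratic form (x - mu)^T Sigma^{-1} (x - mu).
    z = []
    for i in range(d):
        s = diff[i] - sum(L[i][k] * z[k] for k in range(i))
        z.append(s / L[i][i])
    quad = sum(v * v for v in z)
    log_det = 2.0 * sum(math.log(L[i][i]) for i in range(d))
    return -0.5 * (d * math.log(2.0 * math.pi) + log_det + quad)


def gaussian_pdf(x, mean, cov):
    return math.exp(gaussian_logpdf(x, mean, cov))


if __name__ == "__main__":
    # Hypothetical (demand in MW, price) state mean and covariance.
    mean = [7500.0, 3.2]
    cov = [[250000.0, 150.0], [150.0, 0.8]]
    print(gaussian_pdf([7800.0, 3.5], mean, cov))
```

Working in log space, as `gaussian_logpdf` does, avoids underflow when densities of many observations are multiplied during HMM training.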

Fig. 3. Three-dimensional Gaussian probability density function of the electricity demand and price.

2. Training Process and HMM-based STLF model

Other HMM parameters were set in order to produce an STLF-based Hidden Markov Model, with the number of states set to 3 ($N = 3$) and the initial distribution chosen at random. Training is done through the Baum-Welch Algorithm and the Forward-Backward Algorithm, which answer Problems 3 and 1 respectively, with the observation being the 3-dimensional multivariate Gaussian having the electricity demand and electricity price as the $x$ and $y$ dimensions respectively.

3. Viterbi Decoding and Bayesian update Forecasting

To track the state at observation time $t$, one can use the answer to Problem 2 after training the model. With the estimated parameters $\bar{\lambda} = (\bar{A}, \bar{B}, \bar{\pi})$, the decoded sequence $\bar{Q}$ becomes the best state sequence for all observations. The Viterbi algorithm, which is based on dynamic programming, is used for the decoding process to track this best state sequence. Once the best state sequence is found, the next step is to forecast the next possible state in order to narrow down the possibilities for load forecasting. That is, it is essential to solve for the probability of the next state given all the current observations, denoted as $P(q_{T+1} \mid O_{1:T})$. This can be found by simplifying (1) using Bayes' rule. Then, to forecast the next observation, a Bayesian update like the one in (2) is used to identify the most probable observation under the most probable state found by the Viterbi dynamic programming algorithm.

4. ARIMA Models for STLF

Autoregressive Integrated Moving Average (ARIMA) models provide another approach to time series forecasting. They are among the most widely used approaches to time-series forecasting and provide complementary approaches to a particular problem. This statistical method aims to describe the autocorrelations in the data [10]. It is expressed mathematically in (4), a discrete-time linear equation with noise $\varepsilon_t$, where $L$ is the lag operator and $X_t$ is the series.

$$\left(1 - \sum_{i=1}^{p} \phi_i L^i\right)(1 - L)^d X_t = \left(1 + \sum_{i=1}^{q} \theta_i L^i\right)\varepsilon_t \tag{4}$$

Note that an ARIMA model is composed of the Autoregressive (AR) part, the Integration (I) measure, and the Moving Average (MA) part, with orders $p$, $d$, $q$ for the trend component and orders $P$, $D$, $Q$ for the seasonality component. The ARIMA model-based STLF is chosen as the model to compare against the HMM-based STLF. Given its usefulness in time-series research, this technique serves as the benchmark for assessing and evaluating the HMM-based STLF. Different ARIMA models with corresponding parameters are chosen for each dataset: WESM System, Luzon, and Visayas.

III. RESULTS AND DISCUSSION

1. Training Process and HMM-based STLF model

As mentioned in Section II, the training process is carried out to arrive at the estimated parameters of the Hidden Markov Model using the three datasets. The full training algorithm used in the HMM-based STLF is shown in Table I.

TABLE I. THE HMM-BASED STLF MODEL TRAINING ALGORITHM

The training algorithm of the HMM-based Short-Term Load Forecasting (STLF) Model
1: Get the value of load data (observation)
2: Transform discrete observation to 3-dimensional Gaussian probability density function
3: Set the initial values, the model $\lambda = (A, B, \pi)$, and the scale factor
While the continuation conditions hold:
4: Initialize $\alpha_1(i)$
5: Scale the initialized value $\alpha_1(i)$
6: Forward induction: for each $t$, compute and store $\alpha_t(i)$
7: Scale the stored value $\alpha_t(i)$
8: Initialize $\beta_T(i)$ at $t = T$
9: Backward induction: for each $t$, compute $\beta_t(i)$
10: Scale the stored value $\beta_t(i)$
11: Gamma-Xi calculation procedure: for each $t$, compute the denominator value of $\gamma_t$ and $\xi_t$, then compute $\gamma_t(i)$ and $\xi_t(i, j)$ for all $i$, $j$
12: Calculate the log probability value $\log P(O \mid \lambda)$
13: Baum-Welch (reestimation) procedure: compute $\bar{\pi}_i$ for all values of $i$; compute $\bar{a}_{ij}$ for all values of $i$, $j$; compute $\bar{b}_j(k)$ for all values of $j$, $k$
14: Set the reestimated model as the current model
end While
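Steps 4 to 7 and 12 of Table I (the scaled forward pass and the log probability) can be sketched as follows. This is a minimal illustration and not the authors' implementation: the function name is hypothetical, and the emission likelihoods `emis[t][j]` are assumed to have been evaluated beforehand, for example from the Gaussian density in (3). Scaling each forward vector to sum to one keeps the recursion from underflowing on long sequences, and the log probability is recovered as the sum of the logs of the scale factors.

```python
import math


def scaled_forward(pi, A, emis):
    """Scaled forward pass of an HMM.

    pi   : initial state probabilities, length N
    A    : N x N transition matrix, A[i][j] = P(next state j | current state i)
    emis : T x N emission likelihoods, emis[t][j] = b_j(O_t)

    Returns (alpha_hat, log_likelihood), where alpha_hat[t][j] is the scaled
    forward variable, i.e. P(q_t = S_j | O_1..O_t).
    """
    n_states = len(pi)
    alpha_hat = []
    log_likelihood = 0.0
    prev = None
    for t, b in enumerate(emis):
        if t == 0:
            # Initialization: alpha_1(j) = pi_j * b_j(O_1)
            raw = [pi[j] * b[j] for j in range(n_states)]
        else:
            # Induction: alpha_t(j) = (sum_i alpha_{t-1}(i) * a_ij) * b_j(O_t)
            raw = [
                sum(prev[i] * A[i][j] for i in range(n_states)) * b[j]
                for j in range(n_states)
            ]
        scale = sum(raw)
        if scale == 0.0:
            raise ValueError("observation %d has zero likelihood under the model" % t)
        prev = [v / scale for v in raw]
        alpha_hat.append(prev)
        log_likelihood += math.log(scale)
    return alpha_hat, log_likelihood
```

The last row of `alpha_hat` is the posterior over the current state given all observations so far, which is the quantity the prediction algorithm starts from.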

On the other hand, as also mentioned in Section II, forecasting the next observation is not straightforward. Table II shows the full prediction algorithm using the HMM.

TABLE II. THE HMM-BASED STLF MODEL PREDICTION ALGORITHM

The prediction algorithm of the HMM-based Short-Term Load Forecasting (STLF) Model
1: Get the posterior probability $P(q_T \mid O_{1:T})$
2: Acquire the most probable state prediction of $q_{T+1}$ using the Viterbi algorithm
3: Locate the most probable emission $b_j(O_{T+1})$ using the most probable state prediction and the estimated value $\bar{B}$
4: Perform marginalization (over all state possibilities) of the product of the posterior probability $P(q_T \mid O_{1:T})$, the state prediction, and the estimated $\bar{A}$ and $\bar{B}$ to obtain the forecasted value

As for the data, and as mentioned in Section II, three datasets were used in this study, with all demand values coming from the Philippine Wholesale Electricity Spot Market (WESM). The first dataset is the daily demand (in MW) of the WESM System, shown in Fig. 4. The second dataset, shown in Fig. 5, is the daily demand (in MW) of Luzon, the biggest island group in the Philippines. The third dataset, shown in Fig. 6, is the daily demand (in MW) of Visayas, the second biggest island group in the Philippines.
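The prediction steps in Table II can be sketched in a few lines. This sketch makes two assumptions that are not stated in the paper: the forecasted value is taken as the expected value of the predictive mixture, and each state's emission is summarized by its Gaussian mean vector. The function names and the numbers in the usage note are hypothetical.

```python
def predict_next_state(posterior, A):
    """P(q_{T+1} = S_j | O_1..O_T) = sum_i a_ij * P(q_T = S_i | O_1..O_T)."""
    n_states = len(posterior)
    return [
        sum(posterior[i] * A[i][j] for i in range(n_states))
        for j in range(n_states)
    ]


def forecast_next_observation(posterior, A, state_means):
    """One-step-ahead forecast from a trained HMM.

    posterior   : P(q_T = S_i | O_1..O_T), length N
    A           : N x N estimated transition matrix
    state_means : N mean vectors, one per state (assumed emission summary)

    Returns (next_state_probs, most_probable_state, forecast), where forecast
    marginalizes the state means over the predicted next-state distribution.
    """
    next_state_probs = predict_next_state(posterior, A)
    n_states = len(state_means)
    dim = len(state_means[0])
    forecast = [
        sum(next_state_probs[j] * state_means[j][k] for j in range(n_states))
        for k in range(dim)
    ]
    most_probable_state = max(range(n_states), key=lambda j: next_state_probs[j])
    return next_state_probs, most_probable_state, forecast
```

For example, with three states whose demand means are 100, 200, and 300 MW, a posterior concentrated on the first state, and a first transition row of (0.2, 0.8, 0.0), the predicted next-state distribution is (0.2, 0.8, 0.0), the most probable next state is the second one, and the marginalized demand forecast is 180 MW.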

Fig. 6. Visayas Demand Data for the year 2015.

The validation process performed for the HMM-based STLF is 10-fold cross validation. For $k$-fold cross validation on time series data, forward chaining is used so that the data still follow the time-sequential order. The number of states is set to $N = 3$ for all runs, and the algorithms in Tables I and II are followed to produce the forecast output. Fig. 7, Fig. 8, and Fig. 9 show the actual demand (blue) and forecasted demand (red) using 10-fold forward-chaining cross validation for the WESM System, Luzon, and Visayas respectively. The error plot for the WESM System is shown in Fig. 10, with a mean error (over all actual vs. forecasted values) of 5.92%. Fig. 11 shows the error plot for Luzon, with a mean error (over all actual vs. forecasted values) of 7.33%.
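Forward chaining can be illustrated with a small index generator. The paper does not state how the folds were cut, so the scheme below is an assumption: the series is divided into $k + 1$ contiguous blocks, and fold $i$ trains on the first $i$ blocks and tests on the block that immediately follows. The function name is hypothetical.

```python
def forward_chaining_splits(n_samples, n_folds):
    """Yield (train_indices, test_indices) pairs in time order.

    Every training set is a prefix of the series and every test block starts
    right after it, so no future value is ever used for training.
    """
    if n_folds < 1 or n_samples < n_folds + 1:
        raise ValueError("need at least n_folds + 1 samples")
    block = n_samples // (n_folds + 1)
    for fold in range(1, n_folds + 1):
        train_end = fold * block
        # The last fold absorbs any leftover samples at the end of the series.
        test_end = n_samples if fold == n_folds else train_end + block
        yield list(range(train_end)), list(range(train_end, test_end))


if __name__ == "__main__":
    # One year of daily observations, 10 folds.
    for train, test in forward_chaining_splits(365, 10):
        print(len(train), test[0], test[-1])
```

Unlike ordinary shuffled $k$-fold cross validation, the training window only grows forward in time, which matches the requirement that the data follow the time-sequential order.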

Fig. 4. WESM System Demand Data for the year 2015.

Fig. 7. WESM System (Dataset #1) Actual Demand (blue) vs. Forecast Demand (red) for the year 2015.

Fig. 5. Luzon Demand Data for the year 2015.

Fig. 8. Luzon (Dataset #2) Actual Demand (blue) vs. Forecast Demand (red) for the year 2015.

To standardize the reported results across the cross-validation locations (folds), (5) and (6) can be used to compute the MAPE and RMSE, respectively, for $k$-fold cross validation.

(5)

(6)

Table III shows the error values (MAPE, MAE, RMSE, and MSE) of the three datasets for the HMM-based STLF.

Fig. 9. Visayas (Dataset #3) Actual Demand (blue) vs. Forecast Demand (red) for the year 2015.

TABLE III. THE HMM-BASED STLF MODEL ERROR TABLE

Dataset        MAPE     MAE    RMSE    MSE
WESM System    5.92%    454    579     342941
Luzon          7.36%    493    575     356834
Visayas        7.88%    94     112     13005
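The error measures reported in the table can be computed as sketched below. The per-series MAPE and RMSE follow their standard definitions. Averaging the per-fold scores to get a single $k$-fold figure is an assumption made for this sketch, since the exact forms of (5) and (6) are not reproduced here. The function names are hypothetical.

```python
import math


def mape(actual, forecast):
    """Mean absolute percentage error, in percent (actual values must be nonzero)."""
    return 100.0 * sum(abs((a - f) / a) for a, f in zip(actual, forecast)) / len(actual)


def rmse(actual, forecast):
    """Root mean squared error, in the units of the data (MW for demand)."""
    return math.sqrt(sum((a - f) ** 2 for a, f in zip(actual, forecast)) / len(actual))


def fold_average(metric, folds):
    """Average a metric over folds; each fold is an (actual, forecast) pair.

    Assumption: the k-fold score is the plain mean of the per-fold scores.
    """
    scores = [metric(actual, forecast) for actual, forecast in folds]
    return sum(scores) / len(scores)
```

Because MAPE is relative and RMSE is absolute, a dataset with a smaller demand level, such as Visayas, can show a much smaller RMSE than the WESM System while having a comparable or larger MAPE.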

2. ARIMA-based STLF

Fig. 10. Error plot for WESM System with mean error of 5.92%.

The first step in ARIMA modeling is the identification stage, which determines whether the data are stationary. Statistical stationarity must be established before proceeding with ARIMA estimation and forecasting. One way to check stationarity is to use a unit root test such as the Augmented Dickey-Fuller (ADF) test, which indicates the order of differencing at which the data become stationary. Table IV summarizes the ADF test p-values of the three electricity demand datasets. First-order differencing makes each dataset stationary, so this order is used as the parameter for the Integration (I) part of ARIMA modeling.

TABLE IV. AUGMENTED DICKEY-FULLER TEST FOR 3 DATASETS

Electricity Demand Dataset    ADF test, original (no differencing)    ADF test, 1st order differencing
WESM System                   0.4838                                  0.01
Luzon                         0.4525                                  0.01
Visayas                       0.6618                                  0.01
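The differencing operation behind the Integration (I) part can be sketched directly. The helper below is a hypothetical illustration of the $(1 - L)^d$ operator from (4). It does not implement the ADF test itself, which in practice would come from a statistics package.

```python
def difference(series, order=1):
    """Apply (1 - L)^order to a series: each pass replaces x_t with x_t - x_{t-1}.

    Each pass shortens the series by one element.
    """
    out = list(series)
    for _ in range(order):
        out = [later - earlier for earlier, later in zip(out, out[1:])]
    return out


if __name__ == "__main__":
    # A series with a steady upward trend is not stationary in level,
    # but its first difference is constant.
    trending = [10, 13, 16, 19, 22]
    print(difference(trending))
```

A linear trend disappears after one pass and a quadratic trend after two, which is why the order of differencing at which the unit root test rejects is used as the integration order.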

Fig. 11. Error plot for Luzon with mean error of 7.33%.

To find the values of the other parameters, $p$ and $q$ (for the trend) and $P$ and $Q$ (for the seasonality), one can use the autocorrelation function (ACF) and the partial autocorrelation function (PACF) to estimate the MA and the AR parts of the ARIMA model respectively.

Fig. 12. Error plot for Visayas with mean error of 7.81%.
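As an illustration of the ACF used in this identification step, the sample autocorrelation can be computed as follows. This is a minimal sketch with a hypothetical function name. It uses the common biased estimator that divides every lag by the lag-0 sum of squares.

```python
def acf(series, max_lag):
    """Sample autocorrelation for lags 0..max_lag.

    r_k = sum_t (x_t - mean)(x_{t+k} - mean) / sum_t (x_t - mean)^2
    """
    n = len(series)
    if max_lag >= n:
        raise ValueError("max_lag must be smaller than the series length")
    mean = sum(series) / n
    dev = [x - mean for x in series]
    c0 = sum(d * d for d in dev)
    if c0 == 0.0:
        raise ValueError("series is constant, autocorrelation is undefined")
    return [
        sum(dev[t] * dev[t + k] for t in range(n - k)) / c0
        for k in range(max_lag + 1)
    ]
```

For daily demand data, a pronounced spike at a fixed lag in the ACF of the differenced series would suggest a seasonal component at that period, while the lags at which the ACF cuts off guide the choice of the MA order.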

On the other hand, Fig. 12 shows the error plot for Visayas, with a mean error (over all actual vs. forecasted values) of 7.81%. The aforementioned error values are mean errors over all forecasted values versus the actual demand, regardless of the cross-validation fold in which each value falls.

Table V shows the parameter estimation for the three datasets with their corresponding AIC (Akaike Information Criterion) and BIC (Bayesian Information Criterion) values. For each dataset, the parameters with the lowest AIC value are chosen to represent that dataset for ARIMA modeling and forecasting.
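The selection rule can be sketched as a one-line minimum over the candidates. The AIC values below are the WESM System entries from Table V. Reading each model code as one digit per order, in the sequence (p, d, q, P, D, Q), is an assumption of this sketch, and the function name is hypothetical.

```python
def select_by_aic(candidates):
    """Return the (orders, aic) pair with the lowest AIC."""
    return min(candidates, key=lambda item: item[1])


# WESM System candidates: ((p, d, q, P, D, Q), AIC)
wesm_candidates = [
    ((6, 1, 5, 0, 1, 2), 11.823),
    ((7, 1, 5, 1, 1, 1), 11.856),
    ((6, 1, 5, 1, 1, 2), 11.857),
    ((6, 1, 6, 1, 1, 1), 11.859),
    ((5, 1, 7, 0, 1, 2), 11.861),
]

best_orders, best_aic = select_by_aic(wesm_candidates)
```

AIC trades goodness of fit against the number of estimated parameters, so the lowest value favors the model that fits well without unnecessary orders.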

TABLE V. ARIMA MODELING PARAMETER ESTIMATION

Parameter Estimation for ARIMA-based STLF

Dataset        Model (p, d, q, P, D, Q)    Sigma squared    AIC       BIC
WESM System    (6, 1, 5, 0, 1, 2)          47005            11.823    10.962
WESM System    (7, 1, 5, 1, 1, 1)          46665            11.856    11.046
WESM System    (6, 1, 5, 1, 1, 2)          46715            11.857    11.047
WESM System    (6, 1, 6, 1, 1, 1)          46793            11.859    11.048
WESM System    (5, 1, 7, 0, 1, 2)          46856            11.861    11.050
Luzon          (5, 1, 7, 1, 1, 1)          39641            11.693    10.882
Luzon          (6, 1, 7, 1, 1, 1)          39543            11.698    10.901
Luzon          (5, 1, 7, 2, 1, 1)          39618            11.700    10.903
Luzon          (4, 1, 3, 0, 1, 1)          41869            11.703    10.811
Luzon          (2, 1, 1, 0, 1, 1)          40045            11.703    10.757
Visayas        (2, 1, 5, 1, 1, 2)          1135             8.110     7.245
Visayas        (6, 1, 7, 0, 1, 1)          1104             8.112     7.301
Visayas        (2, 1, 1, 1, 1, 2)          1174             8.114     7.195
Visayas        (2, 1, 6, 0, 1, 1)          1149             8.114     7.236
Visayas        (2, 1, 5, 0, 1, 1)          1157             8.114     7.222

The 10-fold forward-chaining cross validation used for the HMM is also the validation technique used with ARIMA modeling. Fig. 13, Fig. 14, and Fig. 15 show a particular fold of the forward-chaining cross validation in ARIMA modeling for the WESM System, Luzon, and Visayas respectively, with the red lines being the forecasted values and the blue lines the frequency (forecast interval) values.

Fig. 14. The 3rd fold forward-chaining cross validation for Luzon.

Fig. 15. The 7th fold forward-chaining cross validation for Visayas.

Numerical error results (MAPE, MAE, RMSE, and MSE) of the three datasets for the entire ARIMA modeling are shown in Table VI.

TABLE VI. THE ARIMA-BASED STLF MODEL ERROR TABLE

Dataset        MAPE     MAE    RMSE    MSE
WESM System    7.29%    571    661     535719
Luzon          7.81%    528    607     442368
Visayas        5.16%    61     71      5737

Comparing Tables III and VI, the HMM-based STLF model performs better for the WESM System and Luzon datasets, while the ARIMA-based STLF model performs better than the HMM for the Visayas dataset.

IV. CONCLUSION

Fig. 13. The 10th fold forward-chaining cross validation for WESM System.

This study shows that forecasting with an HMM, evaluated in comparison with ARIMA models, can work for Short-Term Load Forecasting with acceptable error values. The paper proposes it as another avenue in STLF research for improving accuracy, particularly in next-day-ahead load forecasting. The empirical results achieved in this study show that the model could provide a viable and practical forecasting technique that supports the advancement of Short-Term Load Forecasting research.

REFERENCES

[1] T. Hong, M. Gui, M. E. Baran and H. L. Willis, "Modeling and forecasting hourly electric load by multiple linear regression with interactions," IEEE PES General Meeting, Minneapolis, MN, 2010, pp. 1-8.
[2] Shyh-Jier Huang and Kuang-Rong Shih, "Short-term load forecasting via ARMA model identification including non-Gaussian process considerations," in IEEE Transactions on Power Systems, vol. 18, no. 2, pp. 673-679, May 2003.
[3] P. Medina Macaira, R. Castro Sousa and F. L. Cyrino Oliveira, "Forecasting Brazil's electricity consumption with Pegels Exponential Smoothing Techniques," in IEEE Latin America Transactions, vol. 14, no. 3, pp. 1252-1258, March 2016.
[4] P. Ray, D. P. Mishra and R. K. Lenka, "Short term load forecasting by artificial neural network," 2016 International Conference on Next Generation Intelligent Systems (ICNGIS), Kottayam, 2016, pp. 1-6.
[5] M. F. I. Khamis, Z. Baharudin, N. H. Hamid, M. F. Abdullah and F. T. Nordin, "Short term load forecasting for small scale power system using fuzzy logic," 2011 Fourth International Conference on Modeling, Simulation and Applied Optimization, Kuala Lumpur, 2011.
[6] Ning Ye, Yong Liu and Yong Wang, "Short-term power load forecasting based on SVM," World Automation Congress 2012, Puerto Vallarta, Mexico, 2012, pp. 47-51.
[7] B. ul Islam, Z. Baharudin, P. Nallagownden and M. Q. Raza, "A hybrid neuro-genetic approach for STLF: A comparative analysis of model parameter variations," 2014 IEEE 8th International Power Engineering and Optimization Conference (PEOCO2014), Langkawi, 2014, pp. 526-531.
[8] P. Ray, S. Sen and A. K. Barisal, "Hybrid methodology for short-term load forecasting," 2014 IEEE International Conference on Power Electronics, Drives and Energy Systems (PEDES), Mumbai, 2014, pp. 1-6.
[9] L. R. Rabiner, "A tutorial on hidden Markov models and selected applications in speech recognition," in Proceedings of the IEEE, vol. 77, no. 2, pp. 257-286, Feb 1989.
[10] R. Hyndman and G. Athanasopoulos. Forecasting: principles and practice. OTexts. 2017.
[11] Ethem Alpaydin. Introduction to Machine Learning. The MIT Press, Cambridge, Massachusetts, London, England, pp. 306-307, 309.
[12] G. Chicco, R. Napoli, and F. Piglione, "Comparisons among clustering techniques for electricity customer classification," IEEE Trans. Power Syst., vol. 21, no. 2, pp. 933-940, May 2006.
[13] T. Zhang, G. Zhang, J. Lu, X. Feng, and W. Yang, "A new index and classification approach for load pattern analysis of large electricity customers," IEEE Trans. Power Syst., vol. 27, no. 1, pp. 153-160, Feb. 2012.
[14] G. Tsekouras, N. Hatziargyriou, and E. Dialynas, "Two-stage pattern recognition of load curves for classification of electricity customers," IEEE Trans. Power Syst., vol. 22, no. 3, pp. 1120-1128, Aug. 2007.
[15] V. Figueiredo, F. Rodrigues, Z. Vale, and J. Gouveia, "An electric energy consumer characterization framework based on data mining techniques," IEEE Trans. Power Syst., vol. 20, no. 2, pp. 596-602, May 2005.
[16] M. Espinoza, C. Joye, R. Belmans, and B. De Moor, "Short-term load forecasting, profile identification, and customer segmentation: A methodology based on periodic time series," IEEE Trans. Power Syst., vol. 20, no. 3, pp. 1622-1630, Aug. 2005.
[17] S. Verdu, M. Garcia, C. Senabre, A. Marin, and F. Franco, "Classification, filtering, and identification of electrical customer load patterns through the use of self-organizing maps," IEEE Trans. Power Syst., vol. 21, no. 4, pp. 1672-1682, Nov. 2006.
[18] S. Valero, M. Ortiz, C. Senabre, C. Alvarez, F. Franco, and A. Gabaldon, "Methods for customer and demand response policies selection in new electricity markets," IET Gener. Transm. Distrib., vol. 1, no. 1, pp. 104-110, 2007.
[19] G. Coke and M. Tsao, "Random effects mixture models for clustering electrical load series," J. Time Series Analysis, vol. 31, no. 6, pp. 451-464, 2010.
[20] G. W. Labeeuw and G. Deconinck, "Residential Electrical Load Model Based on Mixture Model Clustering and Markov Models," in IEEE Transactions on Industrial Informatics, vol. 9, no. 3, pp. 1561-1569, Aug. 2013.
[21] F. Flandoli, "ARIMA Models," Universita Di Pisa, Department of Mathematics. 2015.
[22] SAS Institute Inc. 2014. SAS/ETS® 13.2 User's Guide. Cary, NC: SAS Institute Inc.