sustainability Article

Modeling and Forecasting Passenger Car Ownership Based on Symbolic Regression

Lian Lian *, Wen Tian, Hongfeng Xu and Menglan Zheng

School of Transportation and Logistics, Dalian University of Technology, Dalian 116024, China; [email protected] (W.T.); [email protected] (H.X.); [email protected] (M.Z.)
* Correspondence: [email protected]

Received: 13 May 2018; Accepted: 26 June 2018; Published: 2 July 2018

Abstract: Numerous predetermined functions, especially the Gompertz function, have been used to analyze the growth in vehicle ownership. This study uses data-driven symbolic regression to automatically find a generalized function, named the new equation by symbolic regression (NE-SR), for passenger car ownership in six representative countries: Japan, England, USA, Finland, Poland and Australia. The proposed function is then applied to forecast passenger car ownership in China up to the year 2060. The experimental results indicate that the NE-SR, as an extension of the Gompertz function, fits car ownership growth better than the classical Gompertz function. In the NE-SR function, three scenarios arise from the variation of parameter signs, represented by the patterns of Japan, USA and Australia, respectively. The predictions based on the NE-SR also show that Chinese car ownership still has the potential to increase after 2060 under the patterns of Japan and Australia, but grows only until around 2057 under the pattern of USA. The results can be used to further predict the energy demand and carbon emissions of passenger cars, which can provide a basis for policymakers to propose transportation and environmental strategies.

Keywords: vehicle ownership; Gompertz function; per capita GDP; symbolic regression

Sustainability 2018, 10, 2275; doi:10.3390/su10072275; www.mdpi.com/journal/sustainability

1. Introduction

The worldwide increase in urban mobility since the 1960s has directly resulted in a growing number of motor vehicles, especially in many low-income populous countries, such as China and India [1]. The tremendous growth of vehicle ownership has caused a series of problems, e.g., increased oil consumption, air pollution emissions, severe traffic congestion and a lack of parking space. Moreover, car ownership is an important variable in car travel behavior research [2]. Therefore, it is important for academic researchers, environmentalists and policymakers to accurately forecast the development trend of vehicle ownership.

Vehicle ownership modeling has been widely researched. The models developed during 1995–2002 were reviewed and classified into nine categories in [3], which can be further divided into aggregate and disaggregate models according to data type. In the aggregate models, the ownership level of various vehicles, e.g., cars [4] and hybrid electric vehicles [5], can be analyzed on the basis of the product life cycle and diffusion models, which contain several sigmoid-shaped functions, e.g., the logistic, the Richards and the Gompertz function [6,7]. Furthermore, the Gompertz function has been found to best fit the historical vehicle ownership data among these three functions [8], and a variety of studies have examined environment and transportation policy by assuming that vehicle ownership grows as a Gompertz function. For example, future vehicle energy demand and greenhouse gas (GHG) emissions were estimated on the basis of the Gompertz function [9,10]. The effects of the two license quota policies on car ownership levels were compared, and the delays in the process of personal motorization in Shanghai and Beijing were examined in [11]. However, is there any other function that fits better than the Gompertz function in describing the relationship between the economic factor and car ownership? This is an interesting problem worth in-depth investigation. In fact, several improved Gompertz functions have already been proposed for better forecasting of vehicle ownership growth [12–14], where the corresponding parameters were estimated by statistics-based regression methods.

The traditional statistics-based regression methods need to assume a predetermined form of function according to experience and knowledge. The parameters of the proposed function are then estimated by the non-linear least squares method, the maximum likelihood method, etc. These methods generally have solid and widely accepted mathematical foundations and can provide more insight into the relationship among variables. However, experience or knowledge is sometimes limited in a given research field, so it is difficult for traditional regression to determine the most appropriate function model for a given data set. Moreover, the mathematical functions are built on strong assumptions which are sometimes not practically relevant to the real world.

Unlike the traditional regression methods, symbolic regression (SR) can automatically establish a suitable model of a numeric data set without assuming the form of the function. It is a data-driven method based on extended genetic programming (GP), proposed by Cramer in 1985 [15] and developed by Koza [16]. It has been successfully used to reveal hidden relationships in many fields. For instance, SR was demonstrated on four simulated and two real systems spanning mechanics, ecology and systems biology [17]. Motion-tracking data from various physical systems was searched, and Hamiltonians, Lagrangians, and other laws of geometric and momentum conservation were discovered [18]. Hubbert theory in oil production was modeled as a Gaussian distribution [19].
An accurate traffic speed prediction model was built to generate significant information for travellers [17]. To the best of our knowledge, the work in [17] is the first to use this method in the field of transportation.

The Chinese automotive market has grown greatly over the past two and a half decades, and the number of vehicles is expected to increase dramatically further. The medium- and long-term development plan of the automobile industry, issued by the Ministry of Industry and Information Technology of the People's Republic of China in 2017, forecasted that auto production will reach 30 million in 2020 and 35 million in 2050. Therefore, it is necessary to analyze and forecast passenger car ownership in China. Unlike previous studies, which assume that the relationship between vehicle ownership and economic factors is an S-shaped function, this study automatically establishes the relation between passenger car ownership and the gross domestic product (GDP) per capita by the data-driven method, SR. The newfound relation includes the Gompertz function as a special case and fits better than the traditional Gompertz function in the six selected countries, whose automotive industries have entered the saturation period.

The remainder of this paper is organized as follows. The SR method and the traditional Gompertz function are briefly introduced in Section 2. The data sources are then presented in Section 3. Section 4 examines our approach on synthetic data, proposes a novel vehicle ownership function for six representative countries and then applies the proposed function to predict and analyze car ownership in China. Conclusions are finally drawn in Section 5.

2. Methodology

2.1. Symbolic Regression

The procedure of SR via GP mainly includes four steps, and the pseudo code is described in Algorithm 1.

(1) Step 1: Population initialization.

The typical representation of an individual in SR is a parse tree, which generally has two types of nodes: leaf nodes and internal nodes. The leaf nodes consist of the terminal symbols, such as decision variables, constants or other problem parameters, and the internal nodes represent arithmetic functions, e.g., +, −, ×, ÷, e, ln, etc. Figure 1 shows an example of the tree structure of an SR individual for the equation 1 + (x × y). Once the Function Set (FS) for internal nodes, the Symbol Set (SS) for leaf nodes, the Maximum Depth of Tree (DT) and the Population size M are determined, an initial population of M trees is randomly generated with FS, SS and DT.

Figure 1. An example of the tree structure in a genetic programming (GP) individual: the root node + has the children 1 and ×, and the node × has the children x and y.
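The parse-tree representation and the random initialization of Step 1 can be illustrated with a minimal Python sketch. It is not the authors' implementation: the nested-tuple encoding, the helper names (`evaluate`, `depth`, `random_tree`, `init_population`), the protected division, the 0.3 probability of stopping early at a leaf, and the concrete contents of FS and SS are all assumptions made for illustration.

```python
import random

# Function Set (FS): internal-node symbols mapped to (arity, implementation).
# Protected division (returning 1.0 on a zero divisor) is an assumption of
# this sketch, not something stated in the paper.
FS = {
    "+": (2, lambda a, b: a + b),
    "-": (2, lambda a, b: a - b),
    "*": (2, lambda a, b: a * b),
    "/": (2, lambda a, b: a / b if b != 0 else 1.0),
}

# Symbol Set (SS): leaf-node symbols (decision variables and a constant).
SS = ["x", "y", 1.0]


def evaluate(tree, env):
    """Recursively evaluate a parse tree for the variable values in env."""
    if isinstance(tree, tuple):      # internal node: (op, child, child)
        op, *children = tree
        _, fn = FS[op]
        return fn(*(evaluate(child, env) for child in children))
    if isinstance(tree, str):        # leaf node holding a variable
        return env[tree]
    return tree                      # leaf node holding a constant


def depth(tree):
    """Depth of a tree, counting a single leaf as depth 0."""
    if isinstance(tree, tuple):
        return 1 + max(depth(child) for child in tree[1:])
    return 0


def random_tree(max_depth, rng):
    """Grow a random tree whose depth never exceeds max_depth (DT)."""
    # Stop at a leaf when the depth budget is used up, or early with
    # probability 0.3 (an arbitrary choice for this sketch).
    if max_depth == 0 or rng.random() < 0.3:
        return rng.choice(SS)
    op = rng.choice(sorted(FS))
    arity, _ = FS[op]
    return (op,) + tuple(random_tree(max_depth - 1, rng) for _ in range(arity))


def init_population(m, max_depth, seed=0):
    """Generate an initial population of m random trees."""
    rng = random.Random(seed)
    return [random_tree(max_depth, rng) for _ in range(m)]


# The tree of Figure 1, i.e. the expression 1 + (x * y).
figure1_tree = ("+", 1.0, ("*", "x", "y"))
population = init_population(m=20, max_depth=4)
```

Calling `evaluate(figure1_tree, {"x": 2.0, "y": 3.0})` walks the tree bottom-up, multiplying x by y before adding the constant, which mirrors how fitness would be computed for each individual on the data set.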

Algorithm 1. Pseudo code of symbolic regression.
Input: FS, SS, DT, TC, G, M, Pr, Pc, Pm
Output: Best expression
1. Generate initial population with FS, SS and M, and set gen = 0.
2. While gen

(2) Scenario 2. When $\theta' > 0$, instead of reaching the saturation level, the car ownership ratio will continue to grow slowly with the increase of per-capita GDP and will approach the function $y = \alpha' \cdot \exp(\theta' \cdot x)$ in the third period of car ownership. It is reasonable that people will continue to buy cars as per-capita GDP grows, which further raises the car ownership ratio. This is supported by [14], which also stated that vehicle ownership grows slowly after the growth rate has reached its saturation level. Notably, the growth of the car ownership ratio is limited because per-capita GDP cannot grow forever. The ownership patterns in Japan, England, Finland and Poland are examples of this scenario.

(3) Scenario 3. When $\theta'$