Obesity Prediction Using Ensemble Machine

Obesity Prediction Using Ensemble Machine Learning Approaches Kapil Jindal, Niyati Baliyan and Prashant Singh Rana

Abstract Obesity is a serious health problem that contributes to many diseases, such as diabetes, cancer, and heart ailments. Obesity, in turn, is caused by the accumulation of excess fat. There are many determinants of obesity, namely age, weight, height, and body mass index. An obesity value can be computed in numerous ways; however, these methods are not generic enough to be applied in every context (such as a pregnant woman or an elderly man) and still provide accurate results. To this end, we employ an ensemble prediction model in R with a Python interface. On average, the predicted obesity values are 89.68% accurate. The ensemble machine learning prediction approach leverages a generalized linear model, random forest, and partial least squares. The current work can be extended to predict other health parameters and to recommend corrective measures based on obesity values.

Keywords Obesity ⋅ Accuracy ⋅ Prediction ⋅ Machine learning ⋅ Ensemble

1 Introduction Many people are overweight, i.e., suffer from obesity, without knowing how to check their obesity level or body mass index (BMI). Obesity has multiple levels, i.e., levels 1, 2, and 3. These levels are determined by the BMI, which in turn depends on weight and height alone. However, obesity additionally depends on age and gender. For instance, if two persons aged 88 and 22 years have the same weight and height, then the BMI is the same for both persons. However, the obesity level is different for the two persons. To our knowledge, there is no mathematical formulation that explains and/or calculates this gap in obesity levels. Therefore, we are motivated to employ machine learning techniques [1] in order to obtain accurate obesity values in a wide variety of situations.

K. Jindal (✉) ⋅ N. Baliyan ⋅ P. S. Rana, Computer Science and Engineering Department, Thapar University, Patiala, India. e-mail: [email protected]; N. Baliyan e-mail: [email protected]; P. S. Rana e-mail: [email protected]

© Springer Nature Singapore Pte Ltd. 2018. P. K. Sa et al. (eds.), Recent Findings in Intelligent Computing Techniques, Advances in Intelligent Systems and Computing 708, https://doi.org/10.1007/978-981-10-8636-6_37

2 Background In the following subsections, we summarize the terminology used in our work.

2.1 Body Mass Index

Body mass index (BMI) depends on weight and height only. If weight is in kilograms (kg) and height is in meters (m), then the equation is [2]:

\[ \mathrm{BMI} = \frac{\text{Weight}}{\text{Height} \times \text{Height}} \]

If weight is in pounds (lb) and height is in inches (in), then the equation changes to:

\[ \mathrm{BMI} = \frac{\text{Weight}}{\text{Height} \times \text{Height}} \times 703.0704 \]
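The two BMI variants above can be sketched in Python as follows. This is a minimal illustration, not the authors' code; the function names `bmi_metric` and `bmi_imperial` are hypothetical.

```python
def bmi_metric(weight_kg: float, height_m: float) -> float:
    """BMI from weight in kilograms and height in meters."""
    if weight_kg <= 0 or height_m <= 0:
        raise ValueError("weight and height must be positive")
    return weight_kg / (height_m * height_m)


def bmi_imperial(weight_lb: float, height_in: float) -> float:
    """BMI from weight in pounds and height in inches (factor 703.0704)."""
    if weight_lb <= 0 or height_in <= 0:
        raise ValueError("weight and height must be positive")
    return weight_lb / (height_in * height_in) * 703.0704


if __name__ == "__main__":
    # 65 kg and 157.32 cm, i.e., 1.5732 m
    print(round(bmi_metric(65, 1.5732), 2))
```

For a person weighing 65 kg with a height of 157.32 cm, `bmi_metric(65, 1.5732)` evaluates to about 26.26.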

2.2 Basal Metabolic Rate

Basal metabolic rate (BMR) depends on age, weight, height, and gender. It is the rate at which the human body consumes energy. According to Harris and Benedict [3], if the gender is male, then

\[ \mathrm{BMR} = (13.7 \times \text{weight}) + (5 \times \text{height}) - (6.8 \times \text{age}) + 66 \]

and if the gender is female, then

\[ \mathrm{BMR} = (9.6 \times \text{weight}) + (1.8 \times \text{height}) - (4.7 \times \text{age}) + 655 \]

where weight is in kg, height is in cm, and age is in years.
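As a rough sketch, the two BMR equations above translate directly into a small Python function. The function name and the string encoding of gender (`"male"`/`"female"`) are assumptions made for illustration only.

```python
def bmr_harris_benedict(weight_kg: float, height_cm: float,
                        age_years: float, gender: str) -> float:
    """BMR using the coefficients quoted in Sect. 2.2.

    gender is assumed to be the string "male" or "female".
    """
    if gender == "male":
        return 13.7 * weight_kg + 5.0 * height_cm - 6.8 * age_years + 66.0
    if gender == "female":
        return 9.6 * weight_kg + 1.8 * height_cm - 4.7 * age_years + 655.0
    raise ValueError("gender must be 'male' or 'female'")


if __name__ == "__main__":
    # Hypothetical person: 70 kg, 175 cm, 30 years
    print(bmr_harris_benedict(70, 175, 30, "male"))
    print(bmr_harris_benedict(70, 175, 30, "female"))
```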

2.3 Resting Metabolic Rate

Resting metabolic rate (RMR) depends on age, weight, height, and gender. According to [4], the equation of RMR has two variants: one for males and the other for females. If the gender is male, then

\[ \mathrm{RMR} = (10 \times \text{weight}) + (6.25 \times \text{height}) - (5 \times \text{age}) + 5 \]

and if the gender is female, then

\[ \mathrm{RMR} = (10 \times \text{weight}) + (6.25 \times \text{height}) - (5 \times \text{age}) - 161 \]

where weight is in kg, height is in cm, and age is in years.
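The RMR equations differ only in the final constant, which a short Python sketch makes explicit. The function name and the string encoding of gender are illustrative assumptions, not part of the original work.

```python
def rmr(weight_kg: float, height_cm: float,
        age_years: float, gender: str) -> float:
    """RMR using the coefficients quoted in Sect. 2.3.

    gender is assumed to be the string "male" or "female".
    """
    # Shared part of both variants
    base = 10.0 * weight_kg + 6.25 * height_cm - 5.0 * age_years
    if gender == "male":
        return base + 5.0
    if gender == "female":
        return base - 161.0
    raise ValueError("gender must be 'male' or 'female'")


if __name__ == "__main__":
    # Hypothetical person: 70 kg, 175 cm, 30 years
    print(rmr(70, 175, 30, "male"))
    print(rmr(70, 175, 30, "female"))
```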

2.4 Body Fat Percentage

Body fat percentage (BFP) was studied by Deurenberg [5] and is calculated from BMI, age, and gender. BFP has two variants: one for children and the other for adults. If the person is a child, then

\[ \text{Fat}\,\% = (1.51 \times \mathrm{BMI}) - (0.70 \times \text{age}) - (3.6 \times \text{gender}) + 1.4 \]

and if the person is an adult, then

\[ \text{Fat}\,\% = (1.20 \times \mathrm{BMI}) + (0.23 \times \text{age}) - (10.8 \times \text{gender}) - 5.4 \]

where age is in years and the gender value is 1 for males and 0 for females.
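A minimal Python sketch of the two BFP variants follows. Because the text does not state the age at which the adult equation takes over, the child/adult choice is passed in explicitly through a hypothetical `is_child` flag; the function name is likewise an assumption.

```python
def body_fat_percentage(bmi: float, age_years: float,
                        gender: int, is_child: bool) -> float:
    """Body fat percentage using the coefficients quoted in Sect. 2.4.

    gender: 1 for males, 0 for females (as in the text).
    is_child: caller decides which variant applies (assumption).
    """
    if gender not in (0, 1):
        raise ValueError("gender must be 1 (male) or 0 (female)")
    if is_child:
        return 1.51 * bmi - 0.70 * age_years - 3.6 * gender + 1.4
    return 1.20 * bmi + 0.23 * age_years - 10.8 * gender - 5.4


if __name__ == "__main__":
    # Hypothetical adult male with BMI 25 at age 30
    print(body_fat_percentage(25, 30, gender=1, is_child=False))
    # Hypothetical female child with BMI 18 at age 10
    print(body_fat_percentage(18, 10, gender=0, is_child=True))
```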

2.5 Protein Recommended Dietary Allowance

The protein recommended dietary allowance (protein RDA) is the daily protein requirement in grams. It depends on the body weight and the work done by the body. The equation for calculating protein RDA has two variants. If the person is a non-athlete, then the equation [6] is

\[ \text{Protein RDA} = \text{weight} \times 0.8 \]

If the person is an athlete, the protein RDA also depends on the amount of work done, which is expressed as a multiplier (range) between 1.4 and 1.8:

\[ \text{Protein RDA} = \text{weight} \times \text{range} \]

where weight is in kg and protein RDA is in grams per day.
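The protein RDA rule can be sketched as below. The function and parameter names are hypothetical, and rejecting athlete multipliers outside 1.4 to 1.8 is an assumption drawn from the range quoted in the text.

```python
from typing import Optional


def protein_rda(weight_kg: float,
                athlete_factor: Optional[float] = None) -> float:
    """Daily protein need in grams per day.

    athlete_factor: None for a non-athlete (multiplier 0.8); otherwise a
    work-dependent multiplier, assumed to lie between 1.4 and 1.8.
    """
    if weight_kg <= 0:
        raise ValueError("weight must be positive")
    if athlete_factor is None:
        return weight_kg * 0.8
    if not 1.4 <= athlete_factor <= 1.8:
        raise ValueError("athlete factor must be between 1.4 and 1.8")
    return weight_kg * athlete_factor


if __name__ == "__main__":
    print(protein_rda(70))        # hypothetical 70 kg non-athlete
    print(protein_rda(70, 1.6))   # hypothetical 70 kg athlete
```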

3 Proposed Work Figure 1 outlines our ensemble approach for obesity prediction using machine learning.

Fig. 1 Flowchart for calculating obesity value
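The combining step of this ensemble approach, taking the arithmetic mean of the outputs of the individual models, might be sketched in Python as follows. This is only an illustration under stated assumptions: the model names and prediction values are hypothetical placeholders, and fitting the generalized linear model, random forest, and partial least squares models is outside the sketch.

```python
from statistics import fmean
from typing import Dict, List


def ensemble_mean(predictions_by_model: Dict[str, List[float]]) -> List[float]:
    """Average the per-sample predictions of several models."""
    if not predictions_by_model:
        raise ValueError("at least one model is required")
    lengths = {len(p) for p in predictions_by_model.values()}
    if len(lengths) != 1:
        raise ValueError("all models must predict the same number of samples")
    # zip(*...) groups the i-th prediction of every model together
    return [fmean(sample) for sample in zip(*predictions_by_model.values())]


if __name__ == "__main__":
    # Hypothetical obesity values predicted for two test samples
    predictions = {
        "glm": [1.0, 2.0],
        "random_forest": [2.0, 4.0],
        "pls": [3.0, 6.0],
    }
    print(ensemble_mean(predictions))
```

Averaging keeps the ensemble output on the same continuous scale as the individual predictions, so no extra calibration step is needed before comparing against held-out values.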

Table 1 Sample dataset

Obesity   Age (years)   Weight (kg)   Height (cm)   Gender   BMI (kg/m²)
0.42      20            65            157.32        1        26.26
1.2       23            79            157.32        1        31.92
3.01      25            108           162.56        2        40.87
3.48      32            69            126           1        43.46
2.47      57            128           182.88        2        37.37
1.29      83            79            158.49        1        31.45

The accuracy of the models depends on the dataset, so we first clean the dataset to obtain better results. If the dataset is very large, feature selection is then applied. The dataset contains the following parameters, with their units: age in years, weight in kilograms, height in centimeters, gender as a 1 or 0 value, BMI in kg/m², and obesity level 1, 2, or 3. Table 1 presents a sample of the dataset for the given problem. The user inputs only the age, weight, height, gender, and athlete/non-athlete attributes and gets as output the obesity level, BMI, BMR, RMR, BFP, and protein RDA. Using this information, users can take steps to prevent many diseases that are caused by obesity. Moreover, for a pregnant woman, BMI uses the same equation, but the resulting obesity value is different because a pregnant woman gains, on average, 25–35 lb (11–15 kg) [7].

For a typical person, obesity depends on the BMI range. There are three classes of obesity: class 1, class 2, and class 3. These classes further depend on the BMI [2]. Table 2 highlights the relationship between obesity classes and BMI range. Obesity class 1 starts from a BMI of 30. If the BMI is between 31 and 34.8, then the obesity class is 1; by using a machine learning model, however, we predict the obesity value as a floating-point number, which is more precise.

Table 3 describes the division of the dataset across the various machine learning models in our ensemble approach. Next, we take the arithmetic mean of the outputs of all the machine learning models and then test with the fourth part of the data. The code is executed 50 times to verify the results. We choose the model which has the best accuracy. Every model has a method argument value, type, packages, and tuning parameters. The argument value is responsible for calling the function. Type defines the type of

Table 2 Obesity related to BMI

BMI (kg/m²)

Weight classification

Obesity class

Disease risk