Predicting BMI Using ONLY Health/Lifestyle Factors (Height and Weight Excluded)
Part 1 of 2 — Linear Regression
This model and the accompanying research are strictly for research purposes and should NOT be used in lieu of speaking to a health professional about nutritional, fitness, and lifestyle changes.
Many people have complicated relationships with their bodies and their weight. As someone who has worked as a professional personal trainer, group fitness instructor, and health club manager, I am sensitive to this and intend to approach this topic with the utmost care. This blog post is part 1 of a 2-part series that uses high-quality models to look at the data surrounding the factors that can contribute to someone being underweight, “normal weight”, overweight, or obese. None of the information in this post will be used to belittle or shame anyone in any of the above populations. This post will only observe how well different models can predict BMI when given only health and lifestyle choices, but not height or weight.
BMI? But BM-Why?
BMI stands for Body Mass Index and is one of the most common ways of determining which weight class a person belongs in. It’s generally calculated as: (weight in kilograms)/(height in meters * height in meters). Because BMI can be calculated from a person’s height and weight alone, we will exclude these two variables from our model.
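As a quick illustration of the formula above, here is a minimal Python sketch. The function name `bmi` and the example person are my own assumptions for illustration and do not come from the study.

```python
def bmi(weight_kg: float, height_m: float) -> float:
    """Body Mass Index: weight in kilograms divided by height in meters squared."""
    if weight_kg <= 0 or height_m <= 0:
        raise ValueError("weight and height must be positive")
    return weight_kg / (height_m * height_m)


# Hypothetical example person (not from the dataset): 70 kg, 1.75 m tall.
example_bmi = bmi(70, 1.75)
print(round(example_bmi, 1))  # prints 22.9
```

This is exactly why height and weight are left out: with both in hand, the target can be computed directly and there is nothing left to predict.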
This dataset came from a study conducted by Fabio Mendoza Palechor and Alexis de la Hoz Manotas at the Universidad de la Costa, CUC, Colombia for estimating obesity levels based on eating habits and lifestyle choices. The dataset can be explored here.
Because BMI values are numerical and continuous (each weight class represents a range of BMI values), this first post will attempt to create a model that can accurately predict a person’s BMI given only health and lifestyle factors. This kind of model is called a Linear Regression model, which (for the non-math population) simply means that the model will attempt to find a linear relationship between BMI and the variables I feed into it. After being trained on a large amount of data, it will attempt to make predictions about people it has never seen before, based purely on their lifestyle/health choices.
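To make "finding a linear relationship" a little more concrete, here is a minimal sketch of ordinary least squares with a single feature, written in plain Python. It is a simplification: the actual model uses many features, and the ages and BMI values below are invented for illustration and are NOT from the study. The helper names `fit_line` and `predict` are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least squares for one feature. Returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Spread of x around its mean, and how x and y vary together.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(slope, intercept, x):
    """Predict y for a new x using the fitted line."""
    return intercept + slope * x


# Invented training data (NOT from the study): ages and BMI values.
train_ages = [21, 25, 33, 40, 52]
train_bmis = [22.5, 24.1, 26.0, 27.3, 29.8]

slope, intercept = fit_line(train_ages, train_bmis)

# "Someone the model has never seen before": an age not in the training data.
print(predict(slope, intercept, 45))
```

The "training" step is the call to `fit_line`; the "prediction" step only needs the fitted slope and intercept, not the original data.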
After training and fine-tuning, my final linear regression model used the following features, listed in order of importance to the model:
1. Does obesity/overweightness run in your family history?
- An answer of yes significantly raised the BMI predicted by the model
2. Age (in years)
- Older ages significantly raised the BMI predicted by the model
3. How do you travel to work?
- Public Transportation significantly raised the BMI predicted by the model. Motorbikes and Bicycles had a very small effect on the model. Walking had little to no effect on the model.
4. How frequently do you drink alcohol?
- An answer of never significantly lowered the BMI predicted by the model. Answers of sometimes or frequently had a similar but far less significant effect
5. How frequently do you consume vegetables?
- Higher frequencies significantly raised the BMI predicted by the model
6. How frequently do you consume snacks?
- An answer of frequently significantly lowered the BMI predicted by the model. An answer of sometimes or never had the opposite effect, but it was much less significant
7. Do you frequently consume high-calorie foods?
- An answer of yes raised the BMI predicted by the model
8. How frequently do you exercise?
- Higher frequencies lowered the BMI predicted by the model
9. Do you monitor your calories?
- An answer of yes lowered the BMI predicted by the model
10. How many liters of water do you drink in a day?
- Higher numbers of liters raised the BMI predicted by the model
11. Are you a male?
- Of special note here is that the study offered only male and female as options in its gender question. It is not clear whether any transgender people took part in the study, whether anyone who identifies outside of the gender binary was included, or whether those options were offered in the surveys/interviews. An answer of yes to the question “Are you a male?” lowered the BMI predicted by the model, but the effect was very small.
12. How many main meals do you eat in a day?
- Higher frequencies raised the BMI predicted by the model, but the effect was very small
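For readers wondering what "raised" and "lowered" mean mechanically: a linear model predicts by starting from a baseline (the intercept) and adding each feature's value multiplied by its coefficient. The sketch below shows that idea with three of the features above. The feature names, the intercept, and every coefficient value are hypothetical placeholders chosen only so that their signs match the list; they are NOT the fitted values from my model.

```python
# HYPOTHETICAL numbers for illustration only. They are not the fitted model.
INTERCEPT = 25.0
COEFFICIENTS = {
    "family_history": 3.0,       # yes = 1, no = 0; positive sign raises the prediction
    "exercise_frequency": -0.5,  # higher values lower the prediction
    "monitors_calories": -1.0,   # yes = 1, no = 0; negative sign lowers the prediction
}


def predict_bmi(features):
    """Linear prediction: intercept plus the sum of coefficient * feature value."""
    return INTERCEPT + sum(
        COEFFICIENTS[name] * value for name, value in features.items()
    )


# A made-up respondent: family history of overweight, exercise frequency of 2,
# does not monitor calories.
person = {"family_history": 1, "exercise_frequency": 2, "monitors_calories": 0}
print(predict_bmi(person))  # 25.0 + 3.0 - 1.0 + 0.0, prints 27.0
```

A positive coefficient means a "yes" (or a higher number) pushes the prediction up, and a negative one pushes it down, which is all the bullet points above are describing.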
The study included further lifestyle and health questions, but I left those out of the final model because they either harmed the model’s predictions or had no effect on them.
So, Were the Model’s Predictions Accurate?
When tested on data it had never seen before, the model’s lifestyle/health factors explained about 43.18% of the variability in BMI (an R² of roughly 0.43). This means that while this model wouldn’t be good to deploy for predicting accurate BMI levels, it does show that health/lifestyle factors play a very significant role in our BMI levels. Furthermore, this shows that no single lifestyle/health choice is the root cause of any BMI level (of course, there are likely rare medical exceptions). This also means that any of the observations about the features above should be taken with a grain of salt and not necessarily as absolute truth (people are very diverse, and different things often affect people differently).
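The "percentage of variability explained" is the R² score (coefficient of determination). Here is a minimal sketch of how it is computed from actual and predicted values. The function name `r_squared` is my own, and the numbers are invented for illustration and are NOT the study's data or my model's predictions.

```python
def r_squared(actual, predicted):
    """Coefficient of determination: 1 - (residual sum of squares / total sum of squares)."""
    mean_actual = sum(actual) / len(actual)
    # How far the predictions are from the true values.
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    # How far the true values are from their own average.
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1 - ss_res / ss_tot


# Invented held-out BMI values and predictions (NOT real results).
actual_bmis = [22.0, 27.5, 31.0, 24.5]
predicted_bmis = [24.0, 26.0, 29.0, 25.5]
print(r_squared(actual_bmis, predicted_bmis))
```

A score of 1.0 means perfect predictions, and a score of 0.0 means the model does no better than always guessing the average BMI. A score of about 0.43 sits between the two.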
What’s Part 2 Of This Post Going To Be About?
Part 2 will be about translating the BMI numbers to their respective classes and making them categorical. Afterwards, I can run multiple classification models on this data and see if we get a better-performing model. WOOT! WOOT!
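As a rough preview of what "translating BMI numbers to classes" could look like, here is a minimal sketch that uses the standard WHO adult cutoffs (18.5, 25, and 30). This is an assumption on my part for illustration: Part 2 may use different or finer-grained bins, and the function name `bmi_class` is hypothetical.

```python
def bmi_class(bmi_value: float) -> str:
    """Map a BMI number to a weight class using the standard WHO adult cutoffs."""
    if bmi_value < 18.5:
        return "underweight"
    if bmi_value < 25:
        return "normal weight"
    if bmi_value < 30:
        return "overweight"
    return "obese"


# A few example values, one per class.
for value in (17.0, 22.9, 27.0, 33.5):
    print(value, "->", bmi_class(value))
```

Once every BMI number has a label like this, the problem changes from predicting a number (regression) to predicting a category (classification).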
Sounds Awesome! Can I Try This Model Out?
Due to the contentious relationship that many people have with their own bodies and weights (especially amidst the pandemic), combined with a model whose predictive power is somewhat lacking, it would be irresponsible to code a function that lets anyone try this specific model, and doing so could potentially cause more harm than good. However, if you’re interested in the model for research purposes (and have a background in Python), then feel free to reach out and I would be more than happy to send over my Jupyter Notebook with all the steps I took and the model itself.
Coolio Hermano! Your Blog Post Made Me Think Of Someone I Care About Whose BMI/Weight Appears Concerning To Me. How Should I Approach This Conversation?
Short version: you shouldn’t. It’s rarely OK to comment on anyone’s weight (even complimenting someone’s weight loss/gain can be harmful because it reinforces the stigma that they are worth less when not at a ‘normal’ weight). Another person’s weight (or BMI) is really only the business of that person, their medical adviser, and anyone they choose to share that information with.
If you, yourself, are struggling with an eating disorder, please contact the National Eating Disorders Association helpline at https://www.nationaleatingdisorders.org/help-support/contact-helpline.