18th September, 2023

Hello,

Picking up where I left off last time, still in search of better answers: the simple linear model was only doing so much. So we tried multiple regression, modelling diabetes as a function of both inactivity and obesity. This gave a minor increase in R squared, and adding quadratic terms to the same model left it roughly unchanged.

Linear multiple regression summary
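For anyone who wants to try this comparison, here is a minimal sketch of fitting the two-predictor model and its quadratic variant. It is not my actual notebook code: the data is synthetic, the names `inactivity`, `obesity`, `diabetes` and the helper `r_squared` are placeholders I am assuming for illustration, and it uses plain NumPy least squares rather than whatever produced the summary above.

```python
import numpy as np

def r_squared(X, y):
    """Fit ordinary least squares with an intercept and return R squared."""
    A = np.column_stack([np.ones(len(y)), X])      # prepend intercept column
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)   # least-squares coefficients
    ss_res = np.sum((y - A @ coef) ** 2)           # residual sum of squares
    ss_tot = np.sum((y - y.mean()) ** 2)           # total sum of squares
    return 1.0 - ss_res / ss_tot

# Synthetic stand-in data (NOT the data used in this post).
rng = np.random.default_rng(0)
n = 300
inactivity = rng.uniform(10, 30, n)
obesity = rng.uniform(15, 40, n)
diabetes = 2.0 + 0.15 * inactivity + 0.10 * obesity + rng.normal(0, 1.0, n)

# Two predictors only.
X_linear = np.column_stack([inactivity, obesity])
# Same model plus squared terms for each predictor.
X_quad = np.column_stack([inactivity, obesity, inactivity ** 2, obesity ** 2])

r2_linear = r_squared(X_linear, diabetes)
r2_quad = r_squared(X_quad, diabetes)
print(f"linear R^2:    {r2_linear:.3f}")
print(f"quadratic R^2: {r2_quad:.3f}")
```

Because the quadratic model contains the linear one, its in-sample R squared can never be lower; the question is only whether the gain is large enough to matter.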

 

However, I then explored further by introducing an interaction term between inactivity and obesity:

Summary for model with interaction terms
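An interaction term is simply the product of the two predictors added as an extra column. The sketch below shows the idea on synthetic data; the variable names and the `r_squared` helper are assumptions for illustration, not the code behind the summary above.

```python
import numpy as np

def r_squared(X, y):
    """Fit ordinary least squares with an intercept and return R squared."""
    A = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return 1.0 - np.sum((y - A @ coef) ** 2) / np.sum((y - y.mean()) ** 2)

# Synthetic stand-in data (NOT the data used in this post), built so that
# the effect of inactivity depends on the level of obesity.
rng = np.random.default_rng(1)
n = 300
inactivity = rng.uniform(10, 30, n)
obesity = rng.uniform(15, 40, n)
diabetes = (2.0 + 0.05 * inactivity + 0.05 * obesity
            + 0.01 * inactivity * obesity + rng.normal(0, 1.0, n))

# Additive model: each predictor contributes independently.
X_additive = np.column_stack([inactivity, obesity])
# Interaction model: add the product of the two predictors as a third column.
X_interaction = np.column_stack([inactivity, obesity, inactivity * obesity])

r2_additive = r_squared(X_additive, diabetes)
r2_interaction = r_squared(X_interaction, diabetes)
print(f"additive R^2:    {r2_additive:.3f}")
print(f"interaction R^2: {r2_interaction:.3f}")
```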

Here we see a further increase in R squared, which is what we want. To push it higher, I tried different variations of the model and its terms, including taking the log of diabetes and of the predictors, to see if that could help our case.

Summary of log transformation of the linear model

This led to a further increase in R squared, something the higher-power terms had not delivered.
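A rough sketch of the log transformation, assuming natural logs are applied to the response and to both predictors (all of which must be strictly positive). The data is synthetic and the names are placeholders, so the numbers it prints say nothing about the real dataset.

```python
import numpy as np

def r_squared(X, y):
    """Fit ordinary least squares with an intercept and return R squared."""
    A = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return 1.0 - np.sum((y - A @ coef) ** 2) / np.sum((y - y.mean()) ** 2)

# Synthetic stand-in data (NOT the data used in this post), generated with a
# multiplicative relationship so that it becomes linear after taking logs.
rng = np.random.default_rng(2)
n = 300
inactivity = rng.uniform(10, 30, n)
obesity = rng.uniform(15, 40, n)
diabetes = 0.3 * inactivity ** 0.4 * obesity ** 0.6 * np.exp(rng.normal(0, 0.15, n))

# Untransformed model.
r2_raw = r_squared(np.column_stack([inactivity, obesity]), diabetes)

# Log-log model: log of the response regressed on logs of the predictors.
X_log = np.column_stack([np.log(inactivity), np.log(obesity)])
r2_log = r_squared(X_log, np.log(diabetes))

print(f"raw R^2:     {r2_raw:.3f}")
print(f"log-log R^2: {r2_log:.3f}")
```

One thing to keep in mind: the second R squared is the share of variance explained in log(diabetes), not in diabetes itself, so comparing it with the untransformed model is indicative rather than exact.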

Next, I added the interaction term to the log-transformed model to see if it could improve the fit further.

Summary of log transformations and interaction terms

As you can see, this model produced my highest R squared yet: 0.42.
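Combining the two ideas could look something like the sketch below. I am assuming here that the interaction is the product of the two logged predictors; other choices, such as the log of the product, are possible. As before, the data is synthetic and the names are placeholders, so the output will not reproduce the 0.42 above.

```python
import numpy as np

def r_squared(X, y):
    """Fit ordinary least squares with an intercept and return R squared."""
    A = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return 1.0 - np.sum((y - A @ coef) ** 2) / np.sum((y - y.mean()) ** 2)

# Synthetic stand-in data (NOT the data used in this post).
rng = np.random.default_rng(3)
n = 300
inactivity = rng.uniform(10, 30, n)
obesity = rng.uniform(15, 40, n)
diabetes = 0.3 * inactivity ** 0.4 * obesity ** 0.6 * np.exp(rng.normal(0, 0.15, n))

log_inactivity = np.log(inactivity)
log_obesity = np.log(obesity)
log_diabetes = np.log(diabetes)

# Candidate design matrices, all fitted against log(diabetes).
designs = {
    "log": np.column_stack([log_inactivity, log_obesity]),
    "log + interaction": np.column_stack(
        [log_inactivity, log_obesity, log_inactivity * log_obesity]
    ),
}

r2 = {name: r_squared(X, log_diabetes) for name, X in designs.items()}
for name, value in r2.items():
    print(f"{name:18s} R^2 = {value:.3f}")
```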

This felt like a few steps in the right direction.
