20th September 2023

Continuing from my previous test with a linear model: adding an interaction term, and then taking logs of the variables, gave a higher R-squared value, the highest I have seen so far.

So I went ahead and wrote a function, then used the bootstrap method to check the coefficients of our assumed function.

function(data, index) coef(lm(log(diabetes) ~ log(inactive) + log(obese) + obese * inactive, data = data, subset = index))
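One thing worth checking about the formula above: in R's formula syntax, `obese*inactive` is shorthand for both main effects plus their interaction, so the fit has more coefficients than the three terms written out. A small sketch to list the terms R actually builds (the name `model_formula` is just mine for illustration):

```r
# The formula from the function above, stored under an illustrative name.
model_formula <- log(diabetes) ~ log(inactive) + log(obese) + obese * inactive

# terms() expands the formula without needing any data;
# "term.labels" lists every term apart from the intercept.
term_labels <- attr(terms(model_formula), "term.labels")
print(term_labels)
```

So, counting the intercept, each bootstrap run returns six coefficients. If only the product term is wanted, `obese:inactive` is the syntax for the interaction on its own.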

I then ran a single bootstrap check on the entire data set, drawing a sample of 363 observations with replacement.
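Here is a minimal sketch of what one such draw looks like. I do not reproduce the real data here, so `sim` is a made-up stand-in with the same column names and 363 rows, and `boot_fn` is just a name I gave the function above; the numbers it prints mean nothing beyond showing the mechanics.

```r
set.seed(1)
n <- 363

# Simulated stand-in data (NOT the real data set): same column names only.
sim <- data.frame(inactive = runif(n, 8, 30), obese = runif(n, 10, 45))
sim$diabetes <- exp(0.5 + 0.3 * log(sim$inactive) + 0.4 * log(sim$obese) +
                    rnorm(n, sd = 0.1))

# The coefficient function from above, given a name so it can be called.
boot_fn <- function(data, index)
  coef(lm(log(diabetes) ~ log(inactive) + log(obese) + obese * inactive,
          data = data, subset = index))

# Fit on the original rows, in order.
full_fit <- boot_fn(sim, 1:n)

# One bootstrap draw: n row indices sampled with replacement,
# so some rows appear more than once and others not at all.
idx <- sample(n, n, replace = TRUE)
one_draw <- boot_fn(sim, idx)

print(rbind(full = full_fit, resample = one_draw))
```

Running the `sample()` line again gives a different set of rows and therefore slightly different coefficients.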

[Figure: Verify Bootstrap with the entire data]

As you can see, two different samples of the same data gave different results. What we want is the average of the coefficients over a large number of randomly drawn samples.

I repeated this with different numbers of bootstrap samples to see whether more samples gave a more precise estimate of the coefficients.
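A rough sketch of that experiment, again on made-up data rather than the real set. The names `sim`, `boot_fn`, `boot_mean` and the list of sample counts in `B_values` are all my own choices for illustration, not the exact values used for the plots.

```r
set.seed(1)
n <- 363

# Simulated stand-in data (NOT the real data set).
sim <- data.frame(inactive = runif(n, 8, 30), obese = runif(n, 10, 45))
sim$diabetes <- exp(0.5 + 0.3 * log(sim$inactive) + 0.4 * log(sim$obese) +
                    rnorm(n, sd = 0.1))

boot_fn <- function(data, index)
  coef(lm(log(diabetes) ~ log(inactive) + log(obese) + obese * inactive,
          data = data, subset = index))

# Average the coefficients over B bootstrap samples.
boot_mean <- function(data, B) {
  rows <- nrow(data)
  # One column of 6 coefficients per bootstrap sample.
  draws <- vapply(seq_len(B),
                  function(b) boot_fn(data, sample(rows, rows, replace = TRUE)),
                  numeric(6))
  rowMeans(draws)
}

# Hypothetical numbers of bootstrap samples to compare.
B_values <- c(1, 5, 10, 100, 1000)
results <- t(sapply(B_values, function(B) boot_mean(sim, B)))
rownames(results) <- paste0("B=", B_values)
print(round(results, 4))
```

Each row is the averaged coefficient vector for one value of B. Taking the standard deviation across the columns of `draws`, instead of the mean, would give a bootstrap estimate of each coefficient's standard error.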

 

[Figure: Bootstrap with varying tries]

As you can see, the estimates did not vary much as the number of bootstrap samples went up to 10. The averaged coefficients were already close to their supposed values somewhere between 1 and 5 samples.
