20th September 2023 – gautammarathe

Continuing from my previous test: starting with a linear model, then adding an interaction term, and then taking logs gave us a higher R-squared value, the highest I have obtained so far.

So I wrote a function and used the bootstrap method to check the coefficients of our assumed model.

function(data, index) coef(lm(log(diabetes) ~ log(inactive) + log(obese) + obese*inactive, data = data, subset = index))

I then ran a single bootstrap replicate on the entire data set, drawing a sample of 363 observations with replacement.
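To make the resampling step concrete, here is a minimal sketch of two bootstrap replicates. The data frame `toy` is simulated and only borrows the column names from the model above; it is not the data used in this post, and the name `boot_fn` is my own label for the coefficient function. Note that in an R formula `obese*inactive` expands to `obese + inactive + obese:inactive`, so the fit returns six coefficients including the intercept.

```r
# Simulated stand-in data (an assumption, NOT the data set from this post);
# only the column names match the model formula.
set.seed(1)
n <- 363
toy <- data.frame(
  inactive = runif(n, 10, 30),
  obese    = runif(n, 15, 40)
)
toy$diabetes <- exp(0.5 + 0.3 * log(toy$inactive) + 0.4 * log(toy$obese) +
                    rnorm(n, sd = 0.1))

# Coefficient function in the same form as the one in the post
# (boot_fn is a hypothetical name).
boot_fn <- function(data, index) {
  coef(lm(log(diabetes) ~ log(inactive) + log(obese) + obese * inactive,
          data = data, subset = index))
}

# One bootstrap replicate = n row indices drawn with replacement.
idx1 <- sample(n, n, replace = TRUE)
idx2 <- sample(n, n, replace = TRUE)

est1 <- boot_fn(toy, idx1)
est2 <- boot_fn(toy, idx2)

# Two replicates of the same data give different coefficient estimates.
print(rbind(est1, est2))
```

Because each replicate repeats some rows and leaves others out, the two rows of the printed table differ, which is the variation discussed next.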

As you can see, two different samples of the same data gave different results, so we want to average the estimates over a large number of randomly sampled data sets.

Here I repeated this a varying number of times to see whether more samples gave a more precise estimate of the coefficients.
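A rough sketch of that averaging is below. Again, `toy` is simulated stand-in data with the same column names, and `boot_fn` and `boot_mean` are hypothetical helper names; the replicate counts 1, 5, 10 and 100 are chosen only for illustration.

```r
# Simulated stand-in data (an assumption, NOT the data set from this post).
set.seed(2)
n <- 363
toy <- data.frame(
  inactive = runif(n, 10, 30),
  obese    = runif(n, 15, 40)
)
toy$diabetes <- exp(0.5 + 0.3 * log(toy$inactive) + 0.4 * log(toy$obese) +
                    rnorm(n, sd = 0.1))

# Same coefficient function as in the post (hypothetical name).
boot_fn <- function(data, index) {
  coef(lm(log(diabetes) ~ log(inactive) + log(obese) + obese * inactive,
          data = data, subset = index))
}

# Average the coefficients over B bootstrap replicates (hypothetical helper).
boot_mean <- function(data, B) {
  n <- nrow(data)
  full <- boot_fn(data, seq_len(n))          # fit on all rows, for the names
  reps <- matrix(NA_real_, nrow = B, ncol = length(full),
                 dimnames = list(NULL, names(full)))
  for (b in seq_len(B)) {
    reps[b, ] <- boot_fn(data, sample(n, n, replace = TRUE))
  }
  colMeans(reps)
}

# Compare the averaged coefficients for different numbers of replicates.
B_values <- c(1, 5, 10, 100)
means <- t(sapply(B_values, function(B) boot_mean(toy, B)))
rownames(means) <- paste0("B=", B_values)
print(round(means, 4))
```

Each row of the printed table is the average over that many replicates. For standard errors as well as averages, the `boot()` function in the boot package performs the same resampling given a function of this `(data, index)` form.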

As you can see, the estimates did not vary much over 10 different samples; the averaged coefficients had already settled near their supposed values somewhere between 1 and 5 samples.
