Because the data we were working with in the Analyze Boston dataset was severely limited, we moved on to another dataset that is large enough to train more accurate and reliable models, and that also gives us a way to test the model on the previous data.
The alternative dataset we are looking at is the crime reporting dataset, which has data across multiple columns, covers multiple factors, and includes time and location data.
We would mostly like to focus on the time data, build a time series from it, and analyse that for further insight.
This is feasible, but it would initially require a lot of data transformation, because the crime reporting dataset is recorded at a very granular reporting level: with day-level records for the last 8 years, the data is very low level. Plotting this directly as a time series would not yield much, because a prediction at the daily level would not be very useful unless it were very accurate.
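As a rough illustration of the kind of transformation this implies, the sketch below rolls incident-level timestamps up into monthly counts, which gives a coarser series that is easier to plot and reason about. It is only a minimal sketch using the Python standard library: the function name `monthly_counts`, the timestamp format, and the sample records are all assumptions made for illustration, not the actual layout of the crime reporting dataset.

```python
from collections import Counter
from datetime import datetime


def monthly_counts(timestamps):
    """Count incidents per (year, month) from ISO-formatted timestamp strings."""
    counts = Counter()
    for ts in timestamps:
        dt = datetime.fromisoformat(ts)
        counts[(dt.year, dt.month)] += 1
    # Sort chronologically so the result can be plotted as a series.
    return sorted(counts.items())


# Made-up sample records, for illustration only (not real data).
sample = [
    "2018-01-03 14:05:00",
    "2018-01-17 22:40:00",
    "2018-02-09 08:15:00",
]
series = monthly_counts(sample)
```

Monthly is just one possible choice here; the same grouping could be done by week or quarter, depending on how smooth the resulting series needs to be.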
Besides, the analysis would be of limited use, as it would merely indicate how crimes occur as a function of time. That is not how crime works in real life, since a variety of factors go into it. Still, by plotting historical data we can highlight trends that suggest certain stretches of time to avoid.
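One simple way to surface such stretches of time, sketched here under assumptions, is to count historical incidents by hour of day and list the hours with the most records. The function name `busiest_hours`, the timestamp format, and the sample records are hypothetical and only meant to show the idea; a high count describes the historical record and does not by itself explain why crimes occur.

```python
from collections import Counter
from datetime import datetime


def busiest_hours(timestamps, top_n=3):
    """Return the top_n (hour, count) pairs with the most recorded incidents."""
    counts = Counter(datetime.fromisoformat(ts).hour for ts in timestamps)
    return counts.most_common(top_n)


# Made-up sample records, for illustration only (not real data).
sample = [
    "2018-01-03 22:05:00",
    "2018-01-17 22:40:00",
    "2018-02-09 08:15:00",
]
top = busiest_hours(sample, top_n=1)
```

The same counting could be keyed on day of week or month instead of hour, which would highlight longer stretches of time.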