30th October, 2023

Continuing from last time, my main idea was to check for a correlation between the density of shootings and population density by plotting the two side by side.

This would make it easier to see whether more shootings take place in more populated areas, and in turn whether the frequency of shootings is a function of population.
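As a rough sketch of what that check could look like once population figures are available: count shootings per region, join them to a population table, and compute a correlation. The column names (`region`, `population`), the function name, and the toy numbers are all assumptions made up for illustration, not values from the real dataset.

```python
import pandas as pd


def shootings_vs_population(shootings, population):
    """Count shootings per region and compare the counts with population.

    Assumes `shootings` has one row per incident with a 'region' column and
    `population` has one row per region with 'region' and 'population' columns.
    """
    counts = shootings.groupby("region").size().rename("shootings").reset_index()
    merged = counts.merge(population, on="region", how="inner")
    # A per-capita rate separates "more people" from "more shootings per person".
    merged["per_million"] = merged["shootings"] / merged["population"] * 1_000_000
    # Pearson correlation between raw counts and population.
    r = merged["shootings"].corr(merged["population"])
    return merged, r


# Toy, made-up data: three hypothetical regions.
shootings = pd.DataFrame({"region": ["AA", "AA", "AA", "BB", "BB", "CC"]})
population = pd.DataFrame(
    {"region": ["AA", "BB", "CC"], "population": [3_000_000, 2_000_000, 1_000_000]}
)
merged, r = shootings_vs_population(shootings, population)
print(merged)
print("correlation:", r)
```

If the raw counts track population closely but the per-million rate varies a lot between regions, that would suggest population alone does not explain the pattern.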

Coding this has been a challenge because I can’t seem to get hold of the population density data I’m looking for.

A lot of popular Python libraries contain county-level population data, but I can’t plot data at a county level when looking at a state level. A lot of the population data online, however, identifies cities by name rather than by coordinates, which has been another challenge.

I’m still trying to reconcile the two. The lookups I’ve coded to match the datasets are not working right now, but I’m working on fixing them because I find this an interesting direction.
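One way the name-based lookup might be approached is to normalize city and state names into a shared key, left-join the population table against a table that does have coordinates, and list whatever failed to match. This is only a sketch: the column names, the helper names, and the population numbers are placeholders, and the coordinates are approximate.

```python
import pandas as pd


def normalize_place(name):
    """Lowercase, drop periods and collapse whitespace, so 'St. Paul' == 'st paul'."""
    return " ".join(str(name).lower().replace(".", "").split())


def make_key(frame):
    """Build a 'city|STATE' join key from assumed 'city' and 'state' columns."""
    return frame["city"].map(normalize_place) + "|" + frame["state"].str.strip().str.upper()


def attach_coordinates(population, gazetteer):
    """Left-join coordinates onto population rows; also return the unmatched rows."""
    pop = population.assign(key=make_key(population))
    gaz = gazetteer.assign(key=make_key(gazetteer))
    merged = pop.merge(
        gaz[["key", "latitude", "longitude"]], on="key", how="left", indicator=True
    )
    unmatched = merged.loc[merged["_merge"] == "left_only", ["city", "state"]]
    return merged.drop(columns=["key", "_merge"]), unmatched


# Placeholder population values; approximate coordinates.
population = pd.DataFrame(
    {
        "city": ["St. Paul", "Boston", "Springfield"],
        "state": ["mn", "MA", "XX"],
        "population": [100, 200, 300],
    }
)
gazetteer = pd.DataFrame(
    {
        "city": ["st paul", "boston"],
        "state": ["MN", "ma"],
        "latitude": [44.95, 42.36],
        "longitude": [-93.09, -71.06],
    }
)
with_coords, unmatched = attach_coordinates(population, gazetteer)
print(with_coords)
print(unmatched)
```

Printing the unmatched rows makes it easier to see whether the lookup is failing because of spelling differences or because the place is genuinely missing from the coordinate table.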

Furthermore, I would like to plot police station coordinates to see whether there is any pattern in how far shootings occur from police stations.
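I don’t have police station data yet, but the distance part could look something like the sketch below: a haversine (great-circle) distance from each shooting to every station, keeping the minimum. The function names and the toy coordinates are hypothetical, and the Earth is treated as a sphere of radius 6371 km.

```python
import numpy as np

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, spherical approximation


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between points given in degrees."""
    lat1, lon1, lat2, lon2 = map(np.radians, (lat1, lon1, lat2, lon2))
    dlat = lat2 - lat1
    dlon = lon2 - lon1
    a = np.sin(dlat / 2) ** 2 + np.cos(lat1) * np.cos(lat2) * np.sin(dlon / 2) ** 2
    return 2 * EARTH_RADIUS_KM * np.arcsin(np.sqrt(a))


def nearest_station_km(incidents, stations):
    """For each (lat, lon) incident, return the distance to the closest station."""
    inc = np.asarray(incidents, dtype=float)
    sta = np.asarray(stations, dtype=float)
    # Broadcast to an (n_incidents, n_stations) distance matrix.
    dist = haversine_km(inc[:, 0, None], inc[:, 1, None], sta[None, :, 0], sta[None, :, 1])
    return dist.min(axis=1)


# Toy coordinates on the equator, chosen only to make the result easy to check.
incidents = [(0.0, 0.0), (0.0, 1.0)]
stations = [(0.0, 0.0), (0.0, 5.0)]
distances = nearest_station_km(incidents, stations)
print(distances)
```

A histogram of these distances would then show whether shootings cluster near stations or far from them. For the full dataset the all-pairs matrix can get large, so a spatial index might be needed later.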

 

The code below is not yet fully working, but it’s a reference point for later work.

import pandas as pd
import matplotlib.pyplot as plt
import cartopy.crs as ccrs
import cartopy.feature as cfeature
from sklearn.cluster import KMeans
import us

# Specify the full path to your Excel file using a raw string
excel_file_path = r'C:\Users\91766\Desktop\fatal-police-shootings-data.xlsx'
# Read data from Excel file
df = pd.read_excel(excel_file_path)
# Latitude and longitude column names
latitude_column = 'latitude'  # Replace with your actual column name
longitude_column = 'longitude'  # Replace with your actual column name
# Drop rows without coordinates, since KMeans cannot handle missing values
df = df.dropna(subset=[latitude_column, longitude_column]).copy()
# Perform K-means clustering
kmeans = KMeans(n_clusters=5, n_init=10, random_state=42)
df['cluster'] = kmeans.fit_predict(df[[latitude_column, longitude_column]])
# Create a map of the USA using Cartopy
fig, ax = plt.subplots(subplot_kw={'projection': ccrs.PlateCarree()}, figsize=(12, 9))
ax.set_extent([-125, -66, 24, 49], crs=ccrs.PlateCarree())  # USA bounding box
# Plot the coordinates with cluster colors (the data are plain lon/lat, hence PlateCarree)
scatter = ax.scatter(df[longitude_column], df[latitude_column], s=10, c=df['cluster'], cmap='viridis', marker='o', alpha=0.7, edgecolor='k', transform=ccrs.PlateCarree())
# Add colorbar
cbar = plt.colorbar(scatter, ax=ax, orientation='vertical', fraction=0.03, pad=0.05)
cbar.set_label('Cluster')
# Add map features
ax.coastlines(resolution='10m', color='black', linewidth=1)
ax.add_feature(cfeature.BORDERS, linestyle=':')
# The us library gives capital names but not coordinates, so the coordinates
# need a separate lookup. Only a few approximate entries are filled in so far.
capital_coords = {
    'Sacramento': (38.58, -121.49),
    'Austin': (30.27, -97.74),
    'Boston': (42.36, -71.06),
}
# Label the capital cities that have known coordinates
for state in us.STATES:
    coords = capital_coords.get(state.capital)
    if coords is None:
        continue  # No coordinates for this capital yet
    capital_lat, capital_lon = coords
    ax.text(capital_lon, capital_lat, state.capital, transform=ccrs.PlateCarree(), fontsize=8, ha='right', va='bottom', color='blue')
# Draw state lines
ax.add_feature(cfeature.STATES, linestyle='-', edgecolor='black')
# Show the plot
plt.title('K-Means Clustering on the Map of the USA with State Capitals Highlighted')
plt.show()
