Calculating year-over-year changes in crime by category
Users often ask "How much did this change year over year?" or "…quarter over quarter?" Although these questions come up frequently, writing algorithms to answer them can be complex and time-intensive. Fortunately, pandas gives you much of this functionality out of the box, removing most of the effort.
To make things more complicated, in this recipe we are going to ask how much things changed by category. Adding "by category" into the equation prevents us from directly using pd.DataFrame.resample, but as you will see, pandas can still very easily help you answer these more detailed questions.
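To make the idea concrete before turning to the real data, here is a minimal sketch on a made-up table. The column names (category, reported, count) and all values are hypothetical and are not taken from the crime dataset. It assumes one reasonable approach: combining a regular column with pd.Grouper in a single groupby call, then computing the percentage change within each category.

```python
import pandas as pd

# Made-up toy data; column names and values are hypothetical,
# not those of the crime dataset.
toy = pd.DataFrame({
    "category": ["theft", "theft", "theft", "arson", "arson", "arson"],
    "reported": pd.to_datetime([
        "2020-03-01", "2021-05-10", "2021-08-15",
        "2020-01-20", "2020-11-02", "2021-07-04",
    ]),
    "count": [10, 12, 3, 4, 4, 2],
})

# Group by category AND by the calendar year of the datetime column.
# pd.Grouper(freq="YS") buckets rows by year start, much as resample
# would, but it can be combined with other grouping keys.
yearly = toy.groupby(
    ["category", pd.Grouper(key="reported", freq="YS")]
)["count"].sum()

# Year-over-year change, computed separately within each category so
# one category's last year is never compared to another's first year.
yoy = yearly.groupby(level="category").pct_change()
print(yoy)
```

In this toy data, theft rises from 10 to 15 (a change of 0.5) and arson falls from 8 to 2 (a change of -0.75). The first year of each category has no prior year to compare against, so its change is missing.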
How to do it
Let’s read in the crime dataset, but this time, we are not going to set REPORTED_DATE as our index:
import pandas as pd

# REPORTED_DATE stays a regular column here instead of becoming the index
df = pd.read_parquet(
    "data/crime.parquet",
)
df.head()
OFFENSE_TYPE_ID OFFENSE_CATEGORY_ID REPORTED_DATE...