Back to blog

Crime Predicition System

Note: Code and details not shared due to non-disclosure agreement

Introduction

Crime Prediction System is a machine learning model which predicts the probability of future crime by learning the patterns from the historical crime data. We can use many different algorithms for creating our model. Each model has plus and minuses; from traditional machine learning algorithms like decisions trees to naive bayes to sophisticated deep learning algorithms like neural nets.  The algorithms I used for this particular problem was XGBoost from the scikit learn package and Artificial Neural Nets in Keras due to the type of data, volume of data and the accuracy of the model pertaining to the data set.

Model Formulation

The first part of the problem is to clean the data. The data is wrangled and munged from a SQL database and fed into a python array. Later we use ARC GIS to find the exact latitude and longitude points from the map where the particular crimes were committed. After that we need to classify the crimes by their respective labels. This is called feature selection, we need to run algorithms to find the weight of labels as they correlate to the data. Murder, car theft, robbery, etc. are also bundled into their respective arrays. After that we need to extract the time from the date function. The date of crime consists the following information (date, time, day, year, month). For our model we need to extract only the time as the rest of the features do not correlate heavily with the rest of the data. After we have the time we need to divide them into a 3 hour bucket to find the correct patrolling times to prevent such crimes in the future. We will need to make eight 3-hour windows for the 24 hour period. After we have correctly labelled and collected the data we need to find the dependent and independent variables. We also have to encode the data and convert the text data into binary data for the model to consume. When we run the model we need to do model validation by creating a test and train set. After that we have to fine tune our model by using various tips and tricks. After we run a model we will get an accuracy from the confusion matrix.

Conclusion

The crime prediction system can be tuned to give hourly recommendations to law enforcement agencies for patrolling and surveillance. The system is a self evolving algorithm which will become accurate as more and more data is fed into it.

Back to blog