Can Data Science Save Us from the Police?

The United States is a massive, diverse country with minority populations scattered around in nearly every imaginable configuration. From the suburbs of Atlanta to Chinatown in San Francisco, America is a beautiful tapestry of humanity. However, racism is real and sometimes those in power abuse that power in ways that reflect their bigotry.

There is no doubt that minorities (particularly blacks) are unfairly targeted by the police. In the wake of Freddie Gray, that’s just an undeniable truth. Blacks are arrested more often than whites for everything from shoplifting to weed. The frequency of police encounters alone nearly ensures that there will be an incident of racial violence, simply because some cops are racist.

Let’s start with four assumptions. Each of them is open to debate, but I feel most Americans would agree with them.

Police departments have vast sets of data about every single call, response and report. Some police departments even put that data online for free, while others charge for it. Regardless, the data exists for the race of every police officer as well as the race of everyone they encounter and arrest.

Not So Fast

The fact is that some black officers work in overwhelmingly white communities and some white officers work in overwhelmingly black communities. In those situations, nearly everyone the police officer could arrest is of another race. Seeing that officer X is white and 95% of his arrests are Hispanic tells us nothing unless we know the demographic breakdown of the community. Even then, poverty is a huge factor in criminal activity, which further distorts any off-the-cuff comparison of the race of the officer with the race of the offender.
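To make the baseline problem concrete, here is a minimal Python sketch. The function name `disparity_ratio` and every number in it are hypothetical illustrations, not real data, and the ratio deliberately ignores poverty and the other confounders mentioned above. It only shows that the same raw arrest share can mean very different things depending on who lives in the community.

```python
def disparity_ratio(arrest_share, population_share):
    """Ratio of a group's share of an officer's arrests to the group's
    share of the local population. A value near 1.0 means the arrests
    roughly mirror the community; it says nothing about causes."""
    if not 0 < population_share <= 1:
        raise ValueError("population_share must be in (0, 1]")
    return arrest_share / population_share


# Hypothetical officer X: 95% of his arrests are Hispanic residents.
raw_share = 0.95

# Same raw share, two made-up communities.
in_mostly_hispanic_area = disparity_ratio(raw_share, 0.93)
in_mixed_area = disparity_ratio(raw_share, 0.40)

print(round(in_mostly_hispanic_area, 2))
print(round(in_mixed_area, 2))
```

In the first made-up community the ratio sits close to 1, so the 95% figure is unremarkable. In the second it is more than double the population share, which would at least justify a closer look.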

How to Spot a Bad Cop

Cities are not our only concern in issues of racial injustice, but they are where the vast majority of incidents occur and they offer a convenient way for us to look at the data.

Police Beats

Example of a Police Beat Map — Irving, Texas

Police departments refer to geographic areas of patrol as “beats”. The map referenced above, for example, shows some of the police beats in Irving, Texas. Beats are typically small parts of neighborhoods. While not homogeneous in racial composition, they are at least small, economically consistent, specific geographical areas with different police officers patrolling them.

A bad cop’s police activity patterns should be noticeably different from a good cop’s patterns in the same beat. If Bad Cop Johnny is racist against blacks, then compared with fellow officers patrolling the same beat, Johnny should be pulling more blacks over, charging them more often with common crimes, having more police complaints filed against him, etc. In short, Johnny should have a fingerprint of racism which is evident well before Johnny loses his temper and kills someone.
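One way such a fingerprint might be searched for is sketched below, under loud assumptions. The record layout `(officer, beat, race)`, the function name `peer_z_scores`, and the sample data are all hypothetical. The comparison used is a plain two-proportion z-test of each officer against the pooled stops of the other officers in the same beat. That is my choice for illustration, not an established method for this problem, and a high score is a reason to investigate rather than proof of anything.

```python
from collections import defaultdict
from math import sqrt


def peer_z_scores(stops, group):
    """For each (beat, officer), compare the share of stops involving
    `group` with the pooled share of all other officers in that beat,
    using a two-proportion z-test. Returns {(beat, officer): z}."""
    # (beat, officer) -> [stops involving group, total stops]
    counts = defaultdict(lambda: [0, 0])
    for officer, beat, race in stops:
        entry = counts[(beat, officer)]
        entry[1] += 1
        if race == group:
            entry[0] += 1

    # beat -> [stops involving group, total stops] across all officers
    beat_totals = defaultdict(lambda: [0, 0])
    for (beat, _), (hits, total) in counts.items():
        beat_totals[beat][0] += hits
        beat_totals[beat][1] += total

    scores = {}
    for (beat, officer), (hits, total) in counts.items():
        peer_hits = beat_totals[beat][0] - hits
        peer_total = beat_totals[beat][1] - total
        if peer_total == 0:
            continue  # no fellow officers in this beat to compare with
        pooled = (hits + peer_hits) / (total + peer_total)
        se = sqrt(pooled * (1 - pooled) * (1 / total + 1 / peer_total))
        if se == 0:
            continue  # everyone has an identical 0% or 100% share
        scores[(beat, officer)] = (hits / total - peer_hits / peer_total) / se
    return scores


# Made-up stops for one beat: Johnny stops black drivers 80 times in 100,
# while his two peers do so 40 times in 100 each.
sample = (
    [("johnny", "B1", "black")] * 80 + [("johnny", "B1", "white")] * 20
    + [("alice", "B1", "black")] * 40 + [("alice", "B1", "white")] * 60
    + [("bob", "B1", "black")] * 40 + [("bob", "B1", "white")] * 60
)
scores = peer_z_scores(sample, "black")
for key in sorted(scores):
    print(key, round(scores[key], 2))
```

Comparing only within a beat is the point of the design: it holds the neighborhood's demographics and economics roughly constant, so what is left is the difference between officers. Real data would also need to account for shift times, assignment types, and small sample sizes.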

To Be Continued

In the coming weeks, I will dive into this and publish my own findings after examining a few datasets. Until then, I would love to hear the opinions of other data scientists or interested readers about a creative solution to this complex problem.

I ferret out things that interest me and then I write about them with fervor. Love me.
