Police Data Trust


Data on police misconduct is highly decentralized. Often it does not exist at all. When it exists, it may be unavailable to the public. When it is available, it may be obtainable only through time-consuming and expensive freedom of information requests, and the responses to those requests go only to the requester, not to the public at large.
Among the datasets that are public, sources reporting on police misconduct often differ from one another, even when they report on the same thing. Here are just a few of the problems:
The data being collected may differ: some sources report on killings by police officers; others report on officially filed complaints.
The method of collection may differ: some data sources aggregate media articles; others draw on complaints officially filed with a police department; others collect their own data submitted by citizens.
The schemas of the data may differ: there are many ways to record the events that occur within a single police interaction with a member of the public, and data aggregators pick and choose the data they consider salient, in the format they prefer.
An example: complaint filing systems differ from department to department. Many complaint systems use encodings such as “improper use of force” or “illegal search,” and these encodings may differ from one department to the next.
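To make the encoding problem concrete, here is a minimal sketch of how department-specific complaint codes could be mapped onto one shared vocabulary. The department names (`dept_a`, `dept_b`), the raw codes, and the `normalize_allegation` helper are all hypothetical and chosen only for illustration; they do not describe any real department's system or this project's actual implementation.

```python
from typing import Optional

# Hypothetical per-department code tables (assumption: invented for illustration).
# Each maps a department's own encoding to a shared category name.
DEPARTMENT_CODES = {
    "dept_a": {
        "UOF-1": "improper use of force",
        "SRCH-2": "illegal search",
    },
    "dept_b": {
        "Excessive Force": "improper use of force",
        "Unlawful Search": "illegal search",
    },
}


def normalize_allegation(department: str, raw_code: str) -> Optional[str]:
    """Return the shared category for a department-specific code.

    Returns None when the department or the code is not in the mapping,
    so unmapped values can be flagged for review instead of guessed.
    """
    table = DEPARTMENT_CODES.get(department, {})
    return table.get(raw_code.strip())
```

With a table like this, `normalize_allegation("dept_a", "UOF-1")` and `normalize_allegation("dept_b", "Excessive Force")` resolve to the same category, while an unknown code returns `None` rather than being silently assigned.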

The Police Data Trust is a project through Code For Boston with various partner organizations, namely The Tubman Project and Code for San Jose. The project aims to solve this problem of data scattered across multiple places, reporting on different incidents in different ways.
Our end goal is a web application with a public-facing API that members of the public can query for information drawn from various sources. The web application would also ingest data from these sources in a consistent way. Because the project would connect many stakeholders, the specification design and its live implementation are major parts of the end product rather than mere implementation details.
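As a rough illustration of what consistent ingestion could look like, the sketch below converts rows from two differently shaped sources into one common record type and then filters the result. Everything here is an assumption made for the example: the `Incident` fields, the source names, the input column names (`date`, `agency`, `headline`, `incident_date`, `dept_name`, `allegation`), and the `query_by_department` helper are hypothetical and are not the project's actual specification or API.

```python
from dataclasses import dataclass
from datetime import date
from typing import Iterable, List, Optional


@dataclass(frozen=True)
class Incident:
    """Hypothetical common record shared by all ingested sources."""
    source: str
    occurred_on: Optional[date]
    department: str
    description: str


def from_media_row(row: dict) -> Incident:
    # Assumed shape of a media-aggregator row: date, agency, headline.
    return Incident(
        source="media_aggregator",
        occurred_on=date.fromisoformat(row["date"]),
        department=row["agency"],
        description=row["headline"],
    )


def from_complaint_row(row: dict) -> Incident:
    # Assumed shape of a complaint record; the incident date may be missing.
    raw_date = row.get("incident_date")
    return Incident(
        source="complaint_records",
        occurred_on=date.fromisoformat(raw_date) if raw_date else None,
        department=row["dept_name"],
        description=row["allegation"],
    )


def query_by_department(incidents: Iterable[Incident], department: str) -> List[Incident]:
    """Return incidents whose department matches, ignoring case."""
    wanted = department.casefold()
    return [i for i in incidents if i.department.casefold() == wanted]
```

Keeping one small adapter per source means a new dataset only needs its own conversion function, while anything that queries the data works against the single shared record type.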