This Transparency Project Is Creating a Massive Collection of Police Data
America's police data is a fractured mess of different departments and standards. The Police Data Accessibility Project wants to change that.
Police departments generate an incredible amount of data, as interactions with cops are recorded, collated, and processed. In most states that data is technically available, but anyone who’s tried to get information out of a local police department knows how hard it can be to access that data.
The Police Data Accessibility Project (PDAP) wants to change that. The non-profit is developing software that will scrape publicly available data on America’s 18,000 police organizations from programs such as Palm Beach’s eCaseview, a court records search tool that is free for the public. The goal is to consolidate this data and make it easy for the public to access.
PDAP started 10 months ago when Kristin Tynski scraped the public records of Palm Beach County, Florida, where she lives. She wanted to see if she could use the data to identify trends in policing, and give researchers and journalists a transparent view of the cops.
What Tynski found when she ran scrapers through her own county was shocking, but not surprising. In Palm Beach County, which is nearly 75 percent white according to census data, two officers gave 75 percent of their citations to non-white people, she found. About 70 percent of all violations written for having windows tinted too dark were given to minorities. Revealing these trends in big piles of police data is PDAP's mission.
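As a rough illustration of the kind of tally behind figures like these, the minimal sketch below computes, for each officer, the share of citations issued to non-white people. It is not Tynski's actual analysis: the function name, the `(badge, race)` record shape, the `"W"` race code, and the sample rows are all assumptions made up for this example.

```python
from collections import defaultdict

def nonwhite_share_by_officer(citations):
    """Return {badge: fraction of citations given to non-white people}.

    `citations` is an iterable of (badge, race) pairs. The "W" code for
    white is a hypothetical convention; real county data may differ.
    Rows with no recorded race are skipped rather than guessed at.
    """
    totals = defaultdict(int)
    nonwhite = defaultdict(int)
    for badge, race in citations:
        if race is None:
            continue
        totals[badge] += 1
        if race != "W":
            nonwhite[badge] += 1
    return {badge: nonwhite[badge] / totals[badge] for badge in totals}

# Invented sample rows, for illustration only.
sample = [
    ("A", "W"), ("A", "B"), ("A", "H"), ("A", "B"),
    ("B", "W"), ("B", None),
]
shares = nonwhite_share_by_officer(sample)
```

A share like this only means something next to a baseline, such as the county's demographics from census data, which is the comparison the article draws.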
She visualized some of the data, put it up on the internet, and then linked to it on Reddit. “I posted about this project on Reddit, and it blew up,” Tynski told Motherboard in a DM. “I wanted to capture the momentum, because I know how much work trying to get this for most police depts will be. So, I put together a Slack community and started trying to organize.”
In the first few months, PDAP had a huge influx of people. Thousands volunteered, but Tynski said it didn’t last. “The initial huge influx of people died off, but those that remained have been working diligently for 8 months,” she said.
PDAP’s long-term plan is to scrape data from police databases, store it somewhere secure, standardize the data, and then give the public access. Right now, PDAP is still building the programs it needs to collect data.
“There are problems of missing data, inconsistencies, and human error. In general, the data we care most about, citations/arrests, arrestee demographic info, individual officer/badge number, metadata like time/location, is available for most that we have looked at,” Tynski said. “There are some more detailed fields that seem to be unique to certain departments. Cleaning, validating, and standardizing incoming community contributed data will be part of what PDAP is responsible for.”
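The cleaning and standardizing step Tynski describes could look something like the sketch below, which maps one county's raw column names onto a shared record and tolerates missing or malformed values. PDAP's real schema and code are not described in this article, so the `Citation` fields, the `FIELD_MAP` column names, and the date formats here are all hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional

@dataclass
class Citation:
    # Hypothetical shared schema covering the fields named in the quote.
    officer_badge: Optional[str] = None
    race: Optional[str] = None
    violation: Optional[str] = None
    issued_at: Optional[datetime] = None
    location: Optional[str] = None

# Hypothetical column names for one county's export.
FIELD_MAP = {
    "BadgeNo": "officer_badge",
    "DefRace": "race",
    "Charge": "violation",
    "IssueDate": "issued_at",
    "Loc": "location",
}

# Assumed date layouts; each department may need its own list.
DATE_FORMATS = ("%m/%d/%Y %H:%M", "%Y-%m-%d %H:%M:%S", "%m/%d/%Y")

def clean(value):
    """Strip whitespace and turn empty values into None."""
    if value is None:
        return None
    text = str(value).strip()
    return text or None

def parse_date(raw):
    """Try each known format; return None instead of raising on bad input."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(raw, fmt)
        except ValueError:
            continue
    return None

def standardize(raw_row, field_map=FIELD_MAP):
    """Convert one raw row (a dict) into a Citation in the shared schema."""
    fields = {}
    for source_name, target_name in field_map.items():
        fields[target_name] = clean(raw_row.get(source_name))
    if fields.get("issued_at") is not None:
        fields["issued_at"] = parse_date(fields["issued_at"])
    if fields.get("race") is not None:
        fields["race"] = fields["race"].upper()
    return Citation(**fields)

record = standardize({
    "BadgeNo": " 1234 ",
    "DefRace": "w",
    "Charge": "",
    "IssueDate": "03/05/2020 14:30",
    "Loc": "Main St",
})
```

Keeping a separate field map per department is one way to absorb the department-specific fields Tynski mentions without changing the shared record.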
As more people stepped up to handle the process of scraping public records, Tynski stepped back and focused on her strength: getting the word out.
She’s happy for all the help. “I realized how big the project was while doing the scraping of my own county,” Tynski said. “It's complex, and most county systems are antiquated and filled with bugs. Getting this done, outside of some top-down government mandate, will require writing scrapers for hundreds, possibly thousands, of county websites. I would argue, it's worth the effort, and something we can get done.”
PDAP isn’t the first attempt to create a centralized database of police data. The FBI started a use-of-force database in 2018, but participation is voluntary. The Washington Post started tracking every fatal police shooting in 2015. According to the Post, there have been 988 in the past 12 months. The PDAP, once completed, would give a much wider view of how the police interact with the public.
There’s still a lot of work to do. “We are missing most of the data, we've written scrapers for a few counties in Florida,” Tynski said. “Until now we've been holding off on recruiting for, or pushing, scraping.”
She said that scraping this much data is still an emerging area of law and so the group has secured a good law firm before proceeding. “Scraping is a newish area of law and there isn’t a ton of precedent,” she said. A Canadian teen ran into trouble in 2018, for example, after scraping thousands of documents from the province of Nova Scotia's unsecured freedom of information document portal. He was charged with unauthorized access, but the charges were eventually dropped.
Tynski said she believes PDAP’s role is to collect the data, not analyze it. She leaves that to others. “We see that as the role of citizen journalists and the media,” she said. “Our goal is to facilitate accessibility to these groups, and give them the tools necessary to parse the data in useful ways.”