-
Lol looks like this is based off the police blotter in Chicago. So, if you've ever been a local journalist and are familiar with the police blotter system, you're going to start identifying significant problems with this approach immediately.
-
Most precincts don't make blotter information easily available, and it's rarely in any sort of digital format. Event logs like these are self-reported by cops, so :/. It's unclear which events this project is using, but it's also very notable that blotter entries come before any formal charges.
-
Chicago is apparently turning these into some sort of finished product over a week, which might help with some of these issues. But if it is based off blotters, what it's really doing (to the extent it's very tenuously theoretically doing anything) is predicting police activity...
-
They're not saying how they've verified its supposed predictions, but it sure seems like it's against the same event logs, in which case there's no ability to account for misrepresented events or crimes, or sketchy shit that sometimes ends up altered or removed from the blotter.
-
For example, when we worked with the precinct that covered my college campus, we'd often find a variety of arrests and crimes would never make it to the blotter because they didn't 'want to ruin kids' lives', so guess what type of incidents would most often go unrecorded?...
-
On the flip side, NYPD is facing allegations that cops are making up crimes to make arrests at the end of their shifts so they can earn overtime doing paperwork. All of those would end up in the log, even if they resolved to nothing. But it's a great and easily predictable pattern for ML...
-
This all reflects the biggest and most common error that would-be crime stoppers with PhDs in computer science and no other qualifications relevant to crime or politics make: police activity is *not* crime activity...
-
Blotters make for a poor data source because the input is very possibly untrustworthy and the data itself (even after a week) is *unfinished*: a lot of work remains before an entry is complete and anyone can be sure a crime actually happened.
-
A lot of newspapers used to publish blotters and police logs of that sort, and they've stopped because of accuracy issues and because the logs often named people in connection with a crime who were much later found to have been wrongly arrested.
-
None of this accounts for the crimes that the police simply never engage with: the calls that don't get a response, or the crimes that don't get reported for numerous reasons. The result will simply reinforce whatever decisions cops make around deployment...
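
To make that feedback loop concrete, here's a toy sketch. Everything in it is made up for illustration (the `simulate` function, the two districts, the numbers), and it is not how the Chicago project works, since they haven't said. The assumption: both districts have the same true crime rate, the one patrol goes each day to wherever the log shows the most incidents, and only the patrolled district logs anything.

```python
def simulate(days, true_rate, initial_records):
    """Toy model: records follow deployment, not underlying crime.

    Assumes every district has the same true_rate of incidents per day,
    but an incident is only logged if the patrol is in that district.
    """
    records = list(initial_records)
    for _ in range(days):
        # "Prediction": the hotspot is wherever the log is already largest.
        target = records.index(max(records))
        # Only the patrolled district adds to its log.
        records[target] += true_rate
    return records


records = simulate(days=30, true_rate=2, initial_records=[6, 5])
print(records)  # [66, 5]
```

District 0 started with one extra log entry and ends up looking 13x worse, even though the two districts are identical by construction. Score the "predictions" against the same log and they look great.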
-
& those are only the problems I've come up with in the first 30m of thinking about this. That said, you know what this sort of predictive model would be great for? Avoiding cop malfeasance as a citizen. Of course, we the people wouldn't ever get access tho.
-
If you are designing algorithms or machine learning to help the cops: maybe just don't.