Saturday 9 January 2010

Analytics X-Prize, Part 1

I came across the Analytics X-Prize while reading Hacker News last week. The challenge is to predict the distribution of homicides in the city of Philadelphia over the course of 2010. It did cross my mind to wonder if this is an FBI sting of some sort*. Assuming it's legit, though, I thought I'd give it a go. Not being American, and knowing nothing about crime beyond what I've seen on The Wire, will make it an interesting case study.

The whole thing has a slightly macabre aspect I suppose, and reminds me a little of the ill-fated Policy Analysis Market. Still, the stated intention is to drive development of tools which will ultimately save lives, so I guess we're alright ethically.

Okay then, first thoughts:
  • The number of homicides has decreased in recent years, from 406 in 2006 to 305 in 2009. This could be for many reasons, but it shows that the rate changes over time, which implies that historical data may not be a reliable indicator of future trends.
  • Narrowly defined, the brief is to predict the number of homicides in 47 zip-code districts. The numbers per district are likely to be low, so small-number (finite-size) effects may be significant.
  • More widely defined, the brief is to probe the causal factors of homicide. There are a few data sources suggested in the X-Prize forum, but at the moment there isn't any geo-coded data available. More than that, though, I need data about the socio-economics of Philadelphia.
  • I need to do some serious background reading.
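
To put rough numbers on the first two points, here is a back-of-envelope sketch in Python. It rests on two assumptions that are mine and not part of the challenge: that the 2009 total is spread evenly over the 47 districts (it certainly isn't in reality), and that the count in each district follows a Poisson distribution. The function names are made up for illustration.

```python
import math

# Figures quoted in the post
TOTAL_2006 = 406
TOTAL_2009 = 305
N_DISTRICTS = 47


def poisson_summary(total, n_districts):
    """Mean, standard deviation and relative noise per district.

    ASSUMPTION: the total is split evenly across districts and each
    district's count is Poisson distributed, so variance == mean.
    """
    mean = total / n_districts
    std = math.sqrt(mean)
    return mean, std, std / mean


def poisson_pmf(k, lam):
    """Probability of observing exactly k events when the mean is lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


# Fractional drop in the city-wide total between 2006 and 2009
decline = (TOTAL_2006 - TOTAL_2009) / TOTAL_2006

mean, std, rel = poisson_summary(TOTAL_2009, N_DISTRICTS)

print(f"Decline 2006 to 2009: {decline:.1%}")
print(f"Mean per district:    {mean:.2f}")
print(f"Poisson std dev:      {std:.2f} ({rel:.0%} of the mean)")

# Chance that an 'average' district sees three or fewer homicides in a year
p_low = sum(poisson_pmf(k, mean) for k in range(4))
print(f"P(count <= 3):        {p_low:.1%}")
```

Under these assumptions the average district sees about six or seven homicides a year, with random scatter of roughly 40% of that figure, so even a perfect model of the underlying rates would miss individual district counts by a wide margin. Real districts differ enormously, which makes the low-count ones noisier still.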
* There's also a schlock thriller waiting to be written featuring a killer nerd stalking the streets of a major American city "adjusting" the murder rate to fit his model...
