Sunday, February 07, 2010

fun with statistics

So mainline British political blogging is a horrible reality TV show, a eurobox estate crammed with CCTV cameras everyone welcomes and plays up to for the amusement of a tiny audience of media wankers and professional partisans. How about a bit of Viktorfeed data visualisation? Last weekend I wrote a little program to generate statistics from the database of flights, only really remarkable for being the first time I've used itertools.groupby() to solve a practical problem. You can get the weekly operations of every airline name that accounts for at least 1% of total activity here (CSV file, currently about 9.8KB). It's updated every eight hours. Interestingly, I note that setting a cut-off as a percentage of the total screens out essentially all the false positives.
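For the curious, the counting step can be sketched roughly like this. It is a minimal illustration rather than the actual program: the `(date, airline_name)` tuple shape and the function name `weekly_operations` are assumptions, as the real database schema isn't shown here. The one thing to remember with `itertools.groupby()` is that it only merges adjacent items, so the rows have to be sorted by the grouping key first.

```python
from collections import Counter
from datetime import date
from itertools import groupby


def weekly_operations(flights, cutoff=0.01):
    """Count flights per (ISO week, airline), keeping only airlines whose
    share of all flights is at least `cutoff`.

    `flights` is an iterable of (date, airline_name) tuples; that shape is
    an assumption made for this sketch.
    """
    flights = list(flights)

    # Work out which airline names clear the percentage cut-off.
    totals = Counter(airline for _, airline in flights)
    grand_total = sum(totals.values())
    keep = {name for name, n in totals.items() if n >= cutoff * grand_total}

    def group_key(row):
        iso = row[0].isocalendar()
        return ((iso[0], iso[1]), row[1])  # ((ISO year, ISO week), airline)

    # groupby only groups adjacent rows, so sort by the same key first.
    rows = sorted((r for r in flights if r[1] in keep), key=group_key)

    return {key: sum(1 for _ in group) for key, group in groupby(rows, key=group_key)}
```

Called on a list of such tuples, it returns a dictionary keyed by `((year, week), airline)`, which is easy enough to dump out as CSV.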

Obviously, I threw the file at IBM ManyEyes:

As soon as I work out what my username for the Wikified version of that site is, I'll do one that loads the data dynamically.

A couple of points: There's been a big decline in no-name movements, and some operators have cashed in their chips entirely, notably BGIA. The apparent jump in traffic in early 2009 foxes me slightly; we lost a few weeks' data in the spring, but that shouldn't explain it. I suspect it's an artefact of the filtering by percentage; some operators that accounted for significant traffic at the time, but shut down when the Antonov 12s were expelled, may not have made the cut.
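One way to test that suspicion would be to measure, week by week, how much traffic the cut-off throws away. The following is only a sketch under assumptions: the `(week, airline)` pair shape and the function name `excluded_share_by_week` are made up for illustration. If the excluded share is noticeably higher in the weeks before the jump than after it, the filter is a plausible culprit.

```python
from collections import Counter


def excluded_share_by_week(rows, cutoff=0.01):
    """For each week, return the fraction of flights belonging to airlines
    that fall below the overall percentage cut-off.

    `rows` is an iterable of (week, airline) pairs; that shape is an
    assumption made for this sketch.
    """
    rows = list(rows)

    # Airlines that clear the cut-off over the whole data set.
    per_airline = Counter(airline for _, airline in rows)
    total = len(rows)
    kept = {a for a, n in per_airline.items() if n >= cutoff * total}

    # Flights per week, before filtering and among the dropped airlines.
    all_weeks = Counter(week for week, _ in rows)
    dropped = Counter(week for week, airline in rows if airline not in kept)

    return {week: dropped[week] / all_weeks[week] for week in sorted(all_weeks)}
```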

I'll do something similar for destinations and total activity.
