One of the speakers at InfoWorld's SOA Executive Forum in New York last fall was Dan Thomas, director of the DCStat program in Washington's Office of the CTO. Earlier this month, he alerted me to a remarkable development. Starting in mid-June, the District of Columbia would begin releasing operational data from a variety of city agencies to the Internet in several XML formats, including RSS and Atom.
"Our expectation is that it will spawn mashups, analysis, and who knows what ripple effects," Thomas wrote. "We also expect it will motivate government agencies to seek and sustain high levels of performance."
On June 12 the first of the feeds -- data on the disposition of service requests received by the Mayor's Call Center and the online Service Request Center -- was quietly launched at the Center for Innovation and Reform (http://cir.oca.dc.gov/cir/). I immediately grabbed the data, and in a few hours I had cobbled together a proof-of-concept mashup that displays requests related to street repaving and gutter repair on a map of the District of Columbia. If you've ever visited Adrian Holovaty's award-winning ChicagoCrime.org, you can see what this might mean for Washington.
Here's a critical difference, though. Holovaty had to devote a considerable amount of effort to screen scraping the Chicago Police Department's Citizen ICAM Web site in order to extract the data -- and still more effort to geocode it. I'm sure that while he was writing that screen scraper he was mentally screaming: "Just give me the data!"
DCStat is doing just that. The Atom and RSS feeds summarize activity, and all the details -- including latitude and longitude -- are included in DCStat's own XML format. Following the initial launch of the service request feed, new ones will appear at roughly two-week intervals throughout the summer and fall. These feeds will contain raw operational data about crime, property, housing code enforcement, and business and liquor licensing.
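To make the contrast with screen scraping concrete, here is a minimal sketch of what consuming such a feed might look like in Python. The element names (`serviceRequests`, `request`, `type`, `latitude`, `longitude`), the sample records, and the helper `extract_map_points` are all hypothetical; DCStat's actual XML format is not reproduced here, so treat this as an illustration of the idea rather than working code against the real feed.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data: the element names and values below are
# assumptions for illustration, not DCStat's actual schema or records.
SAMPLE_XML = """\
<serviceRequests>
  <request>
    <type>Street Repaving</type>
    <latitude>38.9072</latitude>
    <longitude>-77.0369</longitude>
  </request>
  <request>
    <type>Bulk Trash Collection</type>
    <latitude>38.9101</latitude>
    <longitude>-77.0147</longitude>
  </request>
  <request>
    <type>Gutter Repair</type>
    <latitude>38.8951</latitude>
    <longitude>-77.0364</longitude>
  </request>
</serviceRequests>
"""

# Request types to plot, compared case-insensitively.
WANTED_TYPES = {"street repaving", "gutter repair"}


def extract_map_points(xml_text, wanted_types=WANTED_TYPES):
    """Return (type, latitude, longitude) tuples for the wanted request types."""
    root = ET.fromstring(xml_text)
    points = []
    for request in root.iter("request"):
        request_type = (request.findtext("type") or "").strip()
        if request_type.lower() not in wanted_types:
            continue
        lat_text = request.findtext("latitude")
        lon_text = request.findtext("longitude")
        if lat_text is None or lon_text is None:
            continue  # skip records that carry no coordinates
        try:
            points.append((request_type, float(lat_text), float(lon_text)))
        except ValueError:
            continue  # skip records whose coordinates are not numeric
    return points


if __name__ == "__main__":
    for point in extract_map_points(SAMPLE_XML):
        print(point)
```

Because the coordinates arrive with each record, the filtered tuples can be handed straight to whatever mapping API you prefer, with no scraping or geocoding step in between.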
There are some loose ends. Although it's true that XML is a self-describing format, there is as yet no documentation to guide developers who want to build applications on top of the data, or analysts who want to interpret it. And because the first monthly cycle isn't yet complete, it's not obvious how to mesh daily, weekly, and monthly dumps. I expect these questions will be resolved soon, though.
From one perspective this is a great SOA success story. In an e-mail to me describing the DCStat architecture, Dan Thomas mentioned many of the buzzwords familiar to cognoscenti: EAI, ETL (extraction, transformation, and loading), GIS (Geographic Information System), ESB, XML, RSS. But these are all just means to an end. And in this case, it's a particularly inspiring end: government services that are open and accountable, the performance of which can be measured.
For my Friday podcast last week, I spoke with Dan Thomas and District of Columbia CTO Suzanne Peck, one of InfoWorld's 2005 CTO 25 awardees, about the potential impact of this open government initiative. They told me that the key enablers were a bunch of three-letter-acronym technologies that have recently matured, and that's true. On top of all that complex infrastructure, however, there's a simple yet profound idea. Government is us, and its data is our data. Reflect it back to us, and good things will happen.