Tag Archives: open data

The age of housing in the United States

The Census Bureau recently released the results of the 2011 American Housing Survey. One point that stood out to me was how old housing in the U.S. is. The data shows a rough bell curve peaking between 1950 and 1979. The median year built was 1974, making the median home 37 years old at the time of the survey. This number has crept up since 1985, when the median house age was only 23, according to the National Association of Home Builders.

[Chart: age of houses in the United States]

For more analysis on this issue, check out this oldhouseweb.com article from a few years ago. It shows which states had the highest concentration of old housing at that time. A surprising amount of it is in the Midwest, with Michigan, Illinois, Wisconsin, Indiana, and Ohio all in the top ten.

The varying costs of medical services by state

Earlier this summer, the Centers for Medicare & Medicaid Services published interesting data on the charges submitted by hospitals in all 50 states for 30 different outpatient services. The agency also published the amount ultimately paid for the services, and similar data for 100 different inpatient services.

As an example, I decided to create a visualization showing the varying average costs of one type of outpatient service, “Level II Cardiac Imaging,” by state (except for Maryland, which wasn’t included in the dataset). Keep in mind as you look at these charges that Medicare paid out a national average of $744.58 for the procedure, even though the national average submitted charge was more than $4,000. After clicking the image below, you can see exact amounts by hovering over a particular state:


As you can see, the amounts charged vary greatly. In my view, these differences confirm at least two things. First, this data shows just how easy it might be to submit overcharges, given the wide discrepancies, and thus why the Obama administration saw a pressing need to crack down on Medicaid and Medicare fraud.

This data also unmasks America’s broken system for pricing medical procedures. Wired ran a great piece last year about this problem. The article noted that “a recent study of the costs of routine appendectomies performed throughout California” showed that, for nearly the same procedure, “the charges varied more than 100-fold—from $1,529 at the cheapest to $182,955 at the most expensive.” The article concluded that “Job One” was “transparency in treatment, cost, and institutions.” This data release is not perfect transparency, but it’s a good start.

Apparently, one reason for the price differences may be the migration of these services from doctors’ offices to hospital outpatient departments, which often charge more. Complicated, eh?

Legal aspects of big data

“Big Data,” the business-world buzzword for the collection and analysis of massive amounts of data, has caught on with local government officials in the past few years, as many cities have developed extensive data portals providing citizens access to heaps of public information, like data from 311 calls. And it’s not only local governments getting involved: in 2012 the White House announced the “Big Data Research and Development Initiative,” through which federal agencies would commit funding toward collecting and analyzing “huge volumes of digital data.”

So, what sparked this interest in “Big Data”? In short, innovations in computing, particularly the ability to allow users to remotely access large data sets stored on third-party servers, i.e., “cloud computing.” But as attorney John Pavolotsky wrote last November in Business Law Today, “[w]hile business publications have written widely about Big Data, legal commentators have written sparingly on the subject.”

Pavolotsky goes on to note three areas of legal concern he sees with Big Data: privacy, data security, and intellectual-property rights. He does not dwell long on data security and IP rights, other than to note that API licenses should be reviewed carefully to determine the permissible scope of data distribution. He focuses instead on privacy, arguing that, because of the “inherent squishness” of the legal standard applied to collection of cellphone or GPS data under the Fourth Amendment (which protects people’s “reasonable expectation of privacy”), perhaps legislatures should limit the length of time data can be stored.

I’d like to add one other interesting question surrounding Big Data, though it leans more economic than legal: whether data collection should remain public or be privatized. This issue comes up with vacant-property registration, which I’ve written about before, as many local governments allow registration through MERS, a private company, rather than directly through local data systems. Government officials are then provided access to MERS.

Privatization of data collection and analysis provides many benefits, particularly in that it is cost-effective for local government to take advantage of an already developed platform. The primary drawback, however, is the risk of industry capture, as with MERS and its association with the mortgage industry.

The best solution, when available, is for local governments to take advantage of open-source programs or nonprofit developers (as available through Code for America). Otherwise, there are companies like Socrata that, as far as I can tell, are not closely associated with any industry other than the cloud-computing and data-collection industry.

Interactive Map of Chicago Crime, Ward by Ward

As a follow-up to yesterday’s post about the SimCity zoning map of Chicago, I’d like to point out that the map is one of a collection of interactive apps developed by Open City, based largely on information from Chicago’s data portal. Two other apps from the project that I find really interesting are the Crime in Chicago app, which lets you compare crime ward by ward and breaks down the most frequent crimes and even the time of day they were committed, and the How’s Business app, which gives a snapshot view of economic indicators pulled from various sources. These are the types of innovative apps that show the powerful potential of open data.

[Image: Statistics on Crime in Chicago Wards]

Cook County webcast this Friday on new Socrata Data Portal

Here’s an exciting announcement for those of us in Cook County. The County is following the City of Chicago’s lead to create a state-of-the-art data portal at data.cookcountyil.gov. I’m particularly interested in the “courts” data. Here’s a press release for a webcast about the portal this Friday (and a couple of other events that are part of Cook County’s “Open Data Week”).

On Friday, April 27th, the County, the State of Illinois, and Socrata, a data archive company, will live stream a webcast on the new regional data portal, MetroChicagoData.org (http://metrochicagodata.org). During the broadcast, viewers will learn more about how the portal works, how to use data found there, and what the goals of the County and State are going forward.

The County will also release new and updated datasets.

The County is also partnering with global Big Data Week (http://bigdataweek.com) for an international angle to local data.  Big Data is an emerging data science that allows organizations to analyze very large datasets, find patterns, create predictive models, and help understand more about the vast amounts of data generated by the public.  During Big Data Week, Cook County is hosting a webinar showcasing the projects and achievements of local Big Data developers on April 27th, and co-sponsoring a hackathon competition on April 28th.

Urbanflow’s view of a touchscreen-filled data-driven city – Video Wednesday

This week’s video is a fun one; it’s a concept video about Urbanflow, a collaboration between the NYC-based design company Urbanscale and the Finnish designers Nordkapp. They want touchscreens everywhere in Helsinki, so the public can access maps and transit info, interact with each other, and report municipal concerns like potholes. John Pavlus has raised some concerns about their plan (you can read his view here), but for now, just enjoy their beautiful visualization of the data-driven city of the future (after the jump).


Are cities America’s greatest laboratories of government innovation?

In 1932 Supreme Court Justice Louis Brandeis famously wrote, “It is one of the happy incidents of the federal system that a single courageous State may, if its citizens choose, serve as a laboratory; and try novel social and economic experiments without risk to the rest of the country.” And indeed, the States remain a powerful laboratory, able to innovate constrained only by the U.S. Constitution.

But the quote applies ever more to cities. Although they sometimes lack deep pockets, and their authority to regulate is derived from the state, they offer many advantages over state and federal government. I’ll list four. First, as Benjamin Barber recently argued, cities tend to be more pragmatic and less ideological than other levels of government, meaning real innovation gets done in a practical way. Second, as Edward Glaeser points out in “Triumph of the City,” cities are smaller geographically and dense with human potential, which accelerates the spread of ideas. Third, comparatively minimal bureaucracy allows cities to respond quickly to changing technology. Fourth, if an initiative fails, it doesn’t affect a whole state (or the whole country).

Put differently, cities are the tech start-ups of government; the federal government is Microsoft. As Arianna Huffington recently wrote, “It’s our cities, not the nation’s capital, that are the real idea factory of our country.”

I’ll give two examples.

First, one many people are familiar with: open-data initiatives. In 2007 Vivek Kundra, then Assistant Secretary of Commerce and Technology for Virginia, became Washington, D.C.’s Chief Technology Officer. He created the D.C. Data Catalog, making government data available for open-source application development. He also instituted an app contest, using a pot of money to crowdsource innovation. When Obama became president, he drafted Kundra as federal Chief Information Officer, and in that role Kundra launched data.gov, an initiative to provide an accessible online catalog of data from federal agencies. The idea has spread like wildfire, with more and more cities creating open-data sites and sponsoring app contests. And the data.gov model has now caught on in more than 13 countries. The world has been changed, with momentum generated by a city that was willing to embrace new ideas and showcase on a small scale what would eventually become a worldwide movement.

Second, an example from an area of interest to me: vacant-property registration. In 2007 Chula Vista, California, enacted an ordinance that took a novel approach in the fight to maintain vacant properties. Rather than simply requiring property owners to register vacant property, the city required mortgage lenders to register property when it went into foreclosure and then to maintain the property to stringent code guidelines. Chula Vista’s code enforcement officer, Doug Leeper, was particularly vigilant, and during the program’s first year of operation, Chula Vista raised $77,000 in registration fees (at $70 per property) and imposed around $850,000 in administrative citations. The program was such a success that Leeper was called to testify before the House of Representatives’ Subcommittee on Domestic Policy in 2008 in the wake of the foreclosure crisis. Again, one city’s program spread like wildfire. By my count, nearly 100 municipalities had enacted a similar ordinance by 2009, and that number has continued multiplying ever since. Recently, even Chicago modified its registration ordinance to target lenders.

This type of innovation comes only from local thinkers (and doers) living in communities, seeing local problems, and testing solutions in perhaps America’s greatest laboratories for government innovation—our cities.


Comparing effect of teacher scores and parent engagement on Chicago student math scores

Here’s a chart from Chicago open data about the progress report for Chicago Public Schools for the 2011 to 2012 school year.

Here’s what I did. I narrowed the list of schools down to 208, including only those that had data in these three categories: ISAT (Illinois Standards Achievement Test) Exceeding Math %, parent engagement score, and teacher score. (I don’t know how the progress report scored teachers or parent engagement.) I then charted the math achievement score against the parent-engagement and teacher scores, and added a trend line to each chart.

Here’s what it shows: Both parent engagement and high teacher scores are important to achievement; there is an upward trend in math scores for both variables. Teacher score, however, seems to have the stronger effect, according to this data. Never underestimate the power of a good teacher.

The trend with ISAT Exceeding Reading scores is substantially the same as the math-scores trend.
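For anyone who wants to try a comparison like the one above, here is a minimal sketch in Python of fitting a trend line and measuring how tightly the points follow it. I built the original charts in a charting tool, so this is not the method I used. The `trend_line` helper and the handful of school rows are made up for illustration; they are not the real progress-report numbers.

```python
from math import sqrt


def trend_line(xs, ys):
    """Return (slope, intercept, r) for the least-squares line through the points.

    r is the Pearson correlation coefficient: closer to 1 means the
    points follow the upward trend more tightly.
    """
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sxx == 0 or syy == 0:
        raise ValueError("values must not all be identical")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / sqrt(sxx * syy)
    return slope, intercept, r


# Placeholder rows, NOT real data: each school has a teacher score,
# a parent-engagement score, and an "ISAT Exceeding Math %" value.
schools = [
    {"teacher": 35, "parent": 48, "math": 8.0},
    {"teacher": 47, "parent": 44, "math": 15.5},
    {"teacher": 52, "parent": 55, "math": 19.0},
    {"teacher": 61, "parent": 50, "math": 27.5},
    {"teacher": 70, "parent": 58, "math": 33.0},
]

math_scores = [s["math"] for s in schools]
for variable in ("teacher", "parent"):
    slope, intercept, r = trend_line([s[variable] for s in schools], math_scores)
    print(f"{variable}: slope={slope:.2f}, intercept={intercept:.2f}, r={r:.2f}")
```

Comparing the two `r` values is one rough way to judge which score tracks math achievement more closely. Keep in mind that a trend line shows association, not proof that one thing causes the other.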


Featured Website: LOVELAND Technologies

LOVELAND Technologies is a neat project out of Detroit, Michigan, that is selling micro-lots of land in “microhoods” for $1 per square inch that people can track online. They focus on making these microhoods exciting by generating artsy urban-renewal projects. According to their website, they “aim to provide a fun, game-like ownership experience while creating entertainment fundraising, community collaboration, and social mapping tools that work at any scale.” They got started a few years ago through Kickstarter.

They have a few other projects. There are online mapping projects (in collaboration with Data Driven Detroit) and a “LoveTax” system, a creative way for people to fund projects. They also have a cool online app called “Why Don’t We Own This?” that tracks more than 40,000 vacant properties owned by the city, state, or county. The Huffington Post recently reported that this year’s Code for America fellows in Detroit will be building off the momentum that project has created. Overall, a great Detroit project to check out.

For more info, the founders gave a presentation at a TEDx conference in Detroit in 2010 that I’ve embedded after the jump.


How to misuse 311 data

The Bed-Bug (Cimex lectularius)

I read a recent article in the Chicago Reader about continuing bed-bug infestations that inspired me to comment on an easy-to-make mistake regarding 311 data. Here’s what the article says:

The City of Chicago’s Department of Buildings tracks the number of bedbug infestations reported through 311 calls, and reports [an upward trend].

The department started keeping a record in 2006; there were 25 calls that year, 50 the next, and 103 in 2008. Since then the number of calls has increased by roughly 100 each year, totaling 376 in 2011.

(This information was further highlighted in an infographic embedded in the article.)

Here’s the problem: The author seems to suggest that the increase in 311 bed-bug reports is evidence of an increasing bed-bug problem, but the total number of 311 calls per year fluctuates. So it’s impossible to know whether there are more bed bugs or whether, for some other reason, more people simply thought to call 311 in a given year. Perhaps a local TV station publicized 311 that year, thus driving up calls. I was unable to find reliable data on the total number of 311 calls for 2006 to 2011, but I know the numbers for 2008 (4,533,125) and 2009 (4,136,505), showing that the yearly call volume can vary by nearly 400,000 calls from one year to the next.

The better metric would be the change in the ratio of bed-bug reports to total 311 calls. At least that would account for the possibility that people were just using 311 more in general during a certain year. Based on my research, I still think there is an upward trend, though maybe not for 2010 to 2011, when the increase in bed-bug calls was only 76 calls.
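To make the ratio idea concrete, here is a minimal sketch in Python using only the figures quoted in this post. The function names are my own, and because the exact bed-bug counts for 2009 and 2010 and the total call volumes outside 2008 and 2009 aren't given here, the sketch only produces a rate for years where both numbers are known.

```python
# Figures quoted in the post. Years with missing numbers are left out
# rather than guessed.
TOTAL_311_CALLS = {2008: 4_533_125, 2009: 4_136_505}
BED_BUG_CALLS = {2006: 25, 2007: 50, 2008: 103, 2011: 376}


def reports_per_100k(bed_bug_calls, total_calls):
    """Bed-bug reports per 100,000 total 311 calls."""
    if total_calls <= 0:
        raise ValueError("total_calls must be positive")
    return bed_bug_calls / total_calls * 100_000


def normalized_rates(bed_bug_by_year, total_by_year):
    """Rate for every year in which both counts are known."""
    return {
        year: reports_per_100k(bed_bug_by_year[year], total_by_year[year])
        for year in sorted(bed_bug_by_year)
        if year in total_by_year
    }


# How much the overall call volume moved between the two known years.
call_swing = TOTAL_311_CALLS[2008] - TOTAL_311_CALLS[2009]
print(f"Change in total 311 calls, 2008 to 2009: {call_swing:,}")

rates = normalized_rates(BED_BUG_CALLS, TOTAL_311_CALLS)
for year, rate in rates.items():
    print(f"{year}: {rate:.2f} bed-bug reports per 100,000 311 calls")
```

With the numbers above, only 2008 has both counts, which is exactly the limitation: without total call volumes for the other years, the raw bed-bug counts can't be properly compared year to year.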

One resource I find particularly helpful on matters like this is Darrell Huff’s classic “How to Lie with Statistics,” which teaches the reader, in a fun and readable way, to be skeptical of how the media presents data. It should be required college reading.