Half empty or half full? Mixed reactions to Pew research on open data and open government

Yesterday, I wrote up 15 key insights from the Pew Internet and Life Project’s new research on the American public’s attitude towards open data and open government. If you missed it, what people think about government data and the potential impact of releasing it is heavily influenced by the prevailing low trust in government and their politics.

mixed-hopes-open-data-improve-pew

Media coverage of the survey reflected the skepticism of the reporters (“Most Americans don’t think government transparency matters a damn”) or of the public (“Who cares about open data” and “Americans not impressed by open government initiatives”). This photo by Pete Souza below might be an apt image for this feeling:

not-impressed-souza-obama

Other stories pulled out individual elements of the research (“Open data on criminals and teachers is a-okay, say most US citizens”), mixed results (“People Like U.S. Open Data Initiatives, But Think Government Could Do More” and “Sorry, open data: Americans just aren’t that into you”) or general doubts about an unfamiliar topic (“Many Americans Doubt Government Open Data Efforts”). At least one editor’s headline suggested that the results were an indictment of everything government does online: “Americans view government’s online services and public data sharing as a resounding ‘meh’.” Meh, indeed.

As usual, keep a salt shaker handy as you browse the headlines and read the original source. The research itself is more nuanced than those headlines suggest, as my interview with the lead researcher on the survey, John Horrigan, hopefully made clear.

Over at TechPresident, editor-in-chief Micah Sifry saw a glass half full:

  • Digging deeper into the Pew report, it’s interesting to find that beyond the “ardent optimists” (17% of adults) who embrace the benefit of open government data and use it often, and the “committed cynics” (20%) who use online government resources but think they aren’t improving government performance much, there’s a big group of “buoyant bystanders” (27%) who like the idea that open data can improve government’s performance but themselves aren’t using the internet much to engage with government. (Heads up Kate Krontiris, who’s been studying the “interested bystander.”)
  • It’s not clear how much of the bystander problem is also an access problem. According to a different new analysis done by the Pew Research Center, about five million American households with school-age children–nearly one in five–do not have high-speed internet access at home. This “broadband gap” is worst among households with incomes under $50,000 a year.

Reaction from foundations that have advocated, funded or otherwise supported open government data efforts went deeper. Writing for the Sunlight Foundation, communications director Gabriela Schneider viewed the survey results in a rosy (sun)light, pointing to public optimism about open government and open data.

People are optimistic that open data initiatives can make government more accountable. But, many surveyed by Pew are less sure open data will improve government performance. Relatedly, Americans have not quite engaged very deeply with government data to monitor performance, so it remains to be seen if changes in engagement will affect public attitudes.

That’s something we at Sunlight hope to positively affect, particularly as we make new inroads in setting new standards for how the federal government discloses its work online. And as Americans shift their attention away from Congress and more toward their own backyards, we know our newly expanded work as part of the What Works Cities initiative will better engage the public, make government more effective and improve people’s lives.

Jonathan Sotsky, director of strategy and assessment for the Knight Foundation, saw a trust conundrum for government in the results:

Undoubtedly, a greater focus is needed on explaining to the public how increasing the accessibility and utility of government data can drive accountability, improve government service delivery and even provide the grist for new startup businesses. The short-term conundrum government data initiatives face is that while they ultimately seek to increase government trustworthiness, they may struggle to gain traction because the present lack of trust in government undermines their perceived impact.

Steven Clift, the founder of e-democracy.org, views this survey as a wakeup call for open data advocates.

One reason I love services like CityGram, GovDelivery, etc. is that they deliver government information (often in a timely way) to the public based on their preferences/subscriptions. As someone who worked in “e-government” for the State of Minnesota, I think most people just want the “information” that matters to them and the public has no particular attachment to the idea of “open data” allowing third parties to innovate or make this data available. I view this survey as a huge wake up call to #opengov advocates on the #opendata side that the field needs to provide far more useful stuff to the general public and care a lot more about outreach and marketing to reach people with the good stuff already available.

Mark Headd, former chief data officer for the City of Philadelphia and current developer evangelist for Accela software, saw the results as a huge opportunity to win hearts and minds:

The modern open data and civic hacking movements were largely born out of the experience of cities. Washington DC, New York City and Chicago were among the first governments to actively recruit outside software developers to build solutions on top of their open data. And the first governments to partner with Code for America – and the majority over the life of the organization’s history – have been cities.

How do school closings impact individual neighborhoods? How do construction permit approvals change the character of communities? How is green space distributed across neighborhoods in a city? Where are vacant properties in a neighborhood – who owns them and are there opportunities for reuse?

These are all the kinds of questions we need people living and working in neighborhoods to help us answer. And we need more open data from local governments to do this.

If you see other blog posts or media coverage that’s not linked above, please let me know. I storified some reactions on Twitter but I’m certain that I missed conversations or opinions.

few-think-govt-data-sharing-effective-pew

There are two additional insights from Pew that I didn’t write about yesterday that are worth keeping in mind with respect to how Americans are thinking about the release of public data back to the public. First, it’s unclear whether the public realizes they’re using apps and services built upon government data, despite sizable majorities doing so.

Second, John Horrigan told me that survey respondents across the board are not simply asking governments to make the data easier to understand so that they can figure things out for themselves: what people really want is intermediaries to help them make sense of the data.

“We saw a fair number of people pleading in comments for better apps to make the data make sense,” said Horrigan. “When they went online, they couldn’t get budget data to work. When they found traffic data, couldn’t make it work. There were comments on both sides of the ledger. Those that think government did an ok job wish they did this. Those that think government is doing a horrible job also wish they did this.”

This is the opportunity that Headd referred to, and the reason that data journalism is the critical capacity that democratic governments which genuinely want to see returns on accountability and transparency must ensure can flourish in civil society.

If a Republican is elected as the next President of the United States, we’ll see if public views shift on other fronts.

Data journalism and the changing landscape for policy making in the age of networked transparency

This morning, I gave a short talk on data journalism and the changing landscape for policy making in the age of networked transparency at the Woodrow Wilson Center in DC, hosted by the Commons Lab.

Video from the event is online at the Wilson Center website. Unfortunately, I found that I didn’t edit my presentation down enough for my allotted time. I made it to slide 84 of 98 in 20 minutes and had to skip the 14 predictions and recommendations section. While many of the themes I describe in those 14 slides came out during the roundtable question and answer period, they’re worth resharing here, in the presentation I’ve embedded below:

[REPORT] On data journalism, democracy, open government and press freedom

On May 30, I gave a keynote talk on my research on the art and science of data journalism at the first Tow Center research conference at Columbia Journalism School in New York City. I’ve embedded the video below:

My presentation is embedded below, if you want to follow along or visit the sites and services I described.

Here’s an observation drawn from an extensive section on open government that should be of interest to readers of this blog:

“Proactive, selective open data initiatives by government focused on services that are not balanced by support for press freedoms and improved access can fairly be criticized as “openwashing” or “fauxpen government.”

Data journalists who are frequently faced with heavily redacted document releases or reams of blurry PDFs are particularly well placed to make those critiques.”

My contribution was only one part of the proceedings for “Quantifying Journalism: Metrics, Data and Computation,” which you can catch up on through the Tow Center’s live blog or TechPresident’s coverage of measuring the impact of journalism.

On data journalism, accountability and society in the Second Machine Age

On Monday, I delivered a short talk on data journalism, networked transparency, algorithmic transparency and the public interest at the Data & Society Research Institute’s workshop on the social, cultural & ethical dimensions of “big data”. The forum was hosted at New York University’s Information Law Institute in partnership with the White House Office of Science and Technology Policy, as part of an ongoing review on big data and privacy ordered by President Barack Obama.

Video of the talk is below, along with the slides I used. You can view all of the videos from the workshop, along with the public plenary on Monday evening, on YouTube or at the workshop page.

Here’s the presentation, with embedded hyperlinks to the organizations, projects and examples discussed:

For more on the “Second Machine Age” referenced in the title, read the new book by Erik Brynjolfsson and Andrew McAfee.

Opening IRS e-file data would add innovation and transparency to $1.6 trillion U.S. nonprofit sector

One of the most important open government data efforts in United States history came into being in 1993, when citizen archivist Carl Malamud used a small planning grant from the National Science Foundation to license data from the Securities and Exchange Commission, published the SEC data on the Internet and then operated it for two years. At the end of the grant, the SEC decided to make the EDGAR data available itself — albeit not without some significant prodding — and has continued to do so ever since. You can read the history behind putting periodic reports of public corporations online at Malamud’s website, public.resource.org.

Meals-on-Wheels-Reports

Two decades later, Malamud is working to make the law public, reform copyright, and free up government data again, buying, processing and publishing millions of public tax returns that nonprofits have filed with the Internal Revenue Service. He has made the bulk data from these efforts available to the public and anyone else who wants to use it.

“This is exactly analogous to the SEC and the EDGAR database,” Malamud told me, in a phone interview last year. The trouble is that the data has been deliberately dumbed down, he said. “If you make the data available, you will get innovation.”

Making millions of Form 990 returns free online is not a minor public service. Although many nonprofits file their Form 990s electronically, the IRS does not publish the data. Rather, the government agency releases images of millions of returns formatted as .TIFF files onto multiple DVDs to people and companies willing and able to pay thousands of dollars for them. Services like Guidestar, for instance, acquire the data, convert it to PDFs and use it to provide information about nonprofits. (Registered users view the returns on their website.)

As Sam Roudman reported at TechPresident, Luke Rosiak, a senior watchdog reporter for the Washington Examiner, took the files Malamud published and made them more useful. Specifically, he used credits for processing that Amazon donated to participants in the 2013 National Day of Civic Hacking to make the .TIFF files text-searchable. Rosiak then set up CitizenAudit.org, a new website that makes nonprofit transparency easy.

“This is useful information to track lobbying,” Malamud told me. “A state attorney general could just search for all nonprofits that received funds from a donor.”

Malamud estimates nearly 9% of jobs in the U.S. are in this sector. “This is an issue of capital allocation and market efficiency,” he said. “Who are the most efficient players? This is more than a CEO making too much money: it’s about ensuring that investments in nonprofits get a return.”

Malamud’s open data is acting as a platform for innovation, much as legislation.gov.uk is in the United Kingdom. The difference is that it’s the effort of a citizen that’s providing the open data, not the agency: Form 990 data is not on Data.gov.

Opening Form 990 data should be a no-brainer for an Obama administration that has taken historic steps to open government data. Liberating nonprofit sector data would provide useful transparency into a $1.6 trillion sector of the U.S. economy.

After many letters to the White House and discussions with the IRS, however, Malamud filed suit against the IRS to release Form 990 data online this summer.

“I think inertia is behind the delay,” he told me, in our interview. “These are not the expense accounts of government employees. This is something much more fundamental about a $1.6 trillion dollar marketplace. It’s not about who gave money to a politician.”

When asked for comment, a spokesperson for the White House Office of Management and Budget said that the IRS “has been engaging on this topic with interested stakeholders” and that “the Administration’s Fiscal Year 2014 revenue proposals would let the IRS receive all Form 990 information electronically, allowing us to make all such data available in machine readable format.”

Today, Malamud sent a letter of complaint to Howard Shelanski, administrator of the Office of Information and Regulatory Affairs in the White House Office of Management and Budget, asking for a review of the pricing policies of the IRS after a significant increase year-over-year. Specifically, Malamud wrote that the IRS is violating the requirements of President Obama’s executive order on open data:

The current method of distribution is a clear violation of the President’s instructions to move towards more open data formats, including the requirements of the May 9, 2013 Executive Order making “open and machine readable the new default for government information.”

I believe the current pricing policies do not make any sense for a government information dissemination service in this century, hence my request for your review. There are also significant additional issues that the IRS refuses to address, including substantial privacy problems with their database and a flat-out refusal to even consider release of the Form 990 E-File data, a format that would greatly increase the transparency and effectiveness of our non-profit marketplace and is required by law.

It’s not clear at all whether the continued pressure from Malamud, the obvious utility of CitizenAudit.org or the bipartisan budget deal that President Obama signed in December will push the IRS to freely release open government data about the nonprofit sector.

The furor last summer over the IRS investigating the status of conservative groups that claimed tax-exempt status, however, could carry over into political pressure to reform. If political groups were tax-exempt and nonprofit e-file data were published about them, it would be possible for auditors, journalists and Congressional investigators to detect patterns. The IRS would need to be careful about scrubbing the data of personal information: last year, the IRS mistakenly exposed thousands of Social Security numbers when it posted 527 forms online, an issue that Malamud, as it turns out, discovered in an audit.

“This data is up there with EDGAR, in terms of its potential,” said Malamud. “There are lots of databases. Few are as vital to government at large. This is not just about jobs. It’s like not releasing patent data.”

If the IRS were to modernize its audit system, inspectors general could use automated predictive data analysis to find aberrations to flag for a human to examine, enabling government watchdogs and investigative journalists to potentially detect similar issues much earlier.

That level of data-driven transparency remains in the future. In the meantime, CitizenAudit.org is currently running on a server in Rosiak’s apartment.

Whether the IRS adopts it as the SEC did EDGAR remains to be seen.

[Image Credit: Meals on Wheels]