U.S. House unanimously votes in favor of FOIA reform and a more open government

Earlier tonight, the United States House of Representatives voted 410-0 to pass the FOIA Oversight and Implementation Act. If the FOIA Act passes the Senate, the bill would represent the most important update to United States access-to-information laws in generations.

“Transparency in government is a critical part of restoring trust and the House will continue to work to make government more transparent and accessible to all Americans,” said House Majority Leader Eric Cantor (R-VA). “By expanding the FOIA process online, the FOIA Oversight and Implementation Act creates greater transparency and continues our open government efforts in the House.”

The FOIA Oversight and Implementation Act (FOIA Act), H.R. 1211, is one of the best opportunities to institutionalize open government in the 113th Congress, along with the DATA Act, which passed the House of Representatives 388-1 last November.

The FOIA reform bill now moves to the Senate, which unanimously passed FOIA reform legislation in the last Congress.

As Nate Jones detailed at the National Security Archive, the Senate’s own legislative effort to reform FOIA, the so-called “Faster FOIA Act” (S. 627, S. 1466), was not picked up by the House: the open government bill was hijacked in service of a 2011 budget deal, where the FOIA provisions in it ultimately met an untimely end. Chairman Darrell Issa (R-CA), Ranking Member Elijah Cummings (D-MD), and Representative Mike Quigley (D-IL) chose to draft their own bill instead of taking that bill up again.

Open government advocates applauded the unanimous passage of the FOIA Act, although there are some caveats about its provisions for the Senate to consider.

“This vote shows strong congressional support for government transparency and the Freedom of Information Act,” said Sean Moulton, Director of Open Government Policy at the Center for Effective Government, in a statement:

Since its original passage nearly 50 years ago, FOIA has been a cornerstone of the public’s right to know. By modernizing FOIA, H.R. 1211 would improve Americans’ ability to access public information and strengthen our democracy.

We thank the chair and ranking member of the House Committee on Oversight and Government Reform, Reps. Darrell Issa (R-CA) and Elijah Cummings (D-MD), who worked with the open government community to develop this legislation in a bipartisan fashion. We urge the Senate to advance legislation addressing these issues and other pressing FOIA reforms, including the need to rein in secrecy claims under Exemption 5, which restrict access to important information about government operations.

Access to public information is crucial to our democracy and the government’s effectiveness. It allows Americans to actively engage in policymaking in a thoughtful, informed manner and to hold public officials accountable for decisions that impact us all.

The bill represents important, incremental improvements to the FOIA process, but “it doesn’t address some fundamental shortfalls in the way that the FOIA is implemented and viewed within the Federal government,” wrote Matt Rumsey, policy analyst at the Sunlight Foundation:

… A “presumption of openness” and improved online infrastructure are important, but the bigger challenge will be getting agencies to change their posture away from one of non-disclosure and often aggressive litigation that is opposed to openness. … It clearly shows that ensuring public access to government information is not a partisan issue, or even one that should divide the branches of government. We hope to see the Senate take up legislation in the near future so that both chambers can work together to send a strong FOIA reform bill to President Obama’s desk for him to sign.

Passage of the House bill is a good first step but only a first step, wrote Anne Weismann, chief counsel of Citizens for Responsibility and Ethics in Washington:

Without a doubt these are needed reforms. As CREW has long advocated, however, meaningful FOIA reform must include changes in the FOIA’s exemptions to make the statute work as Congress intended.  All too often agencies hide behind Exemption 5 and its protection for privileged material to bar public access to documents that would reveal the rationale behind key government decisions.  For example, the Department of Justice denies every request for a legal opinion issued by DOJ’s Office of Legal Counsel that determines what a law means and what conduct it permits, claiming to reveal these opinions would harm the agency’s deliberative process.  This has led to the creation of a body of secret law — precisely what Congress sought to prevent when it enacted the FOIA.

To address this serious problem, CREW has advocated adding a balancing test to Exemption 5 that would require the agency and any reviewing court to balance the government’s claimed need for secrecy against the public interest in disclosure.  Other needed reforms include a requirement that agencies post online all documents disclosed under the FOIA.  The House bill, however, does not incorporate any of these reforms.

This post has been updated with additional statements over time.

AskThem.io launches to enable citizens to ask public officials anything

Today, the Participatory Politics Foundation launched AskThem.io, a new online tool focused upon structured questions and answers with elected officials.

As David Moore, founder of PPF, put it, “AskThem is like a version of the White House’s ‘We The People’ petition platform, but for over 142,000 elected officials nationwide.”

The platform is an evolution from earlier attempts to ask questions of candidates for public office, like “10 Questions” from Personal Democracy Media, or the myriad online town halls that governors and the White House have been holding for years. 

AskThem enables anyone to pose a question to any elected official or Verified Twitter account. Notably, the cleanly designed Web app uses geolocation to enable users to learn who represents them, in and of itself a valuable service.

As with e-petitions, AskThem users can then sign questions they support, voting them up and sharing the questions with their social networks. When a given question hits a preset threshold, the platform delivers the question to the public figure and “encourages a public response.”
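To make the threshold mechanic concrete, here is a minimal sketch in Python. It is not AskThem's code: the `Question` class, its field names, and the threshold value are all assumptions made up for illustration. It only shows the general idea of counting unique signers and triggering delivery once.

```python
from dataclasses import dataclass, field


@dataclass
class Question:
    """Hypothetical model of a question that collects signatures."""
    text: str
    recipient: str
    threshold: int                      # preset number of signatures required
    signers: set = field(default_factory=set)
    delivered: bool = False

    def sign(self, user_id: str) -> bool:
        """Record a signature.

        Returns True only for the signature that pushes the question
        over the threshold, so delivery happens exactly once.
        """
        self.signers.add(user_id)       # a set ignores repeat signatures
        if not self.delivered and len(self.signers) >= self.threshold:
            self.delivered = True
            return True
        return False


question = Question("What is your position on FOIA reform?", "@official", threshold=3)
for user in ["ana", "ana", "ben", "cy"]:
    if question.sign(user):
        print(f"Threshold reached: deliver to {question.recipient}")
```

Note that nothing in this sketch forces a reply; as with the real platform, delivery is where the software's role ends.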

That last bit is key: there’s no requirement for someone to respond, for the response itself to be substantive, nor for the public figure to act. There’s only the network effect of public pressure to make any of that happen.

After a year of development, Moore was excited to see the platform go live today, noting a number of precedents set in the process.

“I believe we’re the first open-source web app to support geolocation of elected officials, down to the municipal level, from street address,” he said, via email. “And I believe we’re the first to offer access to over 142,000 elected officials through our combined data sources. And I believe we’re the first to incorporate open government data for informed questions of elected officials at every level of government.”

Moore referred to AskThem’s use of the Google Civic Information API, which provides the data for the platform.

AskThem goes online just in time for tomorrow’s day of action against mass surveillance, where over 5,000 websites will try to activate their users to contact their elected representatives in Washington. Whether it gets much use or not will depend on awareness of the new tool.

That could come through use by high-profile early adopters like Chris Hayes (@chrislhayes), of MSNBC’s “All In with Chris Hayes,” or OK Go, the popular band.

At launch, 66 elected officials nationwide have signed on to participate, though more may join if it catches on. In the meantime, you can use AskThem’s handy map to find local elected officials and see a listing of all of the questions to date across the USA, or pose your own.

With major pharmacies on board, is the Blue Button about to scale nation-wide?

The Obama administration announced significant adoption for the Blue Button in the private sector today. In a post at the White House Office of Science and Technology Policy blog, Nick Sinai, U.S. deputy chief technology officer, and Adam Dole, a Presidential Innovation Fellow at the U.S. Department of Health and Human Services, listed major pharmacies and retailers joining the Blue Button initiative, which enables people to download a personal health record in an open, machine-readable electronic format:

“These commitments from some of the Nation’s largest retail pharmacy chains and associations promise to provide a growing number of patients with easy and secure access to their own personal pharmacy prescription history and allow them to check their medication history for accuracy, access prescription lists from multiple doctors, and securely share this information with their healthcare providers,” they wrote.

“As companies move towards standard formats and the ability to securely transmit this information electronically, Americans will be able to use their pharmacy records with new innovative software applications and services that can improve medication adherence, reduce dosing errors, prevent adverse drug interactions, and save lives.”

While I referred to the Blue Button obliquely at ReadWrite almost two years ago and in many other stories, I can’t help but wish that I’d finished my feature for Radar a year ago and written up a full analytical report. Extending access to a downloadable personal health record to millions of Americans has been an important, steady shift that has largely gone unappreciated, despite reporting like Ina Fried’s regarding veterans getting downloadable health information. According to the Office of the National Coordinator for Health IT, “more than 5.4 million veterans have now downloaded their Blue Button data and more than 500 companies and organizations in the private-sector have pledged to support it.”

As I’ve said before, data standards are the railway gauges of the 21st century. When they’re agreed upon and built out, remarkable things can happen. This is one of those public-private initiatives that has taken years to bear fruit and that stands to substantially improve the lives of so many people. This one started with something simple, when the administration gave military veterans the ability to download their own health records from MyMedicare.gov and My HealtheVet, and scaled progressively to Medicare recipients and then Aetna and other players from there.

There have been bumps and bruises along the way, from issues with the standard to concerns about lost devices, but this news of adoption by places like CVS suggests the Blue Button is about to go mainstream in a big way. According to the White House, “more than 150 million Americans today are able to use Blue Button-enabled tools to access their own health information from a variety of sources including healthcare providers, health insurance companies, medical labs, and state health information networks.”

Notably, HHS has ruled that doctors and clinics that implement the new “BlueButton+” specification will be meeting the requirements of “View, Download, and Transmit (V/D/T)” in Meaningful Use Stage 2 for electronic health records under the HITECH Act, meaning they can apply for reimbursement. According to ONC, that MU program currently includes half of eligible physicians and more than 80 percent of hospitals in the United States. With that carrot, many more Americans should expect to see a Blue Button in the doctor’s office soon.

In the video below, U.S. chief technology officer Todd Park speaks with me about the Blue Button and the work of Dole and other presidential innovation fellows on the project.

U.S. CIO Steven VanRoekel on the risks and potential of open data and digital government

Last year, I conducted an in-depth interview with United States chief information officer Steven VanRoekel in his office in the Eisenhower Executive Office Building, overlooking the White House. I was there to talk about the historic open data executive order that President Obama had signed in May 2013.

On this visit, I couldn’t help but notice that VanRoekel has a Star Wars clock in his office. The Force is strong here. The US CIO also had a lot of other consumer technology around his workspace: a MacBook and Windows laptop and dock, dual monitors, iPad, a teleconferencing system integrated with a desktop PC, and an iPhone, which recently became securely permissible on the White House IT system in a “bring your own device” pilot.

The interview that follows is slightly dated, in certain respects, but still offers significant insight into how the nation’s top IT executive is thinking about digital government, open data and more. It has also been lightly edited, primarily removing the long-winded questions of the interviewer.

We’re at the one year mark of the Digital Government Strategy. Where do we stand with hitting the metrics in the strategy? Why did it take until now to get this out?

VanRoekel: The strategy calls for the launch of the policy itself. Throughout the year, the policy was a framework for a 12 month set of deliverables of different aspects, from the work we’re doing in mobile, from ‘bring your own device,’ to security baselines and mobile device management platforms. Not only streamlining procurement, streamlining app development in government. Managing those devices securely to thinking about the way we do customer service and the way we think about the power of data and how it plays into all of this. It’s been part of that process for about the year we’ve been working on it. Of course, we thought through these principles and have been working on data-related aspects for longer. The digital strategy policy was the framework for us to catalyze and accelerate that, and over the course of the year, the stuff that’s been going on behind the scenes has largely been working with agencies on building some of this capability around open data. You’re going to see some things happening very soon on the release of some of this capability. Second, standing up the Presidential Innovation Fellows program and then putting specific ‘PIFs’ into certain targeted agencies to fast track their opening of data – that’s going to extend into Wave Two. You’re going to see that continuing to happen, where we just take these principles and just kind of ‘rinse and repeat’ in government. Third, we’re working with a small set of the community to build tools to make it easy for agencies to implement these guidelines. So if there’s an agency that doesn’t know how to create a JSON file, that tool is on GitHub. You can see that on Project Open Data.
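For readers wondering what “create a JSON file” involves in practice, here is a minimal sketch using only Python's standard library. It is not one of the Project Open Data tools; the `csv_to_json` function name and the sample rows are placeholders invented for illustration.

```python
import csv
import io
import json


def csv_to_json(csv_text: str) -> str:
    """Turn CSV text with a header row into a JSON array of objects."""
    reader = csv.DictReader(io.StringIO(csv_text))
    # Each row becomes a dict keyed by the header names; values stay strings.
    return json.dumps(list(reader), indent=2)


# Made-up sample data, not a real government dataset.
sample = (
    "agency,dataset,records\n"
    "Agency A,Example dataset one,120\n"
    "Agency B,Example dataset two,45\n"
)

print(csv_to_json(sample))
```

The point is less the ten lines of code than the default they imply: a spreadsheet export can become machine-readable output without special software.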

How involved has the president been in this executive order? It’s his name, his words are in there — how much have you and U.S. chief technology officer Todd Park talked with the president about this?

VanRoekel: Ever since about last summer, we’ve been talking to the president about open data, specifically. I think there’s lots of examples where we’ve had conversations on the periphery, and he’s met with a lot of tech leaders and others around the country that in many, many cases have either built their business or are relying upon some government service or data stream. We’re seeing that culminating into the mindset of what we do as a factor of economic growth. His thoughts are, ‘How do we unlock this national resource? We’re sitting on this treasure trove – how do we unleash it into the developer community, so that these app developers can build these different solutions?’ He’s definitely inspired – he wrote that cover memo to the digital strategy last May – and then we’ve had all of these different meetings, across the course of the year, and now it culminates into this executive order, where we’re working to catalyze these agencies and get them to pay attention and follow up.

We’ve been down this road before, in some respects, with the Open Government Directive in 2009, with former US CIO Vivek Kundra putting forward claims of positive outcomes from releasing data. Yet, what have we learned over the past four years? What makes this different? Where’s the “how,” in terms of implementing this?

VanRoekel: The original launch of data.gov was, I think, a way of really shocking the system, and getting people to pay attention to and notice that there was an important resource we’re sitting on called data. Prior to data.gov, and prior to the work of this administration, the daily approach to data was very ad hoc. It wasn’t taken as data, it was just an output or a piece of a broader mix. That’s why you get so much disparity in the approach to the way we manage data. You get the paper-driven processes that are still very prevalent, where someone will send a paper document, and someone will sign it, and scan it, feed it into a system, and then eventually print it and mail it. It’s crazy what you end up seeing and experiencing inside of government in terms of how these things work. Data.gov was an important first step. The difference now is really around taking this approach to everything that we do. The work that we did with the Open Government Directive back in 2009 was really about taking some high value data sets and putting them up on Data.gov. What you ended up seeing was kind of a ‘bulk upload, bulk download,’ kind of access to the data. Machine-readability and programmability wasn’t taken into account, or the searchability and findability.

Did entrepreneurs or advocates validate these data sets as “high value?” Entrepreneurs have kept buying data from government over the past four years or making Freedom of Information Act requests for data from government or scraping data. They’re not getting that from Data.gov.

VanRoekel: I have no official way of measuring the ‘value’ of the data, other than anecdotal conversations. I do think that the motion of getting people to wake up and think about how they are treating data internally within an organization – well, there was a convenience factor to that, which basically was that ‘I got to pick what data I release,’ which probably dates from ‘what data do I have that’s releasable?’ The different tiers to this executive order and this policy are a huge part of why it’s different. It sets the new default. It basically says, if you are modernizing a system or creating a new system, you can do that in a way that adopts these principles. If you [undertake] the collection, use and dissemination of data, you’ll make those machine-readable and interoperable by default. That doesn’t always mean public, because there are applications that privacy and national security mean we shouldn’t make public, but those principles still hold, in terms of how I believe the ways we build things should evolve on this foundation. For the community that’s getting value outside of the government, this really sets a predictable, consistent course for the government to open up data. Any business decisions are risk-based decisions. You have to assume some level of risk with anything you do.

If there’s too much risk, entrepreneurs won’t do it.

VanRoekel: True. To that end, what’s different in this policy from before is that the way we’re collecting information about the data is being standardized. We’re creating a metadata infrastructure. Data itself doesn’t have to be all described in the same way. We’re not coming up with “one schema to rule them all” across government. The complexity of that would be insurmountable. I don’t think that’s a 21st century approach. It’s probably last-century thinking to say that if we get one schema, we’re going to get it all done. The metadata approach is to say let’s collect a standard template way of describing – but flexible for future extension – the data that is contained in government. In that description, and in that metadata, there are tags like “who owns this data” and “how often is the data updated,” and information about how to get hold of people to find out more about descriptions within the data. They will be a part of that description in a way that gives you some level of assurance on how the data is managed. Much of the data we have out there, there’s existing laws on the books to collect the data. Most of it, there’s existing laws, not just a business process. One of the great conversations we’re having with the agencies is that they find greater efficiency in the way they collect data and build solutions based upon these open data principles.
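A rough sketch of what such a “standard template, flexible for extension” could look like follows. This is an illustration only: the tag names, the `describe_dataset` helper, and the sample values are assumptions of mine and do not reproduce the official Project Open Data metadata schema.

```python
import json

# Hypothetical core tags every description must carry.
REQUIRED_TAGS = ("title", "description", "owner", "update_frequency", "contact")


def describe_dataset(**tags) -> dict:
    """Build a metadata record, rejecting it if a core tag is missing.

    Extra keyword arguments are kept as-is, which is what leaves the
    template open to future extension.
    """
    missing = [name for name in REQUIRED_TAGS if not tags.get(name)]
    if missing:
        raise ValueError(f"missing metadata tags: {', '.join(missing)}")
    return dict(tags)


record = describe_dataset(
    title="Example dataset",
    description="Placeholder description, for illustration only.",
    owner="Example Agency",                 # "who owns this data"
    update_frequency="monthly",             # "how often is the data updated"
    contact="opendata@agency.example",      # how to get hold of people
    spatial_coverage="United States",       # an extension tag beyond the core set
)

print(json.dumps(record, indent=2))
```

The design choice mirrors the interview: the data itself can take any shape, while the description of it follows one small, predictable template.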

I received a question from David Robinson, regarding open licensing in this policy. Isn’t U.S. government data exempt from copyright?

VanRoekel: Not all government data is exempt from copyright, but those are generally edge cases. The Smithsonian takes pictures of things that are still under copyright, for instance. That’s government data. I sent a note about this announcement to the Secretary of the Smithsonian this morning. I’ve been talking to him about opening up data for some time. The nuance there, about open licenses, is really around the types of systems that create the data, and putting a preference for a non-proprietary format. You can imagine a world in which I give you an XML file, and I give you a Microsoft Excel file. Those are both pieces of data. To some extent, the Excel format is machine-readable. You can open it up and look at it internally just the way it is, but do you have to go buy a special piece of software to read the file or not? That kind of denotes the open[ness] and accessibility of the data. In the case of policy, we declare a strong preference towards these non-proprietary formats, so that not only do you get machine-readability but you get the broadest access to the data. It’s less about the content in there – whether that’s copyrighted or not – I think most data in government, outside of the realm of confidential or private data, is not copyrighted, so to speak from the standpoint of the license. It’s more about the format, and if there’s a proprietary standard wrapped in the stuff. We have an obligation as a government to pick formats, pick solutions, et cetera that not only have the broadest applicability and accessibility for the public but also create the most opportunity in the broadest sense.
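The practical difference shows up in how little it takes to read a non-proprietary file. As a small illustration (the XML structure, element names, and `list_titles` helper are invented for this example), Python's standard library can parse XML with no additional software:

```python
import xml.etree.ElementTree as ET

# Made-up XML document standing in for a published data file.
xml_text = """<datasets>
  <dataset id="1"><title>Example dataset A</title><format>CSV</format></dataset>
  <dataset id="2"><title>Example dataset B</title><format>JSON</format></dataset>
</datasets>"""


def list_titles(document: str) -> list:
    """Return (id, title) pairs for every <dataset> element."""
    root = ET.fromstring(document)
    return [(d.get("id"), d.findtext("title")) for d in root.iter("dataset")]


print(list_titles(xml_text))
```

Reading a spreadsheet in a proprietary format typically means installing the vendor's application or a third-party library first, which is the accessibility gap VanRoekel is describing.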

Open data doesn’t come without costs. Is this open data policy an unfunded mandate on all of the agencies, instructing them to put all of the data online they can, to digitize content?

VanRoekel: In the broadest sense, the phrase ‘the new default’ is an important one. It basically says, for enhancements to existing systems or new systems, follow this guideline. If people are making changes, this is in the list of requirements. From a costing perspective, it’s pre-baked into the cost of any enhancement or release. That’s the broad statement. The narrow statement is that there are many agencies out there, increasing every day, that are embracing these retroactive open data approaches, saying that there is value to the outside world, there is lower cost, greater interoperability, there are solutions that can be derived from taking these open data approaches inside of my own organization. That’s what we saw in PIF [Presidential Innovation Fellows] round one, where these agencies adopted the innovation fellows to unlock their data. That’s increasing and expanding in round two, and continuing in the agencies which we thought were high administration priorities, along with others. I think we’re going to continue to see this as a catalyzing element of that phenomenon, where people are going to go back and spend the resources on doing this. Just invite any of these leaders to the last twenty minutes of a hackathon, where folks are standing up and showing their solutions that they developed in one day, based on the principles of open data and APIs. They just are overwhelmed about the potential within their own organizations, and they run back and want to do this as fast as they can.

Are you using anything that has ever been developed at a hackathon, personally or professionally?

VanRoekel: We are incorporating code from the “We The People” hackathon, the most recent one. I know Macon Phillips and team are looking at incorporating feature sets they got out of that. An important part of the hackathon, like most conferences you go to, is the time between the sessions. They’re the most important – the relationship building aspect, figuring out how we shape the next set of capabilities or APIs or other things you want to build.

How does this relate to the way that the federal government uses open data internally?

VanRoekel: There are so many examples of government agencies that, when faced with a technical problem, will go hire a single monolithic vendor to do a single, monolithic solution – and spend most of the budget on the planning cycle – and you end up with these multi-million dollar, 3-ring binders that ultimately fail because technology has moved on or people have left or laws have moved on five or ten years later, after they started these projects. One of the key components of this is laying foundational stones down to say how are we going to build upon that, to create the apps and solutions of the future. You know, I can swoop in and say “here’s how to do modular contracting in the context of government acquisition” – but unless you say, you’ve got to adopt open data and these principles of API-first, of doing things a different way – smaller, reusable, interoperable pieces – you can’t really build the phenomenon. These are all elements of that – and the cost savings aspects of it are extraordinary. The risk profile is going to be a lot smaller. I’m as excited about this inside government as outside.

Do you think the federal government will ever be able to move from big data centers and complicated enterprise software to a lightweight, distributed model for mobile services built on APIs?

VanRoekel: I think there is massive potential for things like that across the whole of government. I mean, we’re a big organization. We’re the largest buyer of technology in the world. We have unending opportunities to do things in a more efficient way. I’ve been running this process that I launched last year called PortfolioStat. It’s all about taking a left to right look, sitting down with agencies. What I’ve always been missing from those is some of these groundbreaking policies that start to paint the picture for what the ideal is, and how to get your job done in a way that’s different than the way you’ve done it before, like the notion of continuous improvement. We’ve needed things like the EO to give us those conversation starters to say, here’s the way to do it, see what they are doing over at HHS. “How are you going to bring that kind of discipline into your organization?” I’m sitting down with every deputy secretary and all the C-level executives to have those tough conversations. Fruitful, but good conversations about how we are going to change the way we deliver solutions inside of government. The ideal state that they’ll all hear about is the service-oriented model with centralized, commodity computing that’s mostly cloud-based. Then, how do you provide services out to the periphery of your organization.

You told me in our last interview that you had statutory authority to make things happen. What happens if a federal CIO drags his or her feet and, a year from now, you’re still here and they’re not moving on these policies, from cloud to open data?

VanRoekel: The answer I gave to you last time still holds: it’s about inspire and push. Inspire comes in many factors. One is me coming in and showing them the art of the possible, saying there’s a better way of doing this, getting their customers to show up at the door to say that we want better capabilities and get them inspired to do things, getting their leadership to show up and say we want better things. Push is about budget – how do you manage their budget. There’s aspects of both inspire and push in the way we’ve managed the budget this year. I have the authority to do that.

What’s your best case for adopting an open data strategy and enterprise data inventory, if you’re trying to inspire?

VanRoekel: The bottom line is meet your mission faster and at a much lower cost. Our job is not about technology as an end state – it’s about our mission. We’ve got to get the mission of government done. You’re fostering immigration, you’re protecting public safety, you’re providing better energy guidance, you’re shaping an industry for the country. Open data is a fundamental building block of providing flexibility and reusability into the workplace. It’s what you do to get you to the end state of your mission. I hearken back a lot to the examples we used at the FCC, which was moving from like fourteen websites to one and how we managed that. How do we take workload of a place so that the effort pays for itself in six months and start yielding benefits beyond that? The benefits are long-term. When you build that next enhancement, or that new thing on top of it, you can realize the benefits at lower cost. It’s amazing. I do these TechStat processes, where I sit down with the agencies. They have some project that’s going off the rails. They need help, focus, and some executive oversight. I sit down, usually in a big room of people, and it’s almost gotten to the point where you don’t need to look at the briefing documents ahead of time. You sit down and say, I bet you’re doing it this way – and it’s monolithic, proprietary, probably taking a lot of packaged software and writing a lot of glue code to hold it all together – and you then propose to them the principles of open data and open approaches to doing the solution, and tell them I want to see in the next sixty days some customer-facing, benefit value that’s built on this model. They go off and do that, and they get right back on the tracks and they succeed. Time after time when we do TechStat, that’s the formula and it’s yielded these incredible results. 
That culture is starting to permeate into how we get stuff done, because they see how it might accomplish their mission if they just turn 45 degrees and try a different approach. If that makes them successful, they will go there every time.

Critiques of open data raise concerns about “discretionary disclosure,” where a government entity releases what it wants, claims credit for embracing open government, and obfuscates the rest of the data. Does this policy change any of the decisions that are being made to delay, redact or not release requested data?

VanRoekel: I think today marks an inflection point that will set a course for the future. It’s not that tomorrow or next month or next year all government data will just be transformed into open, machine-readable form. It will happen over time. The key here is that we’ve created mechanisms to protect privacy and security of data but built in a culture where that which is intended to be public should be made public. Part of what is described in the executive order is the formation of this cross-agency executive group that will define a cross-agency priority goal, that we need to get inventories in from agencies regarding that which they hold that could be made public. We want to know stuff that’s not public today, what could be out there. We’re going to take that in and look at how we can set goals for this year, the next year and the year after that to continue to open up data at a faster pace than we’ve been doing in the past. The modernization act and some of the work around setting goals in government is much more compatible and looks a lot like the private sector. We’re embracing these notions that I’ve really grown to love and respect over the course of my private sector career in government around methodologies. Stay tuned on the capital and what that looks like.

Are you all going to work with the House and Senate on the DATA Act or are statutory issues on oversight still a stumbling block?

VanRoekel: The spirit of the DATA Act, of transparency and openness, are the things we’re doing, and I think are embraced. Some of the tactical aspects of the act were a little off the mark, in terms of getting to the end state that we want to get to. If you look at the FY-14 budget and the work we’ve done on transferring USASpending.gov to Treasury to get it closer to the source of the data, plus a view into how those systems get modernized, how we bring these principles into that mix, that will all be a part of the end state, which is how we track the spending.

Do you ever anticipate the data going into FOIA.gov also going into Data.gov?

VanRoekel: I don’t know. I can’t speculate on that. I’m not close enough to it.

Well, FOIA requests show demand. Do you have any sense of what people are paying for now, in terms of government data?

VanRoekel: I don’t.

Has anybody ever asked, to try to figure that out?

VanRoekel: I think that would be a great thing for you to do.

I appreciate that, but this strikes me as an interesting assessment that you could be doing, in terms of measuring outflows for business intelligence. If someone buys data, it shows that there is value in it. What would it mean if releases reflected that signal?

VanRoekel: You mean preference data that is being purchased?

Right.

VanRoekel: Well, part of this will be building and looking at Data.gov. Some of the stuff coming there is really building community around the data. The number one question Todd Park and I had coming out of the PIF program, at the end of May [2013] was, what if I think there’s data, but I don’t know, who do I contact? An important part of the delivery of this wave and the product coming out as part of this policy is going to be this enhanced Data.gov, that’s our intention to build a much richer community around government data. We want to hear from people. If there are data sources that do hold promise and value, let’s hear about those and see if there are things we can do to get a PIF on structuring it, and get agencies to modernize systems to get it released and open. I know some of the costs are like administrative fees for printing or finding the data, something that’s related to third parties collecting it and then reselling it. We want to make sure that we’re thoughtful in how we approach that.

How has the experience that you’ve seen everyone have with the first iteration of Data.gov informed the nation’s open data strategy today? What specifically had not been done before that you will be doing now?

VanRoekel: The first Data.gov set us on a cultural path. What it didn’t do was connect you to data at the source. What is this data? How often is it updated? Findability and searchability of broad government data wasn’t there. Programmability of the data wasn’t necessarily there. Data.gov, in the future, instead of being a repository for data, a place to upload the data, my intention is that it will become a meta data catalog. It will be the place you go, the one-stop-shop, to find government data, across multiple aspects. The way we’re doing this is through the policy itself, which says that agencies have to go and set up this new page, similar to what is now standard in open government, /open, /developer. In that page, the most important part of that page is a JSON file. That’s what data.gov can go out and crawl, or any developer outside can go out and crawl, to find out when data has been updated, what data is available, in what format. All of the standard meta data that I’ve described earlier will be represented through that JSON file. Data.gov will then become a meta data catalog of all the open data out in government at its source. As a developer, you’d come in, and if you wanted to do a map, for instance, to see what broadband capabilities exist near low-income Americans and then overlay locations of educational institutions, if you wanted to look for a correlation between income and broadband deployment and education, you’d hypothetically be looking for 3 different data sources, from 3 different agencies. You’d be able to find the open data streams, the APIs, to go get that data in one place, and then you’d have a connection back to the mothership to be able to grab it, find out who owns it. We want to still have a center of gravity for data, but make the data itself follow these principles, in terms of discoverability and use.
The thing that probably got me most pointed in this direction is the President’s Council of Advisors on Science and Technology (PCAST), which did a report on health IT. Buried on page 60 or something, it had this description of meta data as the linchpin of discoverability of diverse data sources. That’s the approach we’ve taken, much like Google.
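To make the idea of a crawlable JSON file concrete, here is a minimal sketch in Python of how a catalog might merge agency listings and let a developer search them. The field names (`title`, `agency`, `modified`, `format`, `keywords`), the sample entries and the helper functions are all hypothetical illustrations, not the schema or tooling that the policy or Data.gov actually specifies.

```python
import json

# Hypothetical JSON listings, one per agency. Field names and values are
# illustrative assumptions, not an official metadata schema.
FCC_LISTING = """[
  {"title": "Broadband availability by census block", "agency": "FCC",
   "modified": "2013-04-30", "format": "csv", "keywords": ["broadband", "maps"]}
]"""
CENSUS_LISTING = """[
  {"title": "Median household income", "agency": "Census Bureau",
   "modified": "2013-03-15", "format": "json", "keywords": ["income", "demographics"]}
]"""
ED_LISTING = """[
  {"title": "Locations of educational institutions", "agency": "Department of Education",
   "modified": "2013-05-01", "format": "csv", "keywords": ["education", "schools"]}
]"""


def build_catalog(listings):
    """Merge several agency JSON listings into one list, newest first."""
    merged = []
    for text in listings:
        merged.extend(json.loads(text))
    # ISO dates (YYYY-MM-DD) sort correctly as plain strings.
    return sorted(merged, key=lambda entry: entry["modified"], reverse=True)


def find_datasets(catalog, keyword):
    """Return catalog entries tagged with the given keyword (case-insensitive)."""
    wanted = keyword.lower()
    return [
        entry for entry in catalog
        if wanted in (k.lower() for k in entry.get("keywords", []))
    ]


if __name__ == "__main__":
    catalog = build_catalog([FCC_LISTING, CENSUS_LISTING, ED_LISTING])
    for topic in ("broadband", "income", "education"):
        for entry in find_datasets(catalog, topic):
            print(topic, "->", entry["agency"], "|", entry["title"], "|", entry["modified"])
```

In this sketch the catalog stores only metadata. Each entry points back to the agency that owns the data, which is the "meta data catalog" role described in the interview.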

5 years from now, what will have changed because of this effort?

VanRoekel: The way we build solutions inside of government is going to change, and the amount of apps and solutions outside of government are going to fundamentally change. You and I now, sitting in our cars, take for granted the GPS signal going to the device on the dash. I think about government. Government is right there with me, every single day, as I’m driving my car, or when I do a Foursquare check-in on my phone. We’ll be bringing government data to citizens where they are, versus making people come to government. It’s been a long time since the mid-80s, when we opened up GPS, but look at where we are today. I think we’ll look back in 10 or 15 years and think about all of the potential we unlocked today.

What data could be like GPS, in terms of their impact on our lives?

VanRoekel: I think health and energy are probably two big ones.

POSTSCRIPT

Since we talked, the Obama administration has followed through on some of the commitments the U.S. CIO described, including relaunching Data.gov and releasing more data. Other goals, like every agency releasing an enterprise data inventory or publishing a /data and /developer page online, have seen mixed compliance, as an audit by the Sunlight Foundation showed in December. The federal government shutdown last fall also scuttled open data access, where certain data types were deemed essential to maintain and others were not. The shutdown also suggested that an “API-first” strategy for open data might be problematic.

OMB, where VanRoekel works, has also quietly called for major changes in the DATA Act, which passed the House of Representatives with overwhelming support at the end of last year. A marked up version of the DATA Act obtained by Federal News Radio removes funding for the legislation and language that would require standardized data elements for reporting federal government spending. The news was not received well on Capitol Hill. Sen. Mark Warner, D-Va., the lead sponsor of the DATA Act in the Senate, reaffirmed his commitment to the current version of the bill in a statement: “The Obama administration talks a lot about transparency, but these comments reflect a clear attempt to gut the DATA Act. DATA reflects years of bipartisan, bicameral work, and to propose substantial, unproductive changes this late in the game is unacceptable. We look forward to passing the DATA Act, which had near universal support in its House passage and passed unanimously out of its Senate committee. I will not back down from a bill that holds the government accountable and provides taxpayers the transparency they deserve.” The leaked markup has led some observers to wonder whether the White House wants to scuttle the DATA Act, and others to consider withdrawing their support.
“OMB’s version of the DATA Act is not a bill that the Sunlight Foundation can support,” wrote Matt Rumsey, a policy analyst at the Sunlight Foundation. “If OMB’s suggestions are ultimately added to the legislation, we will join our friends at the Data Transparency Coalition and withdraw our support of the DATA Act.” In response to repeated questions about the leaked draft, the OMB press office has sent the same statement to multiple media outlets: “The Administration believes data transparency is a critical element to good government, and we share the goal of advancing transparency and accountability of Federal spending. We will continue to work with Congress and other stakeholders to identify the most effective & efficient use of taxpayer dollars to accomplish this goal.” I have asked the Office of Management and Budget (OMB) about all of these issues and will publish any reply I receive separately, with a link from this post.

Will Google Glass enable “augmented advocacy” in a more transparent society?

One of the more interesting aspects of Dave Eggers’ dystopic new novel, “The Circle,” is the introduction of the “SeaChange,” a small, powerful camera that can transmit wireless images to a networked global audience. The SeaChange is adopted by politicians who “go transparent,” broadcasting all of their interactions to the public all day long.

Regardless of whether that degree of radical transparency is beneficial for elected representatives or not, in early 2014, we’ve now seen many early glimpses of what a more networked world full of inexpensive cameras looks like when United States politicians are online and on camera more often, from scandals to threats to slurs to charged comments that may have changed a presidential election. Most of that video has been captured by small video cameras or, increasingly, powerful smartphones. Over the next year, more people will be wearing Google Glass, Google’s powerful facial computing device. Even if Google Glass has led to a backlash, the next wave of mobile devices will be wearable, integrated into clothing, wristbands, shoes and other gear. This vision of the future is fast approaching, which means that looking for early signals of various aspects of it is crucial.


One such signal came across my desktop earlier this week, in the form of a new app for Google Glass from RedEdge, a digital advocacy consultancy based in Arlington, Virginia. Their new “augmented advocacy” application for Google Glass is a proof of concept that demonstrates how government data can be served to someone wearing Glass as she moves around the world. It’s not in the GDJ Store but people interested in testing it can request the Glass application file (android 1) from RedEdge, its maker.

“While we don’t expect widespread deployment of this app, though that would be cool, this is a window into what’s possible with wearable computing just using federal department data,” said Ian Spencer, chief technology officer of RedEdge, in an interview. “The data we used to launch this app and populate the database was all sourced from publicly available information. We primarily used publications from the Office of Management and Budget for budget figures, as well as the president’s own budget, for monetary data. Location data on federal buildings was sourced from Google Maps.”

The app leverages Google Glass’s ability to detect the wearer’s location, feeding government data through RedEdge’s API to populate a relevant card. It pulls in from open data, formatted as JSON, and provides a list of all locations.

“You can just walk around with the app running in background,” said Spencer. “It doesn’t take up a ton of battery life. With geofencing, Glass knows when you’re near a building and triggers the app, which pops in a card that shows you a phone number and budget information. You can then tap to get more information and it loads up public contact information. Eventually the GDK [Glass Developer Kit] will let you make calls and emails.”
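The geofencing Spencer describes can be sketched in a few lines of Python: load a JSON list of locations, compute the distance from the wearer to each one, and surface a card for anything within a radius. This is not RedEdge’s code. The building names, coordinates, phone numbers, budget figures, field names and the 150-meter radius are all made-up placeholders, and a real Glass app would rely on the platform’s own location services rather than a loop like this.

```python
import json
import math

# Placeholder data in the spirit of the app's JSON feed. Every value here
# (names, coordinates, phone numbers, budgets) is invented for illustration.
BUILDINGS_JSON = """[
  {"name": "Example Building A", "lat": 38.8977, "lon": -77.0365,
   "phone": "202-555-0100", "budget_usd": 1000000},
  {"name": "Example Building B", "lat": 38.8893, "lon": -77.0502,
   "phone": "202-555-0101", "budget_usd": 2500000}
]"""

EARTH_RADIUS_M = 6371000  # mean Earth radius in meters


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two latitude/longitude points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def nearby(buildings, lat, lon, radius_m=150):
    """Return the buildings whose geofence (a circle of radius_m) contains the point."""
    return [
        b for b in buildings
        if haversine_m(lat, lon, b["lat"], b["lon"]) <= radius_m
    ]


def card_text(building):
    """Format the short 'card' a wearer might see: name, phone, budget."""
    return "{name}: {phone}, budget ${budget_usd:,}".format(**building)


if __name__ == "__main__":
    buildings = json.loads(BUILDINGS_JSON)
    for hit in nearby(buildings, 38.8977, -77.0365):
        print(card_text(hit))
```

A circle of 150 meters gives the building-level granularity discussed later in this post; telling apart rooms inside one building would need a different positioning signal.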

Visitors to the White House with this app, for instance, could call the White House switchboard, though they would be unlikely to get President Obama on the phone.

The RedEdge app is currently limited by the amount of time and investment RedEdge has put into it, along with the technology of Glass itself. “Once we add more data points, we will need a more complicated API,” said Spencer. “User experience was our focus, not massive complete sets. Even if we were using a government API, which would be ideal at some point, we would need a hashing layer so that we don’t overwhelm their servers.”

The only data the developers are feeding into it is the total federal budget for a given agency, not more granular details concerning how it relates to programs, their performance or who is in charge of them. It’s very much a “proof of concept.”

“We’re looking at it as a trial balloon,” said Spencer. “It started with our tech team. We haven’t had researchers go over tons of entries. If there is interest in it, we then may do more, like adding more federal data and state-level data.”

One potentially interesting application of augmented advocacy might seem to be Congress, where data from the Sunlight Foundation’s Influence Explorer or Open Congress could be integrated as the Glass wearer walked around. The technical limitations of Glass, however, mean that citizens will need to keep downloading Sunlight’s popular Congress app for smartphones.

“The problem is the precision of the GPS,” said Spencer. “If you’re wearing Glass in the Hart building, you don’t have enough accuracy. You can get building-to-building precision, but not more. There are technical problems with trying to use satellites for this, whether it’s GPS or GLONASS, the Russian version.”

That doesn’t mean such precision might not be possible in the future. As Spencer highlighted, app developers can determine “micropositioning” through wifi or Bluetooth, enabling triangulation within a room. “A classic example comes from marketing in a store: ‘I see you’re looking at X,’” he said.

That technology is already live, as Brian Fung reported in the Washington Post: stores are using cellphones to track shopping habits. In Washington, a more palatable example might be around the Mall, where geofences and tracking trigger information about Smithsonian paintings, trees, statuary, or monuments.

The limitation on facial recognition capabilities in Glass also means that the most interesting and disturbing potential application of its gaze is still far away: looking at someone in a lobby, bar, hearing or conference and learning not only who the person is but what role he or she may play in DC’s complicated ecosystem of lobbyists, journalists, Congressional staffers, politicians, media, officials, public advocates and campaign operatives. (For now, the role of the trusted aide, whispering brief identifiers into the ears of the powerful, is safe.)

When more apps like this go live in more devices, expect some fireworks to ensue around the United States and the world, as more private and semi-public spaces become recorded. Glass and its descendants will provide evidence of misbehavior by law enforcement, just as cellphones have in recent years. The cameras will be on the faces of officers, as well. While some studies suggest that police wearing cameras may improve the quality of their policing, and civil liberties advocates support their introduction, such devices aren’t popular with the New York City Police Department.

As with the dashboard cameras that supply much of the footage for “Cops” in the United States and offer some protection against corrupt police and fraud in Russia, wearable cameras look likely to end up on the helmets, glasses, lapels or shoulders of many officers in the future, from Los Angeles to London.

The aspirational view of this demo is that it will show how it’s possible to integrate more public data into the life of a citizen without requiring her to pull out a phone.

“There’s a lot of potential for this app to get people to care about an issue and take action,” said Spencer. “It’s about getting people aware. The cool thing about this is its passive nature. You start it once and it tells you when you’re near something.”

A more dystopian view is that people will see a huge budget number and call the switchboard of a given agency to angrily complain, as opposed to the constituent relations staff of their representatives in Congress.

Given the challenges that Congress already faces with the tidal wave of social media and email that has swelled up over the last decade, that would be unhelpful at best. If future digital advocates want to make the most of such tools, they’ll need to provide users with context for the data they’re being fed, from sources to more information about the issues themselves and the progress of existing campaigns.

This initial foray is, after all, just a demo. More integration may be coming in the next generation of wearables.

Opening IRS e-file data would add innovation and transparency to $1.6 trillion U.S. nonprofit sector

One of the most important open government data efforts in United States history came into being in 1993, when citizen archivist Carl Malamud used a small planning grant from the National Science Foundation to license data from the Securities and Exchange Commission, published the SEC data on the Internet and then operated it for two years. At the end of the grant, the SEC decided to make the EDGAR data available itself — albeit not without some significant prodding — and has continued to do so ever since. You can read the history behind putting periodic reports of public corporations online at Malamud’s website, public.resource.org.


Two decades later, Malamud is working to make the law public, reform copyright, and free up government data again, buying, processing and publishing millions of public tax filings from nonprofits to the Internal Revenue Service. He has made the bulk data from these efforts available to the public and anyone else who wants to use it.

“This is exactly analogous to the SEC and the EDGAR database,” Malamud told me, in a phone interview last year. The trouble is that data has been deliberately dumbed down, he said. “If you make the data available, you will get innovation.”

Making millions of Form 990 returns free online is not a minor public service. Although many nonprofits file their Form 990s electronically, the IRS does not publish the data. Rather, the government agency releases images of millions of returns formatted as .TIFF files onto multiple DVDs to people and companies willing and able to pay thousands of dollars for them. Services like Guidestar, for instance, acquire the data, convert it to PDFs and use it to provide information about nonprofits. (Registered users view the returns on their website.)

As Sam Roudman reported at TechPresident, Luke Rosiak, a senior watchdog reporter for the Washington Examiner, took the files Malamud published and made them more useful. Specifically, he used credits for processing that Amazon donated to participants in the 2013 National Day of Civic Hacking to make the .TIFF files text-searchable. Rosiak then set up CitizenAudit.org, a new website that makes nonprofit transparency easy.

“This is useful information to track lobbying,” Malamud told me. “A state attorney general could just search for all nonprofits that received funds from a donor.”

Malamud estimates nearly 9% of jobs in the U.S. are in this sector. “This is an issue of capital allocation and market efficiency,” he said. “Who are the most efficient players? This is more than a CEO making too much money — it’s about ensuring that investments in nonprofits get a return.”

Malamud’s open data is acting as a platform for innovation, much as legislation.gov.uk is in the United Kingdom. The difference is that it’s the effort of a citizen that’s providing the open data, not the agency: Form 990 data is not on Data.gov.

Opening Form 990 data should be a no-brainer for an Obama administration that has taken historic steps to open government data. Liberating nonprofit sector data would provide useful transparency into a $1.6 trillion sector of the U.S. economy.

After many letters to the White House and discussions with the IRS, however, Malamud filed suit against the IRS to release Form 990 data online this summer.

“I think inertia is behind the delay,” he told me, in our interview. “These are not the expense accounts of government employees. This is something much more fundamental about a $1.6 trillion dollar marketplace. It’s not about who gave money to a politician.”

When asked for comment, a spokesperson for the White House Office of Management and Budget said that the IRS “has been engaging on this topic with interested stakeholders” and that “the Administration’s Fiscal Year 2014 revenue proposals would let the IRS receive all Form 990 information electronically, allowing us to make all such data available in machine readable format.”

Today, Malamud sent a letter of complaint to Howard Shelanski, administrator of the Office of Information and Regulatory Affairs in the White House Office of Management and Budget, asking for a review of the pricing policies of the IRS after a significant increase year-over-year. Specifically, Malamud wrote that the IRS is violating the requirements of President Obama’s executive order on open data:

The current method of distribution is a clear violation of the President’s instructions to move towards more open data formats, including the requirements of the May 9, 2013 Executive Order making “open and machine readable the new default for government information.”

I believe the current pricing policies do not make any sense for a government information dissemination service in this century, hence my request for your review. There are also significant additional issues that the IRS refuses to address, including substantial privacy problems with their database and a flat-out refusal to even consider release of the Form 990 E-File data, a format that would greatly increase the transparency and effectiveness of our non-profit marketplace and is required by law.

It’s not clear at all whether the continued pressure from Malamud, the obvious utility of CitizenAudit.org or the bipartisan budget deal that President Obama signed in December will push the IRS to freely release open government data about the nonprofit sector.

The furor last summer over the IRS investigating the status of conservative groups that claimed tax-exempt status, however, could carry over into political pressure to reform. If political groups were tax-exempt and nonprofit e-file data were published about them, it would be possible for auditors, journalists and Congressional investigators to detect patterns. The IRS would need to be careful about scrubbing the data of personal information: last year, the IRS mistakenly exposed thousands of Social Security numbers when it posted 527 forms online, an issue that Malamud, as it turns out, discovered in an audit.

“This data is up there with EDGAR, in terms of its potential,” said Malamud. “There are lots of databases. Few are as vital to government at large. This is not just about jobs. It’s like not releasing patent data.”

If the IRS were to modernize its audit system, inspectors general could use automated predictive data analysis to find aberrations to flag for a human to examine, enabling government watchdogs and investigative journalists to potentially detect similar issues much earlier.

That level of data-driven transparency remains in the future. In the meantime, CitizenAudit.org is currently running on a server in Rosiak’s apartment.

Whether the IRS adopts it as the SEC did EDGAR remains to be seen.

[Image Credit: Meals on Wheels]

U.K. National Archives makes ‘good law’ online, builds upon open data as a platform


This September, I visited the United Kingdom’s Ministry of Justice and looked at the last remaining section of the Magna Carta that remains in effect. I was not, however, in a climate-controlled reading room, looking at a parchment or sheepskin.


Rather, I was sitting in the Ministry’s sunny atrium, where John Sheridan was showing me the latest version of the seminal legal document, now living online, on his laptop screen. The remaining section that is in force is rather important to Western civilization and the rule of law as many citizens in democracies now experience it:

NO Freeman shall be taken or imprisoned, or be disseised of his Freehold, or Liberties, or free Customs, or be outlawed, or exiled, or any other wise destroyed; nor will We not pass upon him, nor [X1condemn him,] but by lawful judgment of his Peers, or by the Law of the Land. We will sell to no man, we will not deny or defer to any man either Justice or Right.


From due process to eminent domain to a right to a jury trial, many of the rights that American or British citizens take as a given today have their basis in the English common law that stems from this document.

I’d first met Sheridan virtually, back in August 2010, when I talked with the head of e-services and strategy at the United Kingdom’s National Archives about how linked data was opening up eight hundred years of legal history. That month, the National Archives launched legislation.gov.uk to provide public access to more than eight centuries of legal history in England, Scotland, Wales and Northern Ireland. Just over three years later, I stepped off the Tube at the St. James’s Park station and walked over to meet him in person and learn how his aspirations for legislation.gov.uk had met up with reality.

Over a cup of tea, Sheridan caught me up on the progress that his team has made in digitizing documents and improving the laws of the land. There are now 2 million unique visitors to legislation.gov.uk every month, with 500+ million page views annually. People really are reading Parliament’s output, he observed, and increasingly doing so on tablets and mobile devices. The amount of content flowing into the site is considerable: according to Sheridan, the United Kingdom is passing laws at an estimated rate of 100,000 words every month, or twice as much as the complete works of Shakespeare.

Notable improvements over the years include the ability to compare the original text of legislation versus the latest version (as we did with the Magna Carta) and view a timeline of changes using a slider for navigation, exploring any given moment in time. Sheridan was particularly proud of the site’s rendering of legislation in HTML, including human-readable permanent uniform resource locators (URLs) and the capacity to produce on-demand PDFs of a given document. (This isn’t universally true: I found that some orders still appear as PDFs.)

More specifically, Sheridan highlighted a “good law” project, wherein the Office of the Parliamentary Counsel (OPC) of Britain is working to help develop plain language laws that are “necessary, clear, coherent, effective and accessible.” A notable component of this good law project is an effort to apply a tool used in online publishing, software development and advertising — A/B testing — to testing different versions of legislation for usability.

The video of a TedX talk embedded below by Richard Heaton, the permanent secretary of the United Kingdom’s Cabinet Office and first parliamentary counsel, explores the idea of “good law” at more length:

Sheridan went on to describe one of the more ambitious online collaborations between a government and its citizens I had heard of to date, a novel cross-Atlantic challenge co-sponsored by the UK and US governments, and a hairy legal technology challenge bearing down upon societies everywhere: what happens when software interprets the law?

For instance, he suggested, consider the increasing use of Oracle software around legislation. “As statutes are interpreted by software, what’s introduced by the code? What about quality testing?”

As this becomes a data problem, “you need information to contextualize it,” said Sheridan. “If you’re thinking about legislation as code, and as data, it raises huge questions for the rule of law.”

Open data as a platform

In the video below, John Sheridan talks about the benefits of opening up government data using application programming interfaces:

Sheridan has been one of the world’s foremost proponents of publishing legislative data through APIs, an approach that has come under criticism by open government data advocates after the government shutdown in the United States. (In 2014, forward-thinking governments publishing open data might consider providing basic visualization tools to site visitors, API access for third-party developers and internal users, and bulk data downloads.) One key difference between the approach of his team and other government entities might be that the National Archives are “dogfooding,” or consuming the same data through the same interface that they expect third parties to use, as Sheridan wrote last March:

“We developed the API and then built the legislation.gov.uk website on top of it. The API isn’t a bolt-on or additional feature, it is the beating heart of the service. Thanks to this approach it is very easy to access legislation data – just add /data.xml or /data.rdf to any web page containing legislation, or /data.feed, to any list or search results. One benefit of this approach is that the website, in a way, also documents the API for developers, helping them understand this complex data.”
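The URL convention Sheridan describes is simple enough to sketch. The Python helper below turns the address of a legislation page, list or search result into its data address by appending the suffix he names. The function name is my own, the example address is only an illustration, and the assumption that any query string stays in place after the suffix is mine rather than something the quote confirms.

```python
from urllib.parse import urlsplit, urlunsplit

# Suffixes named in Sheridan's description of the legislation.gov.uk API.
FORMAT_SUFFIXES = {
    "xml": "/data.xml",    # for a page containing legislation
    "rdf": "/data.rdf",    # for a page containing legislation
    "feed": "/data.feed",  # for a list or search results
}


def data_url(page_url, fmt):
    """Build the data URL for a page by appending the format suffix to its path.

    Hypothetical helper: keeping the query string after the suffix is an
    assumption, not documented behavior.
    """
    if fmt not in FORMAT_SUFFIXES:
        raise ValueError("unknown format: %r" % fmt)
    parts = urlsplit(page_url)
    path = parts.path.rstrip("/") + FORMAT_SUFFIXES[fmt]
    # Drop any fragment; keep scheme, host and query unchanged.
    return urlunsplit((parts.scheme, parts.netloc, path, parts.query, ""))


if __name__ == "__main__":
    # Illustrative address of a single Act on legislation.gov.uk.
    page = "https://www.legislation.gov.uk/ukpga/2010/15"
    print(data_url(page, "xml"))
    print(data_url(page, "rdf"))
```

Because the website and the API share one address scheme, a developer who can browse to a piece of legislation already knows where its data lives.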

Perhaps because of that perspective, Sheridan was as supportive of APIs when we talked this September as he had been in 2012:

The legislation.gov.uk API has changed everything for us. It powers our website. It has enabled us to move to an open data business model, securing the editorial effort we need from the private sector for this important source of public data. It allows us to deliver information and services across channels and platforms through third party applications. We are developing other tools that use the API, using Linked Data – from recording the provenance of new legislation as it is converted from one format to another, to a suite of web based editorial tools for legislation, including a natural language processing capability that automatically identifies the legislative effects. Everything we do is underpinned by the API and Linked Data. With the foundations in place, the possibilities of what can be done with legislation data are now almost limitless.

Sheridan noted to me that the United Kingdom’s legislative open government data efforts are now acting as a platform for large commercial legal publishers and new entrants, like the mobile legislative app iLegal.

The iLegal app content is derived from the legislation.gov.uk API and offers handy features, like offline access to all items of legislation. iLegal currently costs £49.99/$74.99 annually or £149.99/$219.99 for a lifetime subscription, which might seem steep but is a fraction of the cost of Halsbury’s Statutes, currently listed at £9,360.00 from Lexis-Nexis.

This approach to publishing the laws of the land online, in structured form under an open license, is an instantiation of the vision for Law.gov that citizen archivist Carl Malamud has been advocating for in the United States. (2013 saw some progress in that vein when the U.S. House of Representatives published the U.S. Code as open government data.)

What’s notable about the United Kingdom’s example, however, is that less than a decade ago, none of this would have been possible. Why? As ScraperWiki founder Francis Irving explained, the UK’s database of laws was proprietary data until December 2006. Now, however, the law of the land is released back to the people as it is updated, a living code available in digital form to any member of the public that wishes to read or reuse it.

The United Kingdom, however, has moved beyond simply publishing legislation as open data: they’re actively soliciting civic participation in its maintenance and improvement. For the last year, the National Archives has been guiding the world’s leading commercial open data curation project.

“We are using open data as business model for fulfilling public services,” said Sheridan, in our interview. “We train people to do editorial work. They are paid to improve data. The outputs are public.”

In other words, the open government data always remains free to the people through legislation.gov.uk but any academic, nonprofit or commercial entity can act to add value to it and sell access to the resulting applications, analyses or interfaces.

As far as Sheridan could recall, this was the only example in the government of the United Kingdom where such a feedback loop exists. The closest parallels in the United States are the U.S. Agency for International Development crowdsourcing the geocoding of 117,000 loan records with the help of online volunteers and the citizen archivist program of the U.S. National Archives.

Since the start of the UK project, they have doubled the number of people working on their open data, Sheridan told me. “The bottleneck is training,” he said. “We have almost unlimited editorial expertise available through our website. We define the process and rules, and then let anyone contribute. For example, we’re now working on revising legislation, identifying changes, researching it — when it comes in, what it affects — and then working with editor. Previous to this effort, government hasn’t been able to revise secondary legislation.”

Sheridan said that the next step is feedback for other editorial values.

“We’re looking for more experts,” he said. “They’re generally paid for by someone. It’s very close to open source software model. They must be able to demonstrate competence. There’s a 45-minute test, which we’ve now given to thousands of people.”

If this continues to work, distributed online collaboration is a “brilliant way to help improve the quality of law,” said Sheridan.

“It’s a way to get the work done — and the work is really hard. You have to invest time and energy, and you must protect the reputation of the Archive. This is somewhat radical for the nation’s statute book. We have redesigned the process so people can work with us. It’s not a wiki, but participation is open. It’s peer production.”

A trans-Atlantic challenge to map legislative data


In September, Sheridan also told me about an unusual challenge that has just gone live at Challenge.gov, the United States’ flagship prizes and competitions platform: a contest to assess the compatibility of Akoma Ntoso with U.S. Congress and U.K. Parliament markup languages.

The U.K. National Archives and U.S. Library of Congress have asked for help mapping elements from bills to the most recent Akoma Ntoso schema. (Akoma Ntoso is an emerging global standard for machine-readable data describing parliamentary, legislative and judiciary documents.) The best algorithm that maps U.S. bill XML or UK bill XML to Akoma Ntoso XML, including necessary data files and supporting documentation, will win $10,000.
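To give a rough sense of what such a mapping involves, here is a minimal Python sketch that renames elements from a toy, U.S.-bill-style XML fragment to Akoma Ntoso-style element names. The tag pairs in `TAG_MAP`, the sample fragment, and the helper names are assumptions chosen for illustration; they are not taken from the challenge materials, and the output is not schema-valid Akoma Ntoso (a real mapping would also need namespaces, wrapper elements, and metadata).

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily simplified tag mapping (source tag -> target tag).
# The real schemas are far larger; these pairs are illustrative assumptions.
TAG_MAP = {
    "section": "section",
    "enum": "num",
    "header": "heading",
    "text": "content",
}


def map_element(src):
    """Recursively copy an element, renaming tags found in TAG_MAP.

    Tags without a mapping are copied unchanged so nothing is silently lost.
    """
    dst = ET.Element(TAG_MAP.get(src.tag, src.tag), dict(src.attrib))
    dst.text = src.text
    dst.tail = src.tail
    for child in src:
        dst.append(map_element(child))
    return dst


def unmapped_tags(src):
    """Return the sorted source tags that have no entry in TAG_MAP."""
    return sorted({el.tag for el in src.iter() if el.tag not in TAG_MAP})


def convert(xml_string):
    """Parse a source fragment and return the renamed fragment as a string."""
    return ET.tostring(map_element(ET.fromstring(xml_string)), encoding="unicode")


# Toy input fragment, invented for this example.
SAMPLE = (
    '<section id="s1"><enum>1.</enum><header>Short title</header>'
    "<text>This Act may be cited as the Example Act.</text></section>"
)

if __name__ == "__main__":
    print(convert(SAMPLE))
    print(unmapped_tags(ET.fromstring(SAMPLE)))
```

Reporting the unmapped tags alongside the converted output is one simple way to measure how much of a source schema a mapping actually covers, which is the kind of compatibility question the challenge poses.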

If you have both skills and interest, get cracking: the challenge closes on December 31, 2013.

What is the value of open data?

This morning, the New America Foundation hosted a forum on the value of open data. Archived video of the event is embedded below:



The event featured comments from deputy United States chief technology officer Nick Sinai, the authors of the McKinsey report on the economic value of open data, and a panel of experts, moderated by yours truly.

Advocates Release Best Practices for Making Open Government Data “License-Free”

As more and more governments release data around the world, the conditions under which it is published and may be used will become increasingly important. Just as open formats make data easier to put to work, open licenses make it possible for all members of the public to use it without fear.

Given that wonky but important issue, governments that want to maximize the rewards of the work involved in cleaning and publishing open government data need to get the policy around its release right. Today, several open government advocates have released an updated Best-Practices Language for Making Data “License-Free”, which can be found online at theunitedstates.io/licensing.

“In short what we say is ‘Use Creative Commons Zero (CC0),’ which is a public domain dedication,” said Josh Tauberer, the founder of Govtrack.us, via email. “We provide recommended language to put on government datasets and software to put the data and code into the world-wide public domain. In a way, it’s the opposite of a license.”

Tauberer, Eric Mill, developer at the Sunlight Foundation, and Jonathan Gray, director of policy and ideas at the Open Knowledge Foundation, who have been working on the guidance since May, all blogged about the new guidance:

“Back in May, the Administration’s Memorandum on Open Data created very confusing guidance for agencies about what constitutes open data by saying open data should be ‘openly licensed’,” explained Tauberer, via email. “In response to that, we began working on guidance for federal agencies for how to make sure their data is open under the definition in the 8 Principles of Open Government Data.”

The basic issue, he said, is that the memorandum directed agencies to make data open but, in the view of these advocates, told agencies the wrong thing about what open data actually means. “We’re correcting that with precise, actionable direction,” said Tauberer.

What would the consequences of United States government entities not adopting this guidance be?

“Because M-13-13 required open licensing as the new default, I worry about agencies taking the guidance too literally and applying licensing where they might not have before, even if the work is exempt from copyright,” said Tauberer. “Or they may now consider open licensing of works produced by a contractor to be the new norm, since it is permitted by M-13-13, but for certain core information produced by government this would be a major step backward.”

Getting ahead of these kinds of issues is not an abstract concern, much like the worries about language regarding the “mosaic effect” in the U.S. open data policy.

“Imagine if after FOIA’ing an agency’s deliberative documents, The New York Times was legally required to provide attribution to a contractor, or, worse, to the government itself,” said Tauberer. “The federal government is relying more and more on contractors and lawyers, so it’s important that we reinforce these norms now.”

The language has been endorsed by many of the prominent open government advocates in the world, including the Sunlight Foundation, the Open Knowledge Foundation, Public Knowledge, The Center for Democracy and Technology, The Electronic Frontier Foundation, The Free Law Project, the OpenGov Foundation, Carl Malamud at Public.Resource.Org, Jim Harper at WashingtonWatch.com, Citizens for Responsibility and Ethics in Washington, and MuckRock News.

While it remains to be seen if the White House Office of Management and Budget merges this best practice into its open data policy, the advocates have already had success getting it adopted.

“Since we first published the guidance in August, it’s led to three government projects using our advice,” said Tauberer. “Partly in response to our nudging, in October OSTP’s Project Open Data re-licensed its schema for federal data catalog inventory files. (It had been licensed under CC-BY because of non-governmental contributors to the schema, but now it uses CC0.) In September and October, The CFPB followed our guidance and applied CC0 to their “qu” project and their eRegs platform.”