Congress releases open data on bill status

[Image: The U.S. Capitol dome in the sun]

Imagine searching Facebook, Google or Twitter for the status of a bill before Congress and getting an instant result. That future is now here, but it’s not evenly implemented yet.

When the Library of Congress launched Congress.gov in 2012, it failed to release the data behind the site. Yesterday, that changed: the United States Congress started releasing data online about the status of bills.

For the open government advocates, activists and civic hackers who have worked toward this moment for over a decade, seeing Congress turn on the data tap was a historic shift.


Congressional leaders from both sides of the aisle applauded the release of House and Senate bill status information by the U.S. Government Printing Office and Library of Congress.

“Today’s release of bill status information via bulk download is a watershed moment for Congressional transparency,” said House Majority Leader Kevin McCarthy (R-CA), in a statement. “By modernizing our approach to government and increasing public access to information, we can begin to repair the relationship between the people and their democratic institutions. The entire Congressional community applauds the dedication of the Legislative Branch Bulk Data Task Force, the Office of the Clerk, the House Appropriations Committee, GPO, and the Library of Congress, which worked together to make this progress possible.”

“Building off previous releases of bills and summaries, today’s release of bill status information largely completes the overarching goal of providing bulk access to all the legislative data that traditionally has been housed on Thomas.gov and now also resides on Congress.gov,” said Democratic Whip Steny Hoyer (D-MD). “This is a major accomplishment that has been many years in the making. It goes a long way toward making Congress more transparent and accessible to innovation through third party apps and systems. I applaud the dedicated civil servants who made this possible at the Legislative Branch service agencies, and I want to thank the Bulk Data Task Force for their leadership in this effort. While this largely completes a major goal of the Task Force, I look forward to continuing to work with them to further modernize the U.S. Congress.”

The impact of open government data releases depends upon publicity and political agency. Releasing the status of bills before Congress in a way that can be built upon by third-party apps and services is a critical, laudable step in that direction, but much more remains to be done to make the data more open and to put it to use and re-use. If the Library of Congress opens up an application programming interface for the data that supplies both Congress.gov and the public, it would help to reduce the asymmetry of legislative information between the public and the elites who can afford to pay for Politico’s Legislative Compass or Quorum Analytics, which is the status quo today.

In an era when Congress’s job approval ratings and trust in government are at historic lows, the shift didn’t make news beyond the Beltway. GovTrack.us, which is based upon data scraped from the Library of Congress, has been online for years. Until this XML data is used by media and technology companies in ways that give the public more understanding of what Congress is doing on its behalf and more influence in the legislative process, that’s unlikely to change quickly.
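For developers wondering what consuming the new release might look like, here is a minimal sketch in Python that fetches one bill’s status file from GPO’s bulk data repository and prints its title and latest action. The URL pattern and XML element names are assumptions for illustration, not the documented schema, so check GPO’s documentation for the authoritative layout.

```python
# Minimal sketch: fetch one bill's status XML from GPO's bulk data repository
# and print its title and latest action. The URL pattern and element names
# below are assumptions for illustration, not the documented schema.
import urllib.request
import xml.etree.ElementTree as ET

# Assumed path: files appear to be organized by Congress, chamber and bill number.
URL = "https://www.gpo.gov/fdsys/bulkdata/BILLSTATUS/114/hr/BILLSTATUS-114hr2029.xml"

with urllib.request.urlopen(URL) as response:
    root = ET.parse(response).getroot()

bill = root.find("bill")  # assumed layout: <billStatus><bill>...</bill>
if bill is not None:
    print(bill.findtext("title", default="(no title)"))
    print("Latest action:", bill.findtext("latestAction/text", default="(none recorded)"))
```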

U.S. Civil Society Groups release model National Open Government Action Plan

This is the week for seeking feedback on open government in the United States. Four days ago, the White House published a collaborative online document that digitized the notes from an open government workshop held during Sunshine Week in March. Today, Abby Paulson from OpenTheGovernment.org uploaded a final draft of a Model National Action Plan to the Internet as a .doc. I’ve uploaded it to Scribd and embedded it below for easy browsing.

Paulson shared the document over email with people who contributed to the online draft.

Thank you so much for contributing to the civil society model National Action Plan. The Plan has made its way from Google Site to Word doc (attached)! We will share these recommendations with the White House, and I encourage you to share your commitments with any government contacts you have. If you notice any errors made in the transition from web to document, please let me know. If there are any other organizations that should be named as contributors, we will certainly add them as well. The White House’s consultation for their plan will continue throughout the summer, so there are still opportunities to weigh in. Additional recommendations on surveillance transparency and beneficial ownership are in development. We will work to secure meetings with the relevant agencies and officials to discuss these recommendations and make a push for their inclusion in the official government plan. So, expect to hear from us in the coming weeks!

Half empty or half full? Mixed reactions to Pew research on open data and open government

Yesterday, I wrote up 15 key insights from the Pew Internet & American Life Project’s new research on the American public’s attitudes toward open data and open government. If you missed it: what people think about government data and the potential impact of releasing it is heavily influenced by prevailing low trust in government and by their politics.

[Image: Pew Research chart on mixed hopes that open data will improve government]

Media coverage of the survey reflected the skepticism of the reporters (“Most Americans don’t think government transparency matters a damn”) or of the public (“Who cares about open data” and “Americans not impressed by open government initiatives”). This photo by Pete Souza below might be an apt image for this feeling:

[Image: Pete Souza photo of President Obama looking unimpressed]

Other stories pulled out individual elements of the research (“Open data on criminals and teachers is a-okay, say most US citizens”), mixed results (“People Like U.S. Open Data Initiatives, But Think Government Could Do More” and “Sorry, open data: Americans just aren’t that into you”) or general doubts about an unfamiliar topic (“Many Americans Doubt Government Open Data Efforts”). At least one editor’s headline suggested that the results were an indictment of everything government does online (“Americans view government’s online services and public data sharing as a resounding ‘meh’”). Meh, indeed.

As usual, keep a salt shaker handy as you browse the headlines and read the original source. The research itself is more nuanced than those headlines suggest, as my interview with the lead researcher on the survey, John Horrigan, hopefully made clear.

Over at TechPresident, editor-in-chief Micah Sifry saw a glass half full:

  • Digging deeper into the Pew report, it’s interesting to find that beyond the “ardent optimists” (17% of adults) who embrace the benefit of open government data and use it often, and the “committed cynics” (20%) who use online government resources but think they aren’t improving government performance much, there’s a big group of “buoyant bystanders” (27%) who like the idea that open data can improve government’s performance but themselves aren’t using the internet much to engage with government. (Heads up Kate Krontiris, who’s been studying the “interested bystander.”)
  • It’s not clear how much of the bystander problem is also an access problem. According to a different new analysis done by the Pew Research Center, about five million American households with school-age children–nearly one in five–do not have high-speed internet access at home. This “broadband gap” is worst among households with incomes under $50,000 a year.

Reactions from foundations that have advocated for, funded or otherwise supported open government data efforts went deeper. Writing for the Sunlight Foundation, communications director Gabriela Schneider viewed the results in a rosy (sun)light, seeing public optimism about open government and open data.

People are optimistic that open data initiatives can make government more accountable. But, many surveyed by Pew are less sure open data will improve government performance. Relatedly, Americans have not quite engaged very deeply with government data to monitor performance, so it remains to be seen if changes in engagement will affect public attitudes.

That’s something we at Sunlight hope to positively affect, particularly as we make new inroads in setting new standards for how the federal government discloses its work online. And as Americans shift their attention away from Congress and more toward their own backyards, we know our newly expanded work as part of the What Works Cities initiative will better engage the public, make government more effective and improve people’s lives.

Jonathan Sotsky, director of strategy and assessment for the Knight Foundation, saw a trust conundrum for government in the results:

Undoubtedly, a greater focus is needed on explaining to the public how increasing the accessibility and utility of government data can drive accountability, improve government service delivery and even provide the grist for new startup businesses. The short-term conundrum government data initiatives face is that while they ultimately seek to increase government trustworthiness, they may struggle to gain traction because the present lack of trust in government undermines their perceived impact.

Steven Clift, the founder of e-democracy.org, views this survey as a wake-up call for open data advocates.

One reason I love services like CityGram, GovDelivery, etc. is that they deliver government information (often in a timely way) to the public based on their preferences/subscriptions. As someone who worked in “e-government” for the State of Minnesota, I think most people just want the “information” that matters to them and the public has no particular attachment to the idea of “open data” allowing third parties to innovate or make this data available. I view this survey as a huge wake up call to #opengov advocates on the #opendata side that the field needs to provide far more useful stuff to the general public and care a lot more about outreach and marketing to reach people with the good stuff already available.

Mark Headd, former chief data officer for the City of Philadelphia and now a developer evangelist at Accela, saw the results as a huge opportunity to win hearts and minds:

The modern open data and civic hacking movements were largely born out of the experience of cities. Washington DC, New York City and Chicago were among the first governments to actively recruit outside software developers to build solutions on top of their open data. And the first governments to partner with Code for America – and the majority over the life of the organization’s history – have been cities.

How do school closings impact individual neighborhoods? How do construction permit approvals change the character of communities? How is green space distributed across neighborhoods in a city? Where are vacant properties in a neighborhood – who owns them and are there opportunities for reuse?

These are all the kinds of questions we need people living and working in neighborhoods to help us answer. And we need more open data from local governments to do this.

If you see other blog posts or media coverage that’s not linked above, please let me know. I storified some reactions on Twitter but I’m certain that I missed conversations or opinions.

[Image: Pew Research chart showing few think government data sharing is effective]

There are two additional insights from Pew that I didn’t write about yesterday that are worth keeping in mind with respect to how Americans are thinking about the release of public data back to the public. First, it’s unclear whether the public realizes they’re using apps and services built upon government data, even though sizable majorities are doing so.

Second, John Horrigan told me that survey respondents are not simply asking for governments to make the data easier to understand so that they can figure things out on their own: what people really want is intermediaries to help them make sense of the data.

“We saw a fair number of people pleading in comments for better apps to make the data make sense,” said Horrigan. “When they went online, they couldn’t get budget data to work. When they found traffic data, they couldn’t make it work. There were comments on both sides of the ledger. Those that think government did an OK job wish they did this. Those that think government is doing a horrible job also wish they did this.”

This is the opportunity that Headd referred to, and the reason that data journalism is a critical capacity that democratic governments must ensure can flourish in civil society if they genuinely want to see returns on accountability and transparency.

Given how strongly partisanship colored these attitudes, if a Republican is elected as the next President of the United States, we’ll see whether public views shift on other fronts.

USASpending.gov addresses some data issues, adds Github issues tracker for feedback

[Image: USASpending.gov]

On April 1st, some reporters, open government advocates and people in industry may have hoped that the redesign of USASpending.gov, the flagship financial transparency website of the United States government, was just a poorly conceived April Fools’ joke. Unfortunately, an official statement about the redesign on the U.S. Treasury’s blog confirmed that it was real. Analysts, media and businesses that rely on the site’s contracting data loudly decried its decreased functionality.

A week later, there’s still no evidence of deliberate intent on the part of Treasury not to publish accurate spending data or to break the tool, despite headlines about rolling back transparency. Rather, it looks more likely that a number of mistakes, or even unavoidable errors, were made in transitioning the site and data from a bankrupt federal contractor. There was certainly poor communication with the business community and the advocates who use the site, a failure that Luke Fretwell, writing at Govfresh, helpfully suggested other government agencies work to avoid next time.

Today, as Fretwell first reported, the federal government launched a new repository for tracking issues on USASpending.gov on Github, the social coding site that’s become an increasingly important platform for 18F, which committed to developing free and open source software by default last year.

In an email to the White House’s open government Google Group, Corinna Zarek, the senior advisor for open government in the Obama administration, followed up on earlier concerns about the redesign:

The USAspending team has been working to improve the usability of the site and has made some great strides to make it easier for average citizens to navigate information. But at the same time, we all understand that some of our expert users (like a lot of you) seek more technical information and the team is striving to meet your needs as well.

This is definitely a work in progress so please keep working with the team as it iterates on the best ways to improve function of the site while maintaining the content you seek. Your initial comments have been really helpful and the USAspending team is already working to address some of them.

Zarek also said that several of the data problems people have reported have been addressed, including the capacity to download larger data sets and to define specific dates in search, and asked for more feedback.

Specifically, this week the team addressed data export issues to allow the ability to specify date ranges to download data, added the bulk file format API, and modified the download capability so larger datasets can be downloaded. Additionally, data archives are being added continually. This week, they loaded the 2014 and 2015 delta files that show the new transactions in the last month. You can keep track of the ongoing improvements on the “What’s new” page.

Please keep sharing your feedback and continue working with the USAspending team as it makes improvements to the site. You can do this through the site’s contact page or on the new Github page where you can report issues and track them in the open.

If you find bugs, let the feds know about them on Github so that everyone can see the issues and how they’re addressed. As Mollie Walker reported for FierceGovernmentIT, there’s still missing functionality yet to be restored.

[Image Credit: Govfresh, via USASpending.gov]

U.S. government launches online traffic analytics dashboard for federal websites

There are roughly 1,361 .gov domains operated by the executive branch of the United States federal government, 700 to 800 of which are live and in active use. Today, for the first time, the public can see how many people are visiting some 300 executive branch domains in real time, including every cabinet department, by visiting analytics.usa.gov.

According to a post on the White House blog, the United States Digital Service “will use the data from the Digital Analytics Program to focus our digital service teams on the services that matter most to the American people, and analyze how much progress we are making. The Dashboard will help government agencies understand how people find, access, and use government services online to better serve the public – all while protecting privacy.  The program does not track individuals. It anonymizes the IP addresses of all visitors and then uses the resulting information in the aggregate.”

On Thursday morning, March 19th, tax-related services, weather, and immigration status are all popular. Notably, there’s an e-petition on the White House WeThePeople platform listed as well, adding data-driven transparency to what’s popular there right now.
[Image: analytics.usa.gov dashboard showing the U.S. government’s web traffic]
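The numbers behind the dashboard are published as plain JSON, so anyone can pull them programmatically. Here is a minimal sketch that reads the real-time report and prints the count of active visitors; the report path and field names are assumptions based on how the dashboard’s data files appear to be organized, not a documented contract.

```python
# Minimal sketch: read the dashboard's underlying JSON report and print the
# current number of active visitors. The report path and the field names
# are assumptions, not a documented API contract.
import json
import urllib.request

REPORT_URL = "https://analytics.usa.gov/data/live/realtime.json"  # assumed path

with urllib.request.urlopen(REPORT_URL) as response:
    report = json.load(response)

# The reporting tool appears to wrap each report's rows in a "data" array (assumed).
active = report["data"][0]["active_visitors"]
print(f"People on .gov sites right now: {active}")
```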

Former United States deputy chief technology officer Nick Sinai is excited about seeing the Web analytics data opened up online. Writing for the Harvard Shorenstein Center, where he is currently a fellow, Sinai adds some context for the new feature:

“Making government web performance open follows the digital services playbook from the new U.S. Digital Service,” he wrote. “Using data to drive decisions and defaulting to open are important strategies for building simple and useful citizen-facing digital services. Real-time and historical government web performance is another example of how open government data holds the promise of improving government accountability and rebuilding trust in government.”

Here’s what the U.S. Digital Service team says it has already learned from analyzing this data:

  • Our services must work well on all devices. Over the past 90 days, 33% of all traffic to our sites came from people using phones and tablets. Over the same period last year, the number was 24%. Most of this growth came from an increase in mobile traffic. Every year, building digital services that work well on small screens becomes more important.
  • Seasonal services and unexpected events can cause surges in traffic. As you might expect, tax season is a busy time for the IRS. This is reflected in visits to pages on IRS.gov, which have more than tripled in the past 90 days compared with the previous quarter. Other jumps in traffic are less easy to predict. For example, a recently-announced settlement between AT&T and the Federal Trade Commission generated a large increase in visits to the FTC’s website. Shortly after the settlement was announced, FTC.gov had four times more visitors than the same period in the previous year. These fluctuations underscore the importance of flexibility in the way we deploy our services so that we can scale our web hosting to support surges in traffic as well as save money when our sites are less busy.
  • Most people access our sites using newer web browsers. How do we improve digital services for everyone when not all web browsers work the same way? The data tells us that the percentage of people accessing our sites using outdated browsers is declining steadily. As users adopt newer web browsers, we can build services that use modern features and spend less time and money building services that work on outdated browsers. This change will also allow us to take advantage of features found in modern browsers that make it easier to build services that work well for Americans with disabilities, who access digital services using specialized devices such as screen readers.

If you have ideas, feedback or questions, the team behind the dashboard is working in the open on Github.

Over the coming months, we will encourage more sites to join the Digital Analytics Program, and we’ll include more information and insights about traffic to government sites with the same open source development process we used to create the Dashboard. If you have ideas for the project, or want to help improve it, let us know by contributing to the project on GitHub or emailing digitalgov@gsa.gov.

That last bit is notable; as is true of all the projects 18F works on, this analytics dashboard is open source software.

There are some interesting additional details in 18F’s blog post on how the analytics dashboard was built, including the estimate that it came together “over the course of 2-3 weeks,” with usability testing at a “local civic hacking meetup.”

First, that big number is made with HTML and D3, a JavaScript library, which downloads and renders the data. Using open standards means it renders well across browsers and mobile devices.

Second, 18F made an open source tool to manage the data reporting process called “analytics-reporter” that downloads Google Analytics reports and transforms that data into JSON.
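In other words, the reporting pipeline is a straightforward transformation: pull rows out of a Google Analytics report and flatten them into self-describing JSON records that the D3 front end can chart. The sketch below illustrates the idea in Python; the report shape is simplified and assumed, not analytics-reporter’s actual schema.

```python
# Illustrative sketch of the reporting step: zip Google Analytics-style rows
# against their column headers to produce JSON records for a front end.
# The input shape here is simplified and assumed.
import json

raw_report = {
    "columnHeaders": ["date", "visits"],
    "rows": [["2015-03-18", "1480000"], ["2015-03-19", "1525000"]],
}

records = [dict(zip(raw_report["columnHeaders"], row)) for row in raw_report["rows"]]
print(json.dumps({"name": "visits", "data": records}, indent=2))
```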

Hopefully, in the years ahead, the American people will see more than traffic to .gov websites: they’ll see concrete performance metrics like those the United Kingdom’s Government Digital Service publishes at gov.uk/performance, including uptime, completion rate and satisfaction rate.

In the future, if the public can see the performance of Healthcare.gov, including glitches, or of other government digital services, perhaps the people building and operating them will be held more accountable for uptime and quality of service.

National Security Archive finds 40% E-FOIA compliance rate in federal government agencies

[Image: Under construction graphic]

For Sunshine Week 2015, the National Security Archive conducted an audit of how well 165 federal government agencies in the United States comply with the E-FOIA Act of 1996. It found that only 67 of them had online libraries that were regularly updated with a significant number of documents released under the Freedom of Information Act. To be included in the audit, an agency had to have a chief Freedom of Information Officer and components that handled more than 500 FOIA requests annually.

Almost two decades after the E-FOIA Act, that’s about a 40% compliance rate. I wonder whether the next U.S. Attorney General or the next presidential administration will make improving on this poor performance a priority. It’s important for the United States Department of Justice not only to lead by example but to push agencies into the 21st century when it comes to the Freedom of Information Act.

It would certainly help if Congress passed FOIA reform.

On that count, the Archive highlights a relevant issue in the current House and Senate FOIA reform bills: the FOIA statute states that documents that are “likely to become the subject of subsequent requests” should be published in electronic reading rooms:

The Department of Justice’s Office of Information Policy defines these records as “frequently requested records… or those which have been released three or more times to FOIA requesters.” Of course, it is time-consuming for agencies to develop a system that keeps track of how often a record has been released, which is in part why agencies rarely do so and are often in breach of the law. Troublingly, both the current House and Senate FOIA bills include language that codifies the instructions from the Department of Justice.

The National Security Archive believes the addition of this “three or more times” language actually harms the intent of the Freedom of Information Act as it will give agencies an easy excuse (“not requested three times yet!”) not to proactively post documents that agency FOIA offices have already spent time, money, and energy processing. We have formally suggested alternate language requiring that agencies generally post “all records, regardless of form or format that have been released in response to a FOIA request.”

This is a point that Members of Congress should think through carefully as they take another swing at reform. As I’ve highlighted elsewhere, the FOIA requests that industry makes are an important demand signal showing where data with economic value lies. (It’s also where the public interest tends to lie, with respect to FOIA requests from the media.)

While it’s true that it would take time and resources to build and maintain a system that tracks such requests by industry, there should already be a money trail in the fees paid to each agency. If FOIA reform leads to modernizing how the law is implemented, perhaps FOIA.gov might finally be tied to Data.gov. The datasets that are the subject of the most FOIA requests are the ones that should be prioritized for proactive disclosure online.

Adding a component that identifies which data sets are frequently requested, and how often, should be a priority across the board for any administration that seeks to “manage information as an asset.” Adding the volume and periodicity of requests to the expanding national enterprise data inventory might naturally follow. It’s worth noting, too, that reform of the FOIA statute may not be necessary to achieve this end if the 18F team working on modernizing FOIA software took it on.
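Once releases are logged, the counting itself is simple; the hard part agencies cite is building and maintaining the logging system. Here is a toy sketch of the tally step; the release log and record names are hypothetical.

```python
# Toy sketch: tally how many times each record has been released under FOIA
# and flag anything at the DOJ's "three or more times" threshold for
# proactive posting. The log and record names are hypothetical.
from collections import Counter

release_log = [  # hypothetical (request_id, record_id) pairs
    ("req-001", "budget-2014.xlsx"),
    ("req-002", "budget-2014.xlsx"),
    ("req-003", "budget-2014.xlsx"),
    ("req-004", "travel-records.pdf"),
]

releases = Counter(record for _, record in release_log)
frequently_requested = [rec for rec, n in releases.items() if n >= 3]
print("Post proactively:", frequently_requested)  # ['budget-2014.xlsx']
```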

In a step towards sunlight, United States begins to publish a national data inventory

Last year, a successful Freedom of Information request by the Sunlight Foundation for the United States enterprise data inventory was a big win for open government, nudging Uncle Sam towards a better information policy through some creative legal arguments. Today, the federal government started releasing its enterprise data inventories at Data.gov. You can browse the data for individual agencies, like the feed for the Office of Personnel Management, using a JSON viewer like this one.
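Because the inventories follow the Project Open Data /data.json convention, a few lines of code are enough to list what an agency says it holds. Below is a minimal sketch; the OPM catalog URL is illustrative, and any agency publishing /data.json should work the same way.

```python
# Minimal sketch: read an agency's Project Open Data catalog (/data.json)
# and list the first few dataset titles. The OPM URL is illustrative.
import json
import urllib.request

CATALOG_URL = "https://www.opm.gov/data.json"

with urllib.request.urlopen(CATALOG_URL) as response:
    catalog = json.load(response)

# Project Open Data v1.1 catalogs keep entries in a "dataset" array;
# earlier v1.0 files were a bare array, so handle both shapes.
datasets = catalog.get("dataset", []) if isinstance(catalog, dict) else catalog
for entry in datasets[:5]:
    print(entry.get("title", "(untitled)"))
```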

“Access to this data will empower journalists, government officials, civic technologists, innovators and the public to better hold government accountable,” said Sunlight Foundation president Chris Gates, in a statement. “Previously, it was next to impossible to know what and how much data the government has, and this is an unprecedented window into its internal workings. Transparency is a bedrock principle for democracy, and the federal government’s response to Sunlight’s Freedom of Information request shows a strong commitment to open data. We expect to see each of these agencies continue to proactively release their data inventories.”

Understanding what data an organization holds is a critical first step in deciding how it should be stored, analyzed or published, shifting towards thinking about data as an asset. That’s why President Barack Obama’s executive order requiring federal agencies to catalog the data they have was a big deal. When that organization is a democratic government and the data in question was created using taxpayer funds, releasing the inventory of the data sets that it holds is a basic expression of open and accountable government.

2014 Open Data Index shows global growth of open data, but low overall openness

Today, Open Knowledge released its 2014 Open Data Index, refreshing its annual measure of the accessibility and availability of government data released online. Compared year over year, these indices show not only the relative openness of data between countries but also the slow growth in the number of open data sets. Overall, the nonprofit found that the percentage of open datasets across all 97 surveyed countries (up from 63 countries in 2013) remained low, at only 11%.

“Opening up government data drives democracy, accountability and innovation,” said Rufus Pollock, the founder and president of Open Knowledge, in a statement. “It enables citizens to know and exercise their rights, and it brings benefits across society: from transport, to education and health. There has been a welcome increase in support for open data from governments in the last few years, but this year’s Index shows that real progress on the ground is too often lagging behind the rhetoric.”

The map below can be explored in interactive form at the Open Knowledge website.

[Image: Interactive map of open government data around the world, Global Open Data Index, Open Knowledge]

Open Knowledge also published a refreshed ranking of countries. The United Kingdom remains atop the list, followed by Denmark and France, which moved up from number 12 in 2013. India moved into the top 10, from #27, after the relaunch of its open data platform.

[Image: Country rankings overview, Global Open Data Index, Open Knowledge]

Despite the rhetoric emanating from Washington, the United States is ranked at number 8, primarily due to deficiencies in open data on government spending and the lack of an open register of companies. Implementation of the DATA Act may help, as would the adoption of an open corporate identifier by the U.S. Treasury.

Below, in an interview from 2012, Pollock talks more about the relationship between open data and open government.

More details and discussion are available at the Open Knowledge blog.

Thoughts on the future of the US CIO, from capabilities to goals

[Image: Steven VanRoekel]

This weekend, ZDNet columnist Mike Krigsman asked me what I thought of the tenure of United States chief information officer Steven VanRoekel and, more broadly, what I thought of the role and meaning of the position in general. Here’s VanRoekel’s statement to the press via Federal News Radio:

“When taking the job of U.S. chief information officer, my goal was to help move federal IT forward into the 21st Century and to bring technology and innovation to bear to improve IT effectiveness and efficiency. I am proud of the work and the legacy we will leave behind, from launching PortfolioStat to drive a new approach to IT management, the government’s landmark open data policy to drive economic value, the work we did to shape the mobile ecosystem and cloud computing, and the culmination of our work in the launch of the new Digital Service, we have made incredible strides that will benefit Americans today and into the future,” VanRoekel said in a statement. “So it is with that same spirit of bringing innovation and technology to bear to solve our most difficult problems, that I am excited to join USAID’s leadership to help stop the Ebola outbreak. Technology is not the solution to this extremely difficult task but it will be a part of the solution and I look forward to partnering with our federal agencies, non-profit organizations and private sector tech communities to help accelerate this effort.”

Here’s the part of what I told Krigsman that ended up being published, with added hyperlinks for context:

As US CIO, Steven VanRoekel was a champion of many initiatives that improved how technology supports the mission of the United States government. He launched an ambitious digital government strategy that moved further towards making open data the default in government; oversaw the launch of the U.S. Digital Service, 18F, and the successful Presidential Innovation Fellows program; and improved management of some $80 billion in annual federal technology spending through PortfolioStat.

As was true for his predecessor, he was unable to create fundamental changes in the system he inherited. Individual agencies still have accountability for how money is spent and how projects are managed. The nation continues to see too many government IT projects that are over-budget, don’t work well, and use contractors with a core competency in getting contracts rather than building what is needed.

The U.S. has been unable or unwilling to reorganize and fundamentally reform how the federal government supports its missions using technology, including its relationship to incumbent vendors who fall short of efficient delivery using cutting-edge tech. The 113th Congress has had opportunities to craft legislative vehicles to improve procurement and the power of agency CIOs but has yet to pass FITARA or RFP-IT. In addition, too many projects still look like traditional enterprise software rather than consumer-facing tools, so we have a long way to go to achieve the objectives of the digital playbook VanRoekel introduced.

There are great projects, public servants and pockets of innovation through the federal government, but culture, hiring, procurement, and human resources remain serious barriers that continue to result in IT failures. The next U.S. CIO must be a leader in all respects, leading by example, inspiring, and having political skill. It’s a difficult job and one for which it is hard to attract world-class talent.

We need a fundamental shift in the system rather than significant tweaks, in areas such as open source and using the new Digital Service as a tool to drive change. The next US CIO must have experience managing multi-billion dollar budgets and be willing to pull the plug on wasteful or mismanaged projects that serve the needs of three years ago, not the future.

In a win for open government advocacy, DC removes flaws in its municipal open data policy

Update: This post was updated on October 29th; see below.

[Image: DC government logo]

It’s a good day for open government in the District of Columbia. Today, DC’s Office of the Chief Technology Officer (OCTO) updated the Terms and Conditions for DC.gov and the city’s new open data platform, addressing some of the concerns that the Sunlight Foundation and Code for DC expressed about the new open data policy introduced in July. The updated terms and conditions rolled out onto the city’s digital civic architecture this afternoon.

“Today’s changes are really focused on aligning DC.Gov’s Terms and Conditions of Use with the new open data and transparency policy released this summer,” explained Mike Rupert, the communications director for OCTO, in an interview. The site’s terms and conditions hadn’t been updated in many years, according to Rupert, and the new ones will apply to DC.gov, the open data platform and other city websites.

“It is encouraging that DC is taking steps toward considering feedback and improving its Terms and Conditions, but there is still room for improvement in the broader scope of DC’s policies,” said Alisha Green, a policy associate with the Sunlight Foundation’s local policy team. “We hope those implementing DC’s new open data policy will actively seek stakeholder input to improve upon what the policy requires. The strength of the policy will be in its implementation, and we hope DC will take every opportunity to make that process as open, collaborative and impactful as possible.”

OCTO both heard and welcomed the feedback from open government advocates and agreed that the policy implications of the terms and conditions were problematic. “Certain elements of the previous Terms and Conditions of Use (Indemnity, Limitation of Liability) could have chilled the understanding of the public’s right to access and have been eliminated,” said Rupert. The sections that prompted civic hacker Josh Tauberer to wonder whether he needed a lawyer to hack in DC are simply gone, specifically the Indemnity and Limitation of Liability sections. Other sections, however, remain.

The revised policy I obtained before the updated terms and conditions went online differs in a couple of ways from the version that just went online. First, the Registration section remains, as does the Conduct section, although DC eliminated the latter’s 11 specific examples. That said, it’s better, and that’s a win. While District officials remain cautious about how and where reuse might occur, they’re going to at least let the data flow without a deeply flawed policy prescription. “While we want to be mindful of and address the potential for harm to or misuse of District government information and data, the Terms and Conditions of Use should promote the new open data and transparency philosophy in a more positive manner,” said Rupert.

Sharp-eyed readers of the new policy, however, will note that DC’s open data and online information has now been released to the public under a Creative Commons license, specifically Attribution 3.0 United States. That means anyone who uses DC’s open data is welcome to “Share — copy and redistribute the material in any medium or format” and “Adapt — remix, transform, and build upon the material” for any purpose, even commercially, as long as they provide attribution: “You must give appropriate credit, provide a link to the license, and indicate if changes were made. You may do so in any reasonable manner, but not in any way that suggests the licensor endorses you or your use.”

When asked about the license choice, Rupert said that “the new copyright language from Creative Commons – which as you know is becoming the international standard – better states the overriding principle of the public’s right to web content and data.”

That did not sit entirely well with open government advocates who hold that making open data license-free is a best practice. Asked for comment, Tauberer emailed the following statement in response to the draft of the revision, welcoming the District’s responsiveness but questioning the premise of the District of Columbia having any “terms and conditions” for the public’s use of open government data at all.

The new terms drop the most egregious problems, but these terms still don’t count as “open.” Should I expect a lawsuit if I don’t tip my hat and credit the mayor every time I use the data we taxpayers paid to create? Until the attribution requirement is dropped, I will recommend that District residents get District data through Freedom of Information Act requests. It might take longer, but it will be on the people’s terms, not the mayor’s. It’s not that the District shouldn’t get credit, but the District shouldn’t demand it and hold civil and possibly criminal penalties over our heads to get it. For instance, yesterday Data.gov turned their attribution requirement into a suggestion. That’s the right way to encourage innovation. All that said, I appreciate their responsiveness to our feedback. Tim from DC GIS spent time at Code for DC to talk about it a few weeks ago, and I appreciated that. It is a step in the right direction, albeit one deaf to our repeated explanation that “open” does not mean “terms of use.”

The good news is that DC’s OCTO is listening and has committed to being responsive to future concerns about how it handles DC’s online presences and policies. “Several of your questions allude to the overall open data policy and we will definitely be reaching out to you and all other interested stakeholders as we begin implementing various elements of that policy,” said Rupert.

Update: On October 29th, DC updated its Terms and Conditions again, further improving them. Tauberer commented on the changes to the open data policy on his blog. In his view, the update represents a step forward and a step back:

In a new update to the terms posted today, which followed additional conversations with OCTO, there were two more great improvements. These terms were finally dropped:

  • agreeing to follow all “rules”, a very ambiguous term
  • the requirement to attribute the data to the District in all uses of the data (it’s now merely a suggestion)

The removal of these two requirements, in combination with the two removed in September, makes this a very important step forward.

One of my original concerns remains, however, and that is that the District has not granted anyone a copyright license to use District datasets. Data per se isn’t protected by copyright law, but the way a dataset is presented may be. The District has claimed copyright over its things before, and it remains risky to use District datasets without a copyright license. Both the September update and today’s update attempted to address this concern but each created more confusion than there was before.

Although today’s update mentions the CC0 public domain dedication, which would be the correct way to make the District data available, it also explicitly says that the District retains copyright:

  • The terms say, at the top, that they “apply only to . . . non-copyrightable information.” The whole point is that we need a license to use the aspects of the datasets that are copyrighted by the District.
  • Later on, the terms read: “Any copyrighted or trademarked content included on these Sites retains that copyright or trademark protection.” Again, this says that the District retains copyright.
  • And: “You must secure permission for reuse of copyrighted … content,” which, as written (but probably not intended), seems to say that to the extent the District datasets are copyrighted, data users must seek permission to use it first. (Among other problems, like side-stepping “fair use” in copyright law.)

With respect to the copyright question, the new terms document is a step backward because it may confuse data users into thinking the datasets have been dedicated to the public domain when in fact they haven’t been.

This post has been updated with comments from Tauberer and the Sunlight Foundation.