I had a blast interviewing Matt Mullenweg, the co-creator of WordPress and CEO of Automattic, last night at the inaugural WordPress and government meetup in DC. UPDATE: Video of our interview and the Q&A that followed is embedded below:
WordPress code powers some 60 million websites, including 22% of the top 10 million sites on the planet and .gov platforms like Broadbandmap.gov. Mullenweg was, by turns, thoughtful, geeky and honest about open source and giving hundreds of millions of people free tools to express themselves, and quietly principled with respect to corporate values for an organization spread across 35 countries, government censorship and the ethics of transparency.
After Mullenweg finished taking questions from the meetup, Data.gov architect Philip Ashlock gave a presentation on how the staff working on the federal government’s open data platform are using open source software to design, build, publish and collaborate, from WordPress to CKAN to Github issue tracking.
According to the third congressionally mandated report released by the Obama administration today (PDF/Text), the number of prizes and challenges conducted under the America COMPETES Act has increased by 50% since 2012, 85% since 2011, and nearly six-fold overall since 2010. 25 different federal agencies offered prizes under COMPETES in fiscal year 2013, with 87 prize competitions in total. The size of prize purses has grown as well, with 11 challenges over $100,000 in 2013. Nearly half of the prizes conducted in FY 2013 were focused on software, including applications, data visualization tools, and predictive algorithms. Challenge.gov, the award-winning online platform for crowdsourcing national challenges, now has tens of thousands of users who have participated in more than 300 public-sector prize competitions. Beyond the growth in prize numbers and amounts, the Obama administration highlighted four trends in public-sector prize competitions:
New models for public engagement and community building during competitions
Growth in software and information technology challenges, with nearly 50% of the total prizes in this category
More emphasis on sustainability and “creating a post-competition path to success”
Increased focus on identifying novel approaches to solving problems
The growth of open innovation in and by the public sector was directly enabled by Congress and the White House, working together for the common good. Congress reauthorized COMPETES in 2010 with an amendment to Section 105 of the act that added a Section 24 on “Prize Competitions,” providing all agencies with the authority to conduct prizes and challenges that only NASA and DARPA had previously enjoyed. The White House Office of Science and Technology Policy (OSTP) has since guided its implementation, providing guidance on the use of challenges and prizes to promote open government.
“This progress is due to important steps that the Obama Administration has taken to make prizes a standard tool in every agency’s toolbox,” wrote Cristin Dorgelo, assistant director for grand challenges in OSTP, in a WhiteHouse.gov blog post on engaging citizen solvers with prizes:
In his September 2009 Strategy for American Innovation, President Obama called on all Federal agencies to increase their use of prizes to address some of our Nation’s most pressing challenges. Those efforts have expanded since the signing of the America COMPETES Reauthorization Act of 2010, which provided all agencies with expanded authority to pursue ambitious prizes with robust incentives.
To support these ongoing efforts, OSTP and the General Services Administration have trained over 1,200 agency staff through workshops, online resources, and an active community of practice. And NASA’s Center of Excellence for Collaborative Innovation (COECI) provides a full suite of prize implementation services, allowing agencies to experiment with these new methods before standing up their own capabilities.
Sun Microsystems co-founder Bill Joy famously once said that “No matter who you are, most of the smartest people work for someone else.” This rings true, in and outside of government. The idea of governments using prizes to inspire technological innovation, however, did not spring from Web services and social media, nor from the fertile mind of a Silicon Valley entrepreneur. As the introduction to the third White House prize report notes:
“One of the most famous scientific achievements in nautical history was spurred by a grand challenge issued in the 18th Century. The issue of safe, long distance sea travel in the Age of Sail was of such great importance that the British government offered a cash award of £20,000 to anyone who could invent a way of precisely determining a ship’s longitude. The Longitude Prize, enacted by the British Parliament in 1714, would be worth some £30 million today, but even by that measure the value of the marine chronometer invented by British clockmaker John Harrison might be a deal.”
Centuries later, the Internet, World Wide Web, mobile devices and social media offer the best platforms in history for this kind of approach to solving grand challenges and catalyzing civic innovation, helping public officials and businesses find new ways to solve old problems. When a new idea, technology or methodology challenges and improves upon existing processes and systems, it can improve the lives of citizens or the function of the society they live within.
“Open innovation or crowdsourcing or whatever you want to call it is real, and is (slowly) making inroads into mainstream (i.e. non high-tech) corporate America,” said MIT principal research scientist Andrew McAfee, in an interview in 2012. “P&G is real. Innocentive is real. Kickstarter is real. Idea solicitations like the ones from Starbucks are real, and lead-user innovation is really real.”
Prizes and competitions all rely upon the same simple idea behind efforts like the X-Prize: tapping into the distributed intelligence of humans using a structured methodology. This might include distributing work, in terms of completing a given task or project, or soliciting information about how to design a process, product or policy.
Over the past decade, experiments with this kind of civic innovation around the world have been driven by tight budgets and increased demands for services, and enabled by the increased availability of inexpensive, lightweight tools for collaborating with connected populations. The report claimed that crowdsourcing can save federal agencies significant taxpayer dollars, citing an example of a challenge where the outcome cost a sixth of the estimated total of a traditional approach.
One example of a cost-effective prize program is the Medicaid Provider Screening Challenge that was offered by the Centers for Medicare & Medicaid Services (CMS) as part of a pilot designed in partnership with states and other stakeholders. This prize program was a series of software development challenges designed to improve capabilities for streamlining operations and screening Medicaid providers to reduce fraud and abuse. With a total prize purse of $500,000, the challenge series is leading to the development of an open source multi-state, multi-program provider screening shared-service software program capable of risk scoring, credential validation, identity authentication, and sanction checks, while lowering the burden on providers and reducing administrative and infrastructure expenses for states and Federal programs. CMS partnered with the NASA Center of Excellence for Collaborative Innovation (COECI), NASA’s contractor Harvard Business School, Harvard’s subcontractor TopCoder, and the State of Minnesota. The State of Minnesota is working on full deployment of the software, and CMS is initiating a campaign to encourage other states to leverage the software. COECI estimates that the cost of designing and building the portal through crowdsourcing was one-sixth of what the effort would have cost using traditional software development methods. Through the success of this and subsequent challenges, CMS is attempting to establish a new paradigm for crowdsourcing state and Federal information technology (IT) systems in a low-cost, agile manner by opening challenges to new players, small companies, and talented individual developers to build solutions which can “plug and play” with existing legacy systems or can operate in a shared, cloud-based environment.
As is always the nature of experiments, many early attempts failed. A few have worked and subsequently grown into sustainable applications, services, data sources, startups, processes and knowledge that can be massively scaled. Years ago, Micah Sifry predicted that the “gains from enabling a culture of open challenges, outsider innovation and public participation” in government were going to be huge. He was right.
Linked below are the administration’s official letters to the House and Senate, reporting the results of last year’s prizes.
I generally agree with the assessment of the Washington Post with respect to how well Maryland Governor Martin O’Malley’s “Ask Me Anything” on Reddit went for him, though I give him far more credit for venturing onto the unruly social news platform than the reporter did. The Post’s report that he only answered 5 questions was just plain incorrect.
O’Malley answered 19 questions this morning, not 5, a fact that could be easily and quickly ascertained by clicking on GovMartinOMalley, the username he used for the AMA, including a (short) answer to a question on mental health that the Post said went unanswered. (An editor made multiple corrections and updates to the Post’s story after I pointed that out.)
He subsequently logged back on in the afternoon to answer more questions, rebutting the Post’s assessment and that of a user: “I don’t know, I’m having fun! This is my first AMA. I had to step away to sign a bunch of bills, and I’m glad to be back,” he commented.
He answered at least one tough question (from a questioner who appears to have joined Reddit today) after doing so, although the answer hasn’t been highly rated:
@bmoreprogressive91: Thanks for doing an AMA. Just one question: How does the Maryland healthcare exchange, which cost taxpayers $90 million to implement before your administration found that it would be cheaper (at an additional $40-50 million) to just replace it than to fix it, show that your Administration has been effectively using taxpayer dollars to better the lives of individual citizens?
O’Malley: No one was more frustrated than I was about the fact that our health exchange website didn’t work properly when we launched. But our health exchange is more than a web site, and we worked hard to overcome the technical problems. We have enrolled about 329,000 people thus far, exceeding the goal we set of 260,000. I often say that we haven’t always succeeded at first, but we have never given up. We learn from both success and failure.
By the end of the day, Maryland’s governor answered 36 questions in total. (You can read a cleanly formatted version of O’Malley’s AMA at Interview.ly). Reddit users rated the quality of some answers much higher than others, with the most popular answer, “Yes,” coming in response to whether he would support a constitutional amendment to reverse the Citizens United decision by the Supreme Court.
To be fair — and reasonable observers should be — Reddit’s utility for extracting answers from a politician isn’t so great, as Alexis Madrigal pointed out after President Barack Obama did an AMA, back in 2012. That said, I’m generally supportive of elected leaders engaging directly with constituents online using the tools and platforms that citizens are active upon themselves.
Popular questions that go unanswered can be instructive and offer some insight into what issues a given politician would rather not talk about in public. As such, they’re fine fodder for media to report upon. The record online, however, also means that when a reporter botches the job or misrepresents an interaction, question or answer, we can all see that, too.
Postscript: Andrew MacRae was critical of the governor and his team’s approach to Reddit and offered a tip for other politicians that venture onto the social news platform for an AMA. More on that in the embedded tweets, below:
As White House special advisor John Podesta noted in January, the PCAST has been conducting a study “to explore in-depth the technological dimensions of the intersection of big data and privacy.” Earlier this week, the Associated Press interviewed Podesta about the results of the review, reporting that the White House had learned of the potential for discrimination through the use of data aggregation and analysis. These are precisely the privacy concerns that stem from data collection that I wrote about earlier this spring. Here’s the PCAST’s list of “things happening today or very soon” that provide examples of technologies that can have benefits but pose privacy risks:
Pioneered more than a decade ago, devices mounted on utility poles are able to sense the radio stations being listened to by passing drivers, with the results sold to advertisers.
In 2011, automatic license-plate readers were in use by three quarters of local police departments surveyed. Within 5 years, 25% of departments expect to have them installed on all patrol cars, alerting police when a vehicle associated with an outstanding warrant is in view. Meanwhile, civilian uses of license-plate readers are emerging, leveraging cloud platforms and promising multiple ways of using the information collected.
Experts at the Massachusetts Institute of Technology and the Cambridge Police Department have used a machine-learning algorithm to identify which burglaries likely were committed by the same offender, thus aiding police investigators.
Differential pricing (offering different prices to different customers for essentially the same goods) has become familiar in domains such as airline tickets and college costs. Big data may increase the power and prevalence of this practice and may also decrease even further its transparency.
reSpace offers machine-learning algorithms to the gaming industry that may detect early signs of gambling addiction or other aberrant behavior among online players.
Retailers like CVS and AutoZone analyze their customers’ shopping patterns to improve the layout of their stores and stock the products their customers want in a particular location. By tracking cell phones, RetailNext offers bricks-and-mortar retailers the chance to recognize returning customers, just as cookies allow them to be recognized by on-line merchants. Similar WiFi tracking technology could detect how many people are in a closed room (and in some cases their identities).
The retailer Target inferred that a teenage customer was pregnant and, by mailing her coupons intended to be useful, unintentionally disclosed this fact to her father.
The author of an anonymous book, magazine article, or web posting is frequently “outed” by informal crowd sourcing, fueled by the natural curiosity of many unrelated individuals.
Social media and public sources of records make it easy for anyone to infer the network of friends and associates of most people who are active on the web, and many who are not.
Marist College in Poughkeepsie, New York, uses predictive modeling to identify college students who are at risk of dropping out, allowing it to target additional support to those in need.
The Durkheim Project, funded by the U.S. Department of Defense, analyzes social-media behavior to detect early signs of suicidal thoughts among veterans.
LendUp, a California-based startup, sought to use nontraditional data sources such as social media to provide credit to underserved individuals. Because of the challenges in ensuring accuracy and fairness, however, they have been unable to proceed.
The PCAST meeting was open to the public through a teleconference line. I called in and took rough notes on the discussion of the forthcoming report as it progressed. My notes on the comments of professors Susan Graham and Bill Press offer sufficient insight into the forthcoming report, however, that I thought publishing them today was warranted in the public interest, given the ongoing national debate regarding data collection, analysis, privacy and surveillance. The following should not be considered verbatim or an official transcript. The emphases below are mine, as are the words in [brackets]. For an official record, look for the PCAST to make a recording and transcript available online in the future, at its archive of past meetings.
Susan Graham: Our charge was to look at the confluence of big data and privacy, to summarize current technology and the way technology is moving in the foreseeable future, including its influence on the way we think about privacy.
The first thing that’s very very obvious is that personal data in electronic form is pervasive. Traditional data that was in health and financial [paper] records is now electronic and online. Users provide info about themselves in exchange for various services. They use Web browsers and share their interests. They provide information via social media, Facebook, LinkedIn, Twitter. There is [also] data collected that is invisible, from public cameras, microphones, and sensors.
What is unusual about this environment and big data is the ability to do analysis on huge corpora of that data. We can learn things from the data that allow us to provide a lot of societal benefits. There is an enormous amount of patient data, data about disease, and data about genetics. By putting it together, we can learn about treatment. With enough data, we can look at rare diseases, and learn what has been effective. We could not have done this otherwise.
We can analyze more online information about education and learning, not only MOOCs but lots of learning environments. [Analysis] can tell teachers how to present material effectively, to do comparisons about whether one presentation of information works better than another, or analyze how well assessments work with learning styles.
Certain visual information is comprehensible, certain verbal information is hard to understand. Understanding different learning styles [can enable] develop customized teaching.
The reason this all works is the profound nature of analysis. This is the idea of data fusion, where you take multiple sources of information and combine them, which provides a much richer picture of some phenomenon. If you look at patterns of human movements on public transport, or pollution measures, or weather, maybe we can predict dynamics caused by human context.
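The data fusion idea Graham describes can be sketched as a toy example: two independently collected datasets, combined on a shared key, yield a record richer than either source alone. The datasets and numbers below are invented for illustration only.

```python
# Toy data fusion sketch: hypothetical hourly transit ridership and
# air-quality readings, each useless alone for studying how human
# movement and pollution interact, fused into one record per hour.

transit = {8: 5200, 9: 4100, 17: 6100}   # hour -> riders (made up)
air_quality = {8: 61, 9: 55, 17: 72}     # hour -> AQI (made up)

def fuse(*sources):
    """Merge several hour-keyed datasets into one combined record per hour."""
    fused = {}
    for label, data in sources:
        for hour, value in data.items():
            fused.setdefault(hour, {})[label] = value
    return fused

combined = fuse(("riders", transit), ("aqi", air_quality))
print(combined[17])  # {'riders': 6100, 'aqi': 72}
```

The fused record supports questions neither dataset could answer by itself, which is also why fusion raises the privacy concerns the report discusses.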
We can use statistics-based pattern recognition on large amounts of data. One of the things that we understand about this statistics-based approach is that it might not be 100% accurate if mapped down to the individual providing data in these patterns. We have to be very careful not to make mistakes about individuals because we make [an inference] about a population.
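Graham's caveat about mapping population-level inferences down to individuals can be illustrated with a toy example using made-up data: predicting each person's trait from their group's majority is accurate in aggregate but guaranteed wrong for the minority.

```python
# Toy sketch of the population-vs-individual problem: a group where 90%
# share a trait. A majority-based prediction is right on average but
# mislabels every member of the minority. All data here is invented.

population = [("A", True)] * 90 + [("A", False)] * 10  # group A: 90% True

def predict_from_group(group, data):
    """Predict the majority trait observed for a group."""
    traits = [t for g, t in data if g == group]
    return max(set(traits), key=traits.count)

prediction = predict_from_group("A", population)
errors = sum(1 for g, t in population
             if predict_from_group(g, population) != t)
print(prediction, errors)  # True 10: accurate for the group, wrong for 10 people
```

The 90% aggregate accuracy is exactly the kind of result that looks good statistically while still making mistakes about specific individuals.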
How do we think about privacy? We looked at it from the point of view of harms. There are a variety of ways in which results of big data can create harm, including inappropriate disclosures [of personal information], potential discrimination against groups, classes, or individuals, and embarrassment to individuals or groups.
We turned to what technology has to offer in helping to reduce harms. We looked at a number of technologies in use now and a bunch coming down the pike. Some of those in use have become less effective because of the pervasiveness [of data] and the depth of analytics.
We traditionally have controlled [data] collection. We have seen some data collection from cameras and sensors that people don’t know about. If you don’t know, it’s hard to control.
Tech creates many concerns. We have looked at methods coming down the pike. Some are more robust and responsive. We have a number of draft recommendations that we are still working out.
Part of privacy is protecting the data using security methods. That needs to continue. It needs to be used routinely. Security is not the same as privacy, though security helps to protect privacy. There are a number of approaches that are now applied by hand that, with sufficient research, could be automated and used more reliably, so they scale.
There needs to be more research, and more education, about privacy. Professionals need to understand how to treat privacy concerns anytime they deal with personal data. We need to create a large group of professionals who understand privacy, and privacy concerns, in tech.
Technology alone cannot reduce privacy risks. There has to be a policy as well. It was not our role to say what that policy should be. We need to lead by example by using good privacy protecting practices in what the government does and increasingly what the private sector does.
Bill Press: We tried throughout to think of scenarios and examples. There’s a whole chapter [in the report] devoted explicitly to that.
They range from things being done today, present technology, even though they are not all known to people, to our extrapolations to the outer limits, of what might well happen in next ten years. We tried to balance examples by showing both benefits, they’re great, and they raise challenges, they raise the possibility of new privacy issues.
In another aspect, in Chapter 3, we tried to survey technologies from both sides, with both tech going to bring benefits, those that will protect [people], and also those that will raise concerns.
In our technology survey, we were very much helped by the team at the National Science Foundation. They provided a very clear, detailed outline of where they thought that technology was going.
This was part of our outreach to a large number of experts and members of the public. That doesn’t mean that they agree with our conclusions.
Eric Lander: Can you take everybody through analysis of encryption? Are people using much more? What are the limits?
Graham: The idea behind classical encryption is that when data is stored, when it’s sitting around in a database, let’s say, encryption entangles the representation of the data so that it can’t be read without using a mathematical algorithm and a key to convert a seemingly meaningless set of bits into something reasonable.
The same technology, where you convert data into meaningless bits and back, is used when you send data from one place to another. So, if someone is scanning traffic on the internet, they can’t read it. Over the years, we’ve developed pretty robust ways of doing encryption.
The weak link is that to use data, you have to read it, so it becomes unencrypted. Security technologists worry about it being read during that short window.
Encryption technology is vulnerable. The key that unlocks the data is itself vulnerable to theft, or to the wrong user being tricked into decrypting.
Both problems are active topics of research, including how to use data without being able to read it. There is research on increasing the robustness of encryption, so that if a key is disclosed you haven’t lost everything and can still protect some of the data, or future encryption of new data. This reduces risk a great deal and is important to use, but encryption alone doesn’t protect.
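Graham's two points, that encrypted data is meaningless bits without the key, and that data must be decrypted (and is therefore exposed) at the moment of use, can be sketched with a toy XOR cipher. This is not real cryptography, just an illustration; the record contents are invented.

```python
# Toy XOR "encryption" (NOT real cryptography) illustrating classical
# encryption's shape: a key entangles the data's representation, and
# applying the same key again reverses it.
import secrets

def xor_bytes(data: bytes, key: bytes) -> bytes:
    """XOR each byte of data with the (repeating) key."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))

record = b"patient 1234: test result"   # made-up sensitive record
key = secrets.token_bytes(32)

ciphertext = xor_bytes(record, key)     # stored form: meaningless bits
plaintext = xor_bytes(ciphertext, key)  # the weak link: to use the data,
assert plaintext == record              # it must be readable in memory
```

Real systems use vetted algorithms (e.g. AES) rather than XOR, but the weak link is the same: whoever holds the key, or catches the data while decrypted for use, can read it.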
Unknown Speaker: People read of breaches derived from security. I see a different set of issues of privacy from big data vs those in security. Can you distinguish them?
Bill Press: Privacy and security are different issues. Security is necessary to have good privacy in the technological sense: if communications are insecure, they clearly can’t be private. But this goes beyond that, to parties that are authorized, in a security sense, to see the information. Privacy is much closer to values; security is much closer to protocols.
The interesting thing is that this is less about purely technical elements; everyone can agree on the right protocol, eventually. These are things that go beyond that and have to do with values.
This afternoon, the United States House of Representatives passed the Digital Accountability and Transparency Act (DATA) of 2013, voting to send S.994, the bill that enjoyed unanimous support in the U.S. Senate earlier this month, on to the president’s desk.
The DATA Act is the most significant open government legislation enacted by Congress in generations, going back to the Freedom of Information Act in 1966. An administration official at the White House Office of Management and Budget confirmed that President Barack Obama will sign the bill into law.
The DATA Act establishes financial open data standards for agencies in the federal government, requires compliance with those standards, and that the data will then be published online. The bipartisan bill was sponsored in the Senate by Senator Rob Portman (R-OH) and Senator Mark Warner (D-VA), and in the House by Representative Darrell Issa (R-CA) and Representative Elijah Cummings (D-MD).
Representative Issa, who first introduced the transparency legislation in 2011, spoke about the bill on the House floor this afternoon and tweeted out a long list of beneficial outcomes his office expects to result from its passage.
#DATAact will give lawmakers & public watchdogs powerful tools to identify and root out waste, fraud & abuse in gov. pic.twitter.com/Y8YvP9ofJU
The Senators who drafted and co-sponsored the version of the bill that the House passed today quickly hailed its passage.
“In the digital age, we should be able to search online to see how every grant, contract and disbursement is spent in a more connected and transparent way through the federal government,” said Senator Warner, in a statement. “Independent watchdogs and transparency advocates have endorsed the DATA Act’s move toward greater transparency and open data. Our taxpayers deserve to see clear, accessible information about government spending, and this accountability will highlight and help us eliminate waste and fraud.”
“During a time of record $17 trillion debt, our bipartisan bill will help identify and eliminate waste by better tracking federal spending,” said Senator Portman, in a statement. “I’m pleased that our bill to empower taxpayers to see how their money is spent and improve federal financial transparency has unanimously passed both chambers of Congress and is now headed to the President’s desk for signature.”
Pleased the House just passed my bipartisan #DataAct. Great news for govt transparency & taxpayers. Now to President for signature. #opengov
“The DATA Act is a transformational piece of legislation that has the potential to permanently transform how the Federal government operates,” said House Majority Leader Eric Cantor, in a statement. “For the first time ever, the American people will have open, standardized access to how the federal government spends their money. Washington has an abundance of information that is often bogged down by federal bureaucracy and is inaccessible to our nation’s innovators, developers and citizens. The standardization and publication of federal spending information in an open format will empower innovative citizens to tackle many of our nation’s challenges on their own. Government of the people, by the people, and for the people should be open to the people.”
The DATA Act earned support from a broad coalition of open government advocates and industry groups. Its passage in Congress was hailed today by open government advocates and trade groups alike.
“The central idea behind the Digital Accountability and Transparency Act is simple: disclose to the public what the federal government spends,” said Daniel Schuman, policy counsel for Citizens for Responsibility and Ethics in Washington.
“The means necessary to accomplish this purpose—increased agency reporting, the use of modern technology, implementation of government-wide standards, regular quality assurance on the data—will require government to systematically address how it stovepipes federal spending information. This is no small task, and one that is long overdue. The effort to reform transparency around federal spending arose in large part because members of both political parties concluded that their ability to govern effectively depends on making sure federal spending data is comprehensive, accessible, reliable, and timely. Currently, it is not. The leaders of the reform efforts in the Senate are Senators Mark Warner (D-VA), Rob Portman (R-OH), Tom Carper (D-DE), and Tom Coburn (R-OK), and the leaders in the House are Representatives Darrell Issa (R-CA) and Elijah Cummings (D-MD), although they are joined by many others. We welcome and applaud the House of Representatives’ passage of the DATA Act. It is a remarkable bill that, if properly implemented, will empower elected officials and everyday citizens alike to follow how the federal government spends money.”
“Sunlight has been advocating for the DATA Act for some time, and are thrilled to see it emerge from Congress,” said Matt Rumsey, a policy analyst at the Sunlight Foundation. “As I wrote while describing the history of the bill after it passed through the Senate, ‘Congress has taken a big step by passing the DATA Act. The challenge now will be ensuring that it is implemented effectively.’ We hope that the President swiftly signs the bill and we look forward to working with his administration to shed more light on federal spending.”
“With this legislation, big data is finally coming of age in the federal government,” said Daniel Castro, Director of the Center for Data Innovation, in a statement. “The DATA Act promises to usher in a new era of data-driven transparency, accountability, and innovation in federal financial information. This is a big win for taxpayers, innovators, and journalists.”
“After three years of debate and negotiation over the DATA Act, Congress has issued a clear and unified mandate for open, reliable federal spending data,” said Hudson Hollister, the Executive Director of the Data Transparency Coalition. Hollister helped to draft the first version of the DATA Act in 2011, when he was on Representative Issa’s staff. “Our Coalition now calls on President Obama to put his open data policies into action by signing the DATA Act and committing his Office of Management and Budget to pursue robust data standards throughout federal financial, budget, grant, and contract reporting.”
“The Administration shares Senator Warner’s commitment to government transparency and accountability, and appreciates his leadership in Congress on this issue,” said Steve Posner, spokesman for the White House Office of Management and Budget. “The Administration supports the objectives of the DATA Act and looks forward to working with Congress on implementing the new data standards and reporting requirements within the realities of the current constrained budget environment and agency financial systems.”
Update: Speaker of the House John Boehner (R-OH) signed the DATA Act on April 30, before sending it on to President Obama’s desk.
“From publishing legislative data in XML to live-streaming hearings and floor debates, our majority has introduced a number of innovations to make the legislative process more open and accessible,” he said, in a statement touting open government progress in the House. “With the DATA Act, which I signed today, we’re bringing this spirit of transparency to the rest of the federal government. For years, we’ve been able to track the status of our packages, but to this day there is no one website where you can see how all of your tax dollars are being spent. Once the president signs this bill, that will start to change. There is always more to be done when it comes to opening government and putting power back in the hands of the people, and the House will be there to lead the way.”
UPDATE: On May 9th, 2014, President Barack Obama signed the DATA Act into law.
Statement by Press Secretary Jay Carney:
On Friday, May 9, 2014, the President signed into law:
S. 994, the “Digital Accountability and Transparency Act of 2014” or the “DATA Act,” which amends the Federal Funding Accountability and Transparency Act of 2006 to make publicly available specific classes of Federal agency spending data, with more specificity and at a deeper level than is currently reported; require agencies to report this data on USASpending.gov; create Government-wide standards for financial data; apply to all agencies various accounting approaches developed by the Recovery Act’s Recovery Accountability and Transparency Board; and streamline agency reporting requirements.
Rep. Darrell Issa issued the following statement in response:
“The enactment of the DATA Act marks a transformation in government transparency by shedding light on runaway federal spending,” said Chairman Issa. “The reforms of this bipartisan legislation not only move the federal bureaucracy into the digital era, but they improve accountability to taxpayers and provide tools to allow lawmakers and citizen watchdogs to root out waste and abuse. Government-wide structured data requirements may sound like technical jargon, but the real impact of this legislation on our lives will be more open, more effective government.”
Back in February, I reported that Esri would enable governments to open their data to the public. Today, the geographic information systems (GIS) software giant pushed ArcGIS Open Data live, instantly enabling thousands of its local, state and federal government users to open up the data in their systems to the public in just a few minutes.
“Starting today any ArcGIS Online organization can enable open data, specify open data groups and create and publicize their open data through a simple, hosted and best practices web application,” wrote Andrew Turner, chief technology officer of Esri’s Research and Development Center in D.C., in a blog post about the public beta of ArcGIS Open Data. “Originally previewed at FedGIS ArcGIS Open Data is now public beta where we will be working with the community on feedback, ideas, improvements and integrations to ensure that it exemplifies the opportunity of true open sharing of data.”
Turner highlighted what this would mean for both sides of the open data equation: supply and demand.
Data providers can create open data groups within their organizations, designating data to be open for download and re-use, hosting the data on the ArcGIS site. They can also create public microsites for the public to explore. (Example below.) Turner also highlighted the code for Esri’s open-source GeoPortal Server on Github as a means to add metadata to data sets.
Data users, from media to developers to nonprofits to schools to businesses to other government entities, will be able to download data in common open formats, including KML, Spreadsheet (CSV), Shapefile, GeoJSON and GeoServices.
“As the US Open Data Institute recently noted, [imagine] the impact to opening government data if software had ‘Export as JSON’ by default,” wrote Turner.
“That’s what you now have. Users can also subscribe to the RSS feed of updates and comments about any dataset in order to keep up with new releases or relevant supporting information. As many of you are likely aware, the reality of these two perspectives are not far apart. It is often easiest for organizations to collaborate with one another by sharing data to the public. In government, making data openly available means departments within the organization can also easily find and access this data just as much as public users can.”
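The practical payoff of those common formats is that a downloaded dataset can be reworked with ordinary tools and no proprietary software. As a sketch, the snippet below flattens a small GeoJSON FeatureCollection, of the kind a portal’s “Download as GeoJSON” option might produce, into a CSV table using only Python’s standard library; the sample data and field names are illustrative, not from any real portal.

```python
import csv
import io
import json

# A tiny GeoJSON FeatureCollection like one an open data portal might
# serve for download. The features here are made up for illustration.
geojson = json.loads("""
{
  "type": "FeatureCollection",
  "features": [
    {"type": "Feature",
     "geometry": {"type": "Point", "coordinates": [-83.05, 42.33]},
     "properties": {"name": "Site A", "category": "education"}},
    {"type": "Feature",
     "geometry": {"type": "Point", "coordinates": [-83.10, 42.36]},
     "properties": {"name": "Site B", "category": "housing"}}
  ]
}
""")

def features_to_csv(collection):
    """Flatten point features into CSV rows: name, category, lon, lat."""
    out = io.StringIO()
    writer = csv.writer(out)
    writer.writerow(["name", "category", "lon", "lat"])
    for feature in collection["features"]:
        lon, lat = feature["geometry"]["coordinates"]
        props = feature["properties"]
        writer.writerow([props["name"], props["category"], lon, lat])
    return out.getvalue()

print(features_to_csv(geojson))
```

Because GeoJSON is just JSON, the same few lines work against any portal that exports it, which is exactly the “Export as JSON by default” point Turner cites.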
Turner highlighted what an open data site would look like in the wild:
“Data Driven Detroit is a great example of organizations sharing data. They were able to leverage their existing data to quickly publish open data such as census, education or housing. As someone who lived near Detroit, I can attest to the particular local love and passion the people have for their city and state – and how open data empowers citizens and businesses to be part of the solution to local issues.
In sum, this feature could, as I noted in February, mean a lot more data is suddenly available for re-use. When considered in concert with Esri’s involvement in the White House’s Climate Data initiative, 2014 looks set to be a historic year for the mapping giant.
It also could be a banner year for open data in general, if governments follow through on their promises to release more of it in reusable forms. By making it easy to upload data, hosting it for free and publishing it in the open formats developers commonly use in 2014, Esri is removing three major roadblocks governments face after a mandate to “open up” comes from a legislature, city council, or executive order from the governor or mayor’s office.
“The processes in use to publish open data are unreasonably complicated,” said Waldo Jacquith, director of the U.S. Open Data Institute, in an email.
“As technologist Dave Guarino recently wrote, basically inherent to the process of opening data is ETL: “extract-transform-load” operations. This means creating a lot of fragile, custom code, and the prospect of doing that for every dataset housed by every federal agency, 50 states, and 90,000 local governments is wildly impractical.
Esri is blazing the trail to the sustainable way to open data, which is to open it up where it’s already housed as closed data. When opening data is as simple as toggling an “open/closed” selector, there’s going to be a lot more of it. (To be fair, there are many types of data that contain personally identifiable information, sensitive information, etc. The mere flipping of a switch doesn’t address those problems.)
Esri is a gold mine of geodata, and the prospect of even a small percentage of that being released as open data is very exciting.”
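Guarino’s point about ETL is easy to make concrete. The sketch below is a minimal extract-transform-load pipeline over a toy CSV export; the dataset, field names and transformations are hypothetical, but each step shows why the code tends to be custom per dataset: this particular source happens to use US-style dates and dollar-sign fees, and the next one will break those assumptions.

```python
import csv
import io
import sqlite3

# A made-up raw CSV export, standing in for one agency's dataset.
RAW_CSV = """permit_id,issued,fee
A-101,03/12/2014,$250.00
A-102,03/15/2014,$1000.00
"""

def extract(text):
    """Extract: parse the raw CSV export into dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))

def transform(rows):
    """Transform: normalize US dates to ISO 8601 and dollar fees to cents.

    These rules are specific to this one source format, which is why
    ETL code ends up fragile and custom per dataset.
    """
    out = []
    for row in rows:
        month, day, year = row["issued"].split("/")
        out.append({
            "permit_id": row["permit_id"],
            "issued": f"{year}-{month}-{day}",
            "fee_cents": int(float(row["fee"].lstrip("$")) * 100),
        })
    return out

def load(rows, conn):
    """Load: write the cleaned rows into a queryable store."""
    conn.execute(
        "CREATE TABLE permits (permit_id TEXT, issued TEXT, fee_cents INTEGER)")
    conn.executemany(
        "INSERT INTO permits VALUES (:permit_id, :issued, :fee_cents)", rows)

conn = sqlite3.connect(":memory:")
load(transform(extract(RAW_CSV)), conn)
print(conn.execute("SELECT permit_id, issued, fee_cents FROM permits").fetchall())
```

Multiply those bespoke transform rules by every dataset in every federal agency, 50 states and 90,000 local governments, and Jacquith’s case for opening data in place, where it already lives, becomes clear.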
Into this mix comes a new report from Friedman Consulting, commissioned by the Ford and MacArthur Foundations. Notably, the report also addresses the deficit of technology talent in the nonprofit sector and other parts of civil society, where such expertise and capacity could make demonstrable improvements to operations and performance. The full 51-page report is well worth reading for those interested in the topic, but for those limited by time, here are the key findings:
1) The Current Pipeline Is Insufficient: the vast majority of interviewees indicated that there is a severe paucity of individuals with technical skills in computer science, data science, and the Internet or other information technology expertise in civil society and government. In particular, many of those interviewed noted that existing talent levels fail to meet current needs to develop, leverage, or understand technology.
2) Barriers to Recruitment and Retention Are Acute: many of those interviewed said that substantial barriers thwart the effective recruitment and retention of individuals with the requisite skills in government and civil society. Among the most common barriers mentioned were those of compensation, an inability to pursue groundbreaking work, and a culture that is averse to hiring and utilizing potentially disruptive innovators.
3) A Major Gap Between The Public Interest and For-Profit Sectors Persists: as a related matter, interviewees discussed superior for-profit recruitment and retention models. Specifically, the for-profit sector was perceived as providing both more attractive compensation (especially to young talent) and fostering a culture of innovation, openness, and creativity that was seen as more appealing to technologists and innovators.
4) A Need to Examine Models from Other Fields: interviewees noted significant space to develop new models to improve the robustness of the talent pipeline; in part, many existing models were regarded as unsustainable or incomplete. Interviewees did, however, highlight approaches from other fields that could provide relevant lessons to help guide investments in improving this pipeline.
5) Significant Opportunity for Connection and Training: despite consonance among those interviewed that the pipeline was incomplete, many individuals indicated the possibility for improved and more systematic efforts to expose young technologists to public interest issues and connect them to government and civil society careers through internships, fellowships, and other training and recruitment tools.
6) Culture Change Necessary: the culture of government and civil society – and its effects on recruitment and other bureaucratic processes – was seen as a vital challenge that would need to be addressed to improve the pipeline. This view manifested through comments that government and civil society organizations needed to become more open to utilizing technology and adopting a mindset of experimentation and disruption.
And here’s the conclusion:
Based on this research, the findings of the report are clear: technology talent is a key need in government and civil society, but the current state of the pipeline is inadequate to meet that need. The bad news is that existing institutions and approaches are insufficient to build and sustain this pipeline, particularly in the face of sharp for-profit competition. The good news is that stakeholders interviewed identified a range of organizations and practices that, at scale, have the potential to make an enormous difference. While the problem is daunting, the stakes are high. It will be critical for civil society and government to develop sustainable and effective pathways for the panoply of technologists and experts who have the skills to create truly 21st century institutions.
For those interested, the New America Foundation will be hosting a forum on the technology deficit in Washington, DC, on April 29th. The event will be livestreamed and archived.
The City of Boston has joined the growing list of cities around the world that have adopted open data. The executive order issued yesterday by Mayor Marty Walsh has been hailed by open government advocates around the country. The move to open up Boston’s data has been followed by action, with 411 data sets listed on data.cityofboston.gov as of this morning. The EO authorizes and requires Boston’s chief information officer to issue a City of Boston Open Data Policy and “include standards for the format and publishing of such data and guidance on accessibility, re-use and minimum documentation for such data.”
The element on re-use is critical: the success of such initiatives should be judged upon the network effects of open data releases and the resulting improvements to productivity, efficiency, city services, accountability and transparency, not the raw amount of data published online.
Notably, Boston City Councilor-at-Large Michelle Wu also filed a proposal yesterday morning to create an open data ordinance that would require city agencies and departments to make open data available, codifying the executive order into statute as San Francisco, New York City and Philadelphia have done.
“Government today should center on making data-driven decisions and inviting in the public to collaborate around new ideas and solutions,” said Wu, in a statement. “The goal of this ordinance is greater transparency, access, and innovation. We need a proactive, not a reactive, approach to information accessibility and open government.”
Notably, she posted the text of her proposed open data ordinance online on Monday, unlike the city government, and tweeted a link to it. (It took until today for the city of Boston to post the order; city officials have yet to share it on social media.)
“Boston is a world-class city full of energy and talent,” said Wu. “In addition to promoting open government, making information available to the fullest extent possible will help leverage Boston’s energy and talent for civic innovation. From public hackathons to breaking down silos between city departments, putting more data online can help us govern smarter for residents in every neighborhood.”
As long-time readers know, I lived in Boston for a decade. It’s good to see the city government move forward to making the people’s data available to them for use and reuse. I look forward to seeing what the dynamic tech, financial, health care, educational and research communities in the greater Boston area do with it.
EXECUTIVE ORDER OF MAYOR MARTIN J. WALSH
An Order Relative to Open Data and Protected Data Sharing
Whereas, it is the policy of the City of Boston to practice Open Government, favoring participation, transparency, collaboration and engagement with the people of the City and its stakeholders; and
Whereas, information technologies, including web-based and other Internet applications and services, are an essential means for Open Government, and good government generally; and
Whereas, the City of Boston should continue, expand and deepen the City’s innovative use of information technology toward the end of Open Government, including development and use of mobile computing and applications, provision of online data, services and transactions; and
Whereas, the City of Boston also has an obligation to protect some data based upon privacy, confidentiality and other requirements and must ensure that protected data not be released in violation of applicable constraints; and
Whereas, clarification and definition of open data, privacy, security requirements, interoperability and interaction flows is necessary for the City’s Open Government agenda;
NOW THEREFORE, pursuant to the authority vested in me as Chief Executive Officer of the City of Boston by St. 1948, c. 452 Section 11, as appearing in St. 1951, c. 376, Section 1, and every other power hereto enabling, I hereby order and direct as follows:
1. The City of Boston recognizes Open Government as a key means for enabling public participation, transparency, collaboration and effective government, including by ensuring the availability and use of Open Data, appropriate security and sharing of Protected Data, effective use of Identity and Access Management and engagement of stakeholders and experts toward the achievement of Open Government.
2. The City of Boston Chief Information Officer (“CIO”), in consultation with City departments, is authorized and directed to issue a City of Boston Open Data Policy.
a) The Open Data Policy shall include standards for the format and publishing of such data and guidance on accessibility, re-use and minimum documentation for such data;
b) The Open Data Policy shall include guidance for departments on the classification of their data sets as public or protected and a method to report such classification to the CIO. All departments shall publish their public record data sets on the City of Boston open data portal to the extent such data sets are determined to be appropriate for public disclosure, and/or if appropriate, may publish their public record data set through other methods, in accordance with API, format, accessibility and other guidance of the Open Data Policy.
3. The City of Boston CIO, in consultation with City departments, is authorized and directed to issue a City of Boston Protected Data Policy applicable to non-public data, such as health data, educational records and other protected data;
a) The policy shall provide guidance on the management of Protected Data, including guidance on security and other controls to safeguard Protected Data, including appropriate Identity and Access Management and good practice guidelines for compliance with legal or other rules requiring the sharing of Protected Data with authorized parties upon the grant of consent, by operation of law or when otherwise so required;
b) The policy shall provide a method to ensure approval by the Corporation Counsel of the City of Boston to confirm Protected Data is only disclosed in accordance with the Policy.
4. This Executive Order is not intended to diminish or alter the rights or obligations afforded under the Massachusetts Public Records Law, Chapter 66, Section 10 of the Massachusetts General Laws and the exemptions under Chapter 4, Section 7(26). Additionally, this Executive Order is intended to be interpreted consistent with Federal, Commonwealth, and local laws and regulations regarding the privacy, confidentiality, and security of data. Nothing herein shall authorize the disclosure of data that is confidential, private, exempt or otherwise legally protected unless such disclosure is authorized by law and approved by the Corporation Counsel of the City of Boston.
5. This Executive Order is not intended to, and does not, create any right or benefit, substantive or procedural, enforceable at law or in equity by any party against the City of Boston, its departments, agencies, or entities, its officers, employees, or agents, or any other person.
6. The City of Boston CIO is authorized and directed to regularly consult with experts, thought leaders and key stakeholders for the purpose of exploring options for the implementation of policies and practices arising under or related to this Executive Order.
“The federal government can now unlock the collaborative “genius” of citizens and communities to make public services easier to access and understand with a new free social media platform launched by GSA today at the Federal #SocialGov Summit on Entrepreneurship and Small Business,” writes Justin Herman, federal social media manager.
“News Genius, an annotation wiki based on Rap Genius now featuring federal-friendly Terms of Service, allows users to enhance policies, regulations and other documents with in-depth explanations, background information and paths to more resources. In the hands of government managers it will improve public services through citizen feedback and plain language, and will reduce costs by delivering these benefits on a free platform that doesn’t require a contract.”
This could be a significant improvement in making complicated policy documents and regulations understandable to the governed. While plain writing is indispensable for open government and mandated by law and regulation, it is hardly practiced uniformly in Washington.
If people can understand more about what a given policy, proposed rule or regulation actually says, they may well be more likely to participate in the process of revising it. We’ll see if people adopt the tool, but on balance, that sounds like a step ahead.
Another recent example comes from DOBTCO founder and CEO Clay Johnson, who memorably put Rap Genius to good use last year decoding testimony on Healthcare.gov.
Video of the talk is below, along with the slides I used. You can view all of the videos from the workshop, along with the public plenary on Monday evening, on YouTube or at the workshop page.
Here’s the presentation, with embedded hyperlinks to the organizations, projects and examples discussed:
For more on the “Second Machine Age” referenced in the title, read the new book by Erik Brynjolfsson and Andrew McAfee.