Obama administration announces new initiatives to release and apply open energy data

Posted on May 28, 2014 by Alex

As part of today’s Energy DataPalooza, the White House published a blog post and fact sheet that detailed new initiatives and data releases. Here’s the rundown, all quoted right from the document:

The Department of Energy announced that its Buildings Performance Database has exceeded a milestone of 750,000 building records, making it the world’s largest public database of real buildings’ energy performance information.
The Energy Department launched a SunShot Catalyst prize challenge.
The Department of Energy launched a National Geothermal Data System, a “resource that contains enough raw geoscience data to pinpoint elusive sweet spots of geothermal energy deep in the earth, enabling researchers and commercial developers to find the most promising areas for geothermal energy. Access to this data will reduce costs and risks of geothermal electricity production and, in turn, accelerate its deployment.”
The Department of Energy released a study “which identified 65-85 gigawatts of untapped hydropower potential in the United States. Accompanying the release of this report, Oak Ridge National Laboratory has released detailed data resulting from this study.”
Energy Secretary Ernie Moniz announced that WattBuddy won the Department of Energy’s “Apps for Energy” contest, the second part of its year-long American Energy Data Challenge.
The U.S. Environmental Protection Agency (EPA) released the AVoided Emissions and geneRation Tool (AVERT), “a free software tool designed to help state and local air quality planners evaluate county-level emissions displaced at electric power plants by efficiency and renewable energy policies and programs.”
7 new utilities and state-wide energy efficiency programs adopted the Green Button standard, including Seattle City Light, Los Angeles Department of Water and Power, Green Mountain Power, Wake Electric, Hawaiian Electric Company, Maui Electric Company, Hawai’i Electric Light Company, and Hawaii Energy.
Pivotal Labs collaborated with NIST and EnergyOS to create OpenESPI, an open source implementation of the Green Button standard.
7 electric utilities “agreed to the development and use of a voluntary open standard for the publishing of power outage and restoration information. The commitment of utilities to publish their already public outage information as a structured data in an easy-to-use and common format, in a consistent location, will make it easier for a wide set of interested parties—including first responders, public health officials, utility operations and mutual assistance efforts, and the public at large—to make use of and act upon this important information, especially during times of natural disaster or crisis.” iFactor Consulting will support it and, notably, Google will use the data in its Crisis Maps.
Philadelphia, San Francisco and Washington D.C. will use the Department of Energy’s open source Standard Energy Efficiency Data (SEED) platform to publish data collected through benchmarking disclosure of building energy efficiency.

Harvard Law study finds Supreme Court editing its decisions without notice

Posted on May 25, 2014 by Alex

This morning, Adam Liptak reported at the New York Times that the Supreme Court has been quietly editing its legal decisions without notice or indication. According to Richard J. Lazarus, a law professor at Harvard whom Liptak interviewed about a new study examining the issue, these revisions include “truly substantive changes in factual statements and legal reasoning.”

The court does warn readers that early versions of its decisions, available at the courthouse and on the court’s website, are works in progress. A small-print notice says that “this opinion is subject to formal revision before publication,” and it asks readers to notify the court of “any typographical or other formal errors.”

But aside from announcing the abstract proposition that revisions are possible, the court almost never notes when a change has been made, much less specifies what it was. And many changes do not seem merely typographical or formal.

Four legal publishers are granted access to “change pages” that show all revisions. Those documents are not made public, and the court refused to provide copies to The New York Times.

The Supreme Court secretly editing the legal record seems like a big deal to me. (Lawyers, professors, court reporters, tell me I’m wrong!)
To me, this story highlights the need for, and eventually the use of, data and software to track the changes in a public, online record of Supreme Court decisions.

Static PDFs that are edited without notice, data or indication of changes don’t seem good enough for the judicial branch of a constitutional republic in the 21st century.

Just as the U.S. Code and state and local codes are constantly being updated and consulted by lawyers, courts and the people, the Supreme Court’s decisions could be published and maintained online as a body of living law at SupremeCourt.gov so that they may be read and consulted by all.

Embedded and integrated into those decisions and codes would be a record of the changes to them, the “meta data” of the actions of the legislative organ of the republic.
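To make the idea of a change record concrete, here is a minimal sketch of how software could surface revisions between two versions of an opinion. It uses Python's standard difflib module. The function name and the sample sentences are hypothetical and invented for illustration; they are not drawn from any real opinion, and this is not how the court or its publishers actually track changes.

```python
import difflib


def describe_changes(old_text: str, new_text: str,
                     old_label: str = "slip opinion",
                     new_label: str = "revised opinion") -> list[str]:
    """Return a unified diff between two versions of an opinion's text."""
    old_lines = old_text.splitlines()
    new_lines = new_text.splitlines()
    # lineterm="" keeps the header lines free of trailing newlines,
    # since splitlines() already stripped them from the content.
    return list(difflib.unified_diff(old_lines, new_lines,
                                     fromfile=old_label, tofile=new_label,
                                     lineterm=""))


# Invented sample text, not taken from any real opinion.
SLIP = "The statute applies to all carriers.\nThe judgment is affirmed."
REVISED = "The statute applies to most carriers.\nThe judgment is affirmed."
```

Calling describe_changes(SLIP, REVISED) returns lines prefixed with "-" for removed text and "+" for added text, which is the kind of record that could be published alongside each revised decision.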

What you’ll find now at SupremeCourt.gov is a significant improvement over past years. Future versions, however, might be even better.

District of Columbia to experiment with collaborative lawmaking online

Posted on May 16, 2014 by Alex

Residents of the District of Columbia now have a new way to comment on proposed legislation before the City Council, MadisonDC. Today, David Grosso, a DC Councilman-at-Large, introduced the new initiative to collaboratively draft laws online in a release and video on YouTube.

“As we encourage more public engagement in the legislative process, I hope D.C. residents will take a moment to log onto the Madison project,” said Councilmember Grosso. “I look forward to seeing the public input on my proposed bills.”

MadisonDC has its roots in the first Congressional hackathon, back in 2011. The event spawned a beta version of the Madison Project, an online platform where lawmakers could crowdsource legislative markup. It was deployed first by the office of Representative Darrell Issa, crowdsourcing comments on several bills. The code was subsequently open sourced and now has been deployed by the OpenGov Foundation as a way to publish municipal codes online, along with other uses.

“We are excited to support Councilmember Grosso’s unprecedented efforts to welcome residents – and their ideas – directly into the local lawmaking process,” said Seamus Kraft, co-founder & executive director of The OpenGov Foundation, on the nonprofit organization’s blog. “But what really matters is that we’re going to produce better City Council bills, with fewer frustrations and unintended consequences. These three bills are only a start. The ultimate goal of MadisonDC is transforming D.C.’s entire policymaking machine for the Internet Age, creating an end-to-end, on-demand collaboration ecosystem for both citizens and city officials. The possibilities are limitless.”

The first three bills on MadisonDC are the D.C. Urban Farming and Food Security Act of 2014, the Marijuana Legalization and Regulation Act of 2013, and the Open Primary Elections Amendment Act of 2014.

The DC Open Government Office at the city’s Board of Ethics and Government Accountability commended the effort with a tweet:

DCs #OpenGov office commends @cmdgrosso @FoundOpenGov 4 launch of #MadisonDC-free tool 4 citizen participation in gov http://t.co/5ISqx0LFMW

— DCOPENGOV (@DCOPENGOV) May 16, 2014

Councilman Grosso further engaged the public on Twitter this afternoon, inviting public comment on his proposed legislation.

@maarston Here's the link, please add these comments to the bill directly! http://t.co/zDel2fYQSk Thanks for your interest! #opengov

— David Grosso (@cmdgrosso) May 16, 2014

This post has been updated to include more statements and social media updates.

FAQ on Net Neutrality and FCC NPRM on Proposed Open Internet Rules

Posted on May 15, 2014 by Alex

This morning, the Federal Communications Commission (FCC) voted 3-2 to approve a Notice of Proposed Rulemaking (NPRM) on Open Internet Rules.

[Hearing Video] [Statements: Wheeler | Clyburn | Rosenworcel | Pai | O’Rielly]

The commission has released a fact sheet on the rules to the media and published the release online as a .doc, .pdf, HTML and .txt. I’ve published the FCC FAQ below and linked to the NPRM on Protecting and Promoting the Open Internet.

The FCC asks the public several important questions in this NPRM:

Should the FCC bar paid prioritization completely?
Should the FCC apply Open Internet rules to mobile broadband Internet service, not just fixed broadband Internet?
Should the FCC reclassify broadband Internet service as a telecommunications service under Title II of the Communications Act?

For more background on net neutrality, read:

Brian Fung at the Washington Post
Gautham Nagesh at the Wall Street Journal, including this net neutrality primer
Stacey Higginbotham at GigaOm on the problem with this network neutrality compromise, including her useful timeline of net neutrality policy at the FCC, going back to 2004 when then-FCC Chairman Michael Powell gave a speech outlining four “Internet Freedoms.”
Jon Brodkin’s analysis of the law of the land at Ars Technica
Mark Coddington’s excellent digest at the Nieman Lab, which provides even more context for the origins of net neutrality and what’s next over the rest of the year:

Earlier in the week, The Wall Street Journal reported that Wheeler was planning on revising his proposed rules to ensure that non-paying companies’ content wouldn’t be put at an unfair disadvantage. Stanford’s Barbara van Schewick and Morgan Weiland called the reported revisions a good start, with a long way yet to go. In a Twitter chat, an FCC representative gave some more details about the proposal and said the commission is still considering regulating Internet access like a utility. Columbia professor Tim Wu and TechFreedom’s Berin Szoka debated that prospect in The Wall Street Journal.

Opposition to the plan hasn’t reached a fever pitch, but it is building. Quartz’s Yitz Jordan looked at the way net neutrality has united leftist and corporate tech interests, and The New York Times’ David Carr said he’s betting on the Silicon Valley powers aligning against the FCC plan over the Beltway establishment backing the proposals. The New York Times profiled an intellectual leader of the net neutrality movement, Columbia’s Tim Wu, and Time and CNET talked to one of the most prominent voices against the plan in Washington, Sen. Al Franken.

As of May 15, the FCC Open Internet docket has received 21,017 comments. More are sure to pour in over the next four months, given the FCC’s controversial proposal to allow paid prioritization.

You can see other comments and submit your own comments at Docket 14-28 or email them to openinternet@fcc.gov. Your email address will then become part of the Open Internet Rule docket.

Update: The White House released a statement on network neutrality from the press secretary:

The President has made clear since he was a candidate that he strongly supports net neutrality and an open Internet. As he has said, the Internet’s incredible equality – of data, content, and access to the consumer – is what has powered extraordinary economic growth and made it possible for once-tiny sites like eBay or Amazon to compete with brick and mortar behemoths.

The FCC is an independent agency, and we will carefully review their proposal. The FCC’s efforts were dealt a real challenge by the Court of Appeals in January, but Chairman Wheeler has said his goal is to preserve an open Internet, and we are pleased to see that he is keeping all options on the table. We will be watching closely as the process moves forward in hopes that the final rule stays true to the spirit of net neutrality.

The President is looking at every way to protect a free and open Internet, and will consider any option that might make sense.

FACT SHEET: Protecting and Promoting the Open Internet

May 15, 2014

The Internet is a vital platform for innovation, economic growth and free expression in America. And yet, despite two prior FCC attempts, there are no rules on the books to prevent broadband providers from limiting Internet openness by blocking content or discriminating against consumers and entrepreneurs online. The “Protecting and Promoting the Open Internet” Notice of Proposed Rulemaking (NPRM) begins the process of closing that gap, which was created in January 2014 when the D.C. Circuit struck down key FCC Open Internet rules.

This Notice seeks public comment on the benefits of applying Section 706 of the Telecommunications Act of 1996 and Title II of the Communications Act, including the benefits of one approach over the other, to ensure the Internet remains an open platform for innovation and expression. While the Notice reflects a tentative conclusion that Section 706 presents the quickest and most resilient path forward per the court’s guidance, it also makes clear that Title II remains a viable alternative and asks specifically which approach is better. In addition, the proposal asks whether paid prioritization arrangements, or “fast lanes,” can be banned outright.

We Are Listening: An Extended Four-Month Public Comment Period is Open

Since February, tens of thousands of Americans have offered their views to the Commission on how to protect an Open Internet. The proposal reflects the substantial public input we have received. The Commission wants to continue to hear from Americans across the country throughout this process. An extended four-month public comment period on the Commission’s proposal will be opened on May 15 – 60 days (until July 15) to submit initial comments and another 57 days (until September 10) for reply comments.
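As a quick check on the comment windows described above, here is a small sketch that counts calendar days between the stated dates using Python's standard datetime module. The constant and function names are hypothetical. Counted this way, the reply window comes to 57 days, as stated, while the initial window comes to 61 calendar days, so the “60 days” figure appears to be rounded or counted on a different basis.

```python
from datetime import date

# Dates as stated in the fact sheet text above.
COMMENT_PERIOD_OPENS = date(2014, 5, 15)
INITIAL_COMMENTS_DUE = date(2014, 7, 15)
REPLY_COMMENTS_DUE = date(2014, 9, 10)


def days_between(start: date, end: date) -> int:
    """Number of calendar days from start to end."""
    return (end - start).days


# Length of each window, measured in calendar days.
initial_window = days_between(COMMENT_PERIOD_OPENS, INITIAL_COMMENTS_DUE)
reply_window = days_between(INITIAL_COMMENTS_DUE, REPLY_COMMENTS_DUE)
total_window = days_between(COMMENT_PERIOD_OPENS, REPLY_COMMENTS_DUE)
```

The total of 118 days is consistent with the “four-month” description of the comment period.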

The NPRM seeks comment on a number of questions designed to:

Develop the Strongest Legal Framework for Enforceable Rules of the Road

Reflects the principles that Chairman Wheeler outlined in February, including using the Section 706 blueprint for restoring the Open Internet rules offered by the D.C. Circuit in its decision in Verizon v. FCC, which relies on the FCC’s legal authority under Section 706 of the Telecommunications Act of 1996. At the same time, the Commission will seriously consider the use of Title II of the Communications Act as the basis for legal authority.
Seeks comment on the benefits of both Section 706 and Title II, including the benefits of one approach over the other to ensure the Internet remains an open platform for innovation and expression.
Explores other available sources of legal authority, including also Title III for wireless services. The Commission seeks comment on the best ways to define, prevent, expose and punish the practices that threaten an Open Internet.

Ensure choices for consumers and opportunity for innovators

Proposes a requirement that all users must have access to fast and robust service: Broadband consumers must have access to the content, services and applications they desire. Innovators and edge providers must have access to end-users so they can offer new products and services.
Considers ensuring that these standards of service evolve to keep pace with innovation.

Prevent practices that can threaten the Open Internet

Asks if paid prioritization should be banned outright.
Promises clear rules of the road and aggressive enforcement to prevent unfair treatment of consumers, edge providers and innovators.
Includes a rebuttable presumption* that exclusive contracts that prioritize service to broadband affiliates are unlawful.

(*Rebuttable presumption is a presumption that is taken to be true unless someone comes forward to contest it and proves otherwise)

Expand transparency

Enhance the transparency rules to provide increased and specific information about broadband providers’ practices for edge providers, consumers.
Asks whether broadband providers should be required to disclose specific network practices, performance characteristics (e.g., effective upload and download speeds, latency and packet loss) and/or terms and conditions of service to end users (e.g., data caps).
Tentatively concludes that broadband providers should disclose “meaningful information” about the service, including (1) tailored disclosures to end users, (2) congestion that may adversely impact the experience of end users, including at interconnection points, and (3) information about new practices, like any paid prioritization, to the extent that it is otherwise permitted.

Protect consumers, innovators and startups through new rules and effective enforcement

Proposes the creation of an ombudsperson with significant enforcement authority to serve as a watchdog and advocate for start-ups, small businesses and consumers.
Seeks comment on how to ensure that all parties, and especially small businesses and start-ups, have effective access to the Commission’s dispute resolution and enforcement processes.
Considers allowing anonymous reporting of violations to alleviate fears by start-ups of retribution from broadband providers.

Consider the Impact on the Digital Divide: Ensuring access for all communities

Considers the impact of the proposals on groups who disproportionately use mobile broadband service.
Asks whether any parts of the nation are being left behind in the deployment of new broadband networks, including rural America and parts of urban America.

Link to Chairman Wheeler’s February Open Internet framework: http://www.fcc.gov/document/statement-fcc-chairman-tom-wheeler-fccs-open-internet-rules

Comment on the Open Internet proposals: http://www.fcc.gov/comments

This post has been updated with additional statements and revised as the FCC put more documents online.

[FAQ] How do I download a tax transcript from IRS.gov?

Posted on May 12, 2014 by Alex

UPDATE: This service was taken offline after IRS security was compromised.

In January 2014, the IRS quietly introduced a new feature at IRS.gov that enabled Americans to download their tax transcript over the Internet. Previously, filers could request a copy of the transcript (not the full return) but had to wait 5-10 business days to receive it in the mail. For people who needed more rapid access for applications, the delay could be critical.

What’s a tax transcript?

It’s a list of the line items that you entered onto your federal tax return (Form 1040), as it was originally filed to the IRS.

Wait, we couldn’t already download a transcript like this in 2014?

Nope. Previously, filers could request a copy of the transcript (not the full return) but they would have to wait 5-10 business days to receive it in the mail.

Why did this happen now?

The introduction of the IRS feature coincided with a major Department of Education event focused on opening up such data. A U.S. Treasury official said that the administration was doing that to make it “easier for student borrowers to access tax records he or she might need to submit loan applications or grant applications.”

Why would someone want their tax transcript?

As the IRS itself says, “IRS transcripts are often used to validate income and tax filing status for mortgage applications, student and small business loan applications, and during tax preparation.” It’s pretty useful.

OK, so what do I do to download my transcript?

Visit “get transcript” and register online. You’ll find that the process is very similar to setting up online access for a bank account. You’ll need to choose a pass phrase, pass image and security questions, and then answer a series of questions about your life, like where you’ve lived. If you write them down, store them somewhere safe and secure offline, perhaps with your birth certificate and other sensitive documents.

Wait, what? That sounds like a lot of private information.

True, but remember: the IRS already has a lot of private data about you. These questions are designed to prevent someone else from setting up a fake account in your name and stealing your data. If you’re uncomfortable with answering these questions, you can request a print version of your transcript. To do so, you’ll need to enter your Social Security number, date of birth and street address online. If you’re still uncomfortable doing so, you can visit or contact the IRS in person.

So is this safe?

It’s probably about as safe as doing online banking. Virtually nothing you do online is without risk. Make sure you 1) go to the right website 2) connect securely and 3) protect the transcript, just as you would paper tax records. Here’s what the IRS told me about their online security:

“The IRS has made good progress on oversight and enhanced security controls in the area of information technology. With state-of-the-art technology as the foundation for our portal (e.g. irs.gov), we continue to focus on protecting the PII of all taxpayers when communicating with the IRS.

However, security is a two-way street with both the IRS and users needing to take steps for a secure experience. On our end, our security is comparable to leaders in private industry.

Our IRS2GO app has successfully completed a security assessment and received approval to launch by our cybersecurity organization after being scanned for weaknesses and vulnerabilities.

Any personally identifiable information (PII) or sensitive information transmitted to the IRS through IRS2Go for refund status or tax record requests uses secure communication channels that meet or exceed federal requirements for encryption. No PII is passed back to the taxpayer through IRS2GO and no PII is stored on the smartphone by the application.

When using our popular “Where’s My Refund?” application, taxpayers may notice just a few of our security measures. The URL for Where’s My Refund? begins with https. Just like in private industry, the “s” is a key indicator that a web user should notice indicating you are in a “secure session.” Taxpayers may also notice our message that we recommend they close their browser when finished accessing your refund status.

As we become a more mobile society and able to link to the internet while we’re on the go, we remind taxpayers to take precautions to protect themselves from being victimized, including using secure networks, firewalls, virus protection and other safeguards.

We always recommend taxpayers check with the Federal Trade Commission for the latest on reporting incidents of identity theft. You can find more information on our website, including tips if you believe you have become the victim of identity theft.”

What do I do with the transcript?

If you download tax transcripts or personal health information to a mobile device, laptop, tablet or desktop, install passcodes and full disk encryption, where available, on every machine it’s on. Leaving your files unprotected on computers connected to the Internet is like leaving the door to your house unlocked with your tax returns and medical records on the kitchen table.

I got an email from the IRS that asks me to email them personal information to access my transcript. Is this OK?

Nope! Don’t do it: it’s not them. The new functionality will likely inspire criminals to create mockups of the government website that look similar and then send phishing emails to consumers, urging them to “log in” to fake websites. You should know that IRS “does not send out unsolicited e-mails asking for personal information.” If you receive such an email, consider reporting the phishing to the IRS. Start at www.irs.gov/Individuals/Get-Transcript every time.
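The “go to the right website” and “connect securely” checks can be sketched in a few lines of Python using the standard urllib.parse module. The function name and the rule that only irs.gov and its subdomains count as official are assumptions made for this example. Passing a check like this is no guarantee that a page is legitimate; it only catches links that use plain http or point at a lookalike domain.

```python
from urllib.parse import urlsplit

# Assumption for this sketch: only irs.gov and its subdomains count as official.
OFFICIAL_DOMAIN = "irs.gov"


def looks_like_official_irs_url(url: str) -> bool:
    """Return True only for https links whose host is irs.gov or a subdomain."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    # Match the exact domain or a true subdomain, so that a host such as
    # "www.irs.gov.example.com" is rejected.
    on_official_domain = (host == OFFICIAL_DOMAIN
                          or host.endswith("." + OFFICIAL_DOMAIN))
    return parts.scheme == "https" and on_official_domain
```

For instance, looks_like_official_irs_url("https://www.irs.gov/Individuals/Get-Transcript") returns True, while the same address over plain http, or a lookalike host such as www.irs.gov.example.com, returns False.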

I tried to download my transcript but it didn’t work. What the heck?

You’re not alone. I had trouble using an Apple computer. Others have had technical issues as well.

Here’s what the IRS told me: “As a web application Get Transcript is supported on most modern OS/browser combinations. While there may be intermittent issues due to certain end-user configurations, IRS has not implemented any restrictions against certain browsers or operating systems. We are continuing to work open issues as they are identified and validated.”

“(A side note: For the best user experience, taxpayers may want to try up-to-date versions of Internet Explorer and a supported version of Microsoft Windows; however, that is certainly not a requirement.)”

What does that mean, in practice? That not all modern OS/browser combinations are supported, potentially including OS X and Android, and that the IRS digital staff knows it and is working to improve, although they aren’t telling IRS.gov users which versions of IE, Windows or other browsers and operating systems are presently supported and which are not.

Unfortunately, ongoing security issues with Internet Explorer mean that in 2014, we have the uncomfortable situation where the Department of Homeland Security is recommending that people avoid using Internet Explorer while the IRS recommends that its customers choose it for the “best experience.”

Given the comments from frustrated users, the IRS could and should do better on all counts.

Will I be able to file my tax return directly to the government through IRS.gov now?

You can already file your federal tax return online. According to the IRS, almost 120 million people used IRS e-file last year.

Well, OK, but shouldn’t having a user account and years of returns make it easier to file without a return at all?

It could. As you may know, other countries already have “return-free filing,” where a taxpayer can go online, log in and access a pre-populated tax return, see what the government estimates he or she owes, make any necessary adjustments, and file.

Wait, that sounds pretty good. Why doesn’t the USA have return-free filing yet?

Yes, it does sound good. As ProPublica reported last year, “the concept has been around for decades and has been endorsed by both President Ronald Reagan and a campaigning President Obama.”

ProPublica also reported that both H&R Block and Intuit, the maker of TurboTax, have lobbied against free and simple tax filing in Washington, given that it’s in their economic self-interest to do so:

In its latest annual report filed with the Securities and Exchange Commission, however, Intuit also says that free government tax preparation presents a risk to its business. Roughly 25 million Americans used TurboTax last year, and a recent GAO analysis said the software accounted for more than half of individual returns filed electronically. TurboTax products and services made up 35 percent of Intuit’s $4.2 billion in total revenues last year. Versions of TurboTax for individuals and small businesses range in price from free to $150.

What are the chances return-free filing could be on IRS.gov soon?

Hard to say, but the IRS told me that something that sounds like a precursor to return-free filing is on the table. According to the agency, “the IRS is considering a number of new proposals that may become a part of the online services roadmap some time in the future. This may include a taxpayer account where up to date status could be securely reviewed by the account owner.”

Creating the ability for people to establish secure access to IRS.gov to review and download tax transcripts is a big step in that direction. Whether the IRS takes any more steps soon is more of a political and policy question than a technical one, although the details of the latter matter.

Is the federal government offering other services like this for other agencies or personal data?

The Obama administration has been steadily modernizing government technology, although progress has been uneven across agencies. While the woes of Healthcare.gov attracted a lot of attention, many federal agencies have improved how they deliver services over the Internet. One of the themes of the administration’s digital government approach is “smart disclosure,” a form of targeted transparency in which people are offered the opportunity to download their own data, or data about them, from government or commercial services. The Blue Button is an example of this approach that has the potential to scale nationally.

U.S. publishes new “Open Data Action Plan,” announces new data releases

Posted on May 9, 2014 by Alex

On the one-year anniversary of President Barack Obama’s historic executive order to open up more government data, U.S. chief information officer Steven VanRoekel and U.S. chief technology officer Todd Park described “continued progress and plans for open government data” at the WhiteHouse.gov blog:

Freely available data from the U.S. government is an important national resource, serving as fuel for entrepreneurship, innovation, scientific discovery, and economic growth. Making information about government operations more readily available and useful is also core to the promise of a more efficient and transparent government. This initiative is a key component of the President’s Management Agenda and our efforts to ensure the government is acting as an engine to expand economic growth and opportunity for all Americans. The Administration is committed to driving further progress in this area, including by designating Open Data as one of our key Cross-Agency Priority Goals.

Over the past few years, the Administration has launched a number of Open Data Initiatives aimed at scaling up open data efforts across the Health, Energy, Climate, Education, Finance, Public Safety, and Global Development sectors. The White House has also launched Project Open Data, designed to share best practices, examples, and software code to assist federal agencies with opening data. These efforts have helped unlock troves of valuable data—that taxpayers have already paid for—and are making these resources more open and accessible to innovators and the public.

Other countries are also opening up their data. In June 2013, President Obama and other G7 leaders endorsed the Open Data Charter, in which the United States committed to publish a roadmap for our nation’s approach to releasing and improving government data for the public. Building upon the Administration’s Open Data progress, and in fulfillment of the Open Data Charter, today we are excited to release the U.S. Open Data Action Plan.

The new Open Data Action Plan (which was, ironically, released as a glossy PDF*, as opposed to a more machine-readable format) details a number of significant steps, including:

Many releases of new data and improved access to existing databases. These include more climate data, adding an API to Smithsonian artwork and the Small Business Administration’s database of suppliers and making data available for re-use. *Late in the day, with a “thanks to the open data community for their vigilance,” The White House posted the list of “high value data sets” in the plan as a .CSV.
A roadmap with deadlines for the release of these datasets over the course of 2014-2015. Some data releases are already online, like Medicare physician payment data. I’ve created an online spreadsheet that should act as a dashboard for U.S. National Open Data Action Plan Deadlines.
A policy that “new data sets will be prioritized for release based on public feedback.”
New open data projects at federal agencies, each of which will be led by a Presidential Innovation Fellow. According to the plan, the agencies will include NOAA, the Census Bureau, NASA, IRS, Interior, Labor, Energy and HHS.

Compliance with the executive order on open data has been mixed, as the Sunlight Foundation detailed last December. While all executive branch agencies were required to develop a machine-readable catalog of their open data at [department].gov/data.json and stand up /developer pages, it took until February 2014 for all Cabinet agencies to publish their open data inventories. (The government shutdown was a factor in the delay.)
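To show what a machine-readable catalog makes possible, here is a minimal sketch that reads a data.json-style catalog and lists its dataset titles. The sample records are invented, and the layout is an assumption: the sketch accepts either a bare list of dataset records or an object that wraps the list under a "dataset" key, and it assumes each record carries a "title" field. A real agency catalog should be checked against the published schema before relying on any of these names.

```python
import json

# Invented sample catalog, standing in for a file served at
# [department].gov/data.json. Not real agency data.
SAMPLE_CATALOG = """
[
  {"title": "Example dataset A"},
  {"title": "Example dataset B"}
]
"""


def dataset_titles(catalog_json: str) -> list[str]:
    """List the titles found in a data.json-style catalog."""
    catalog = json.loads(catalog_json)
    # Assumption: the catalog is either a bare list of records or an
    # object holding that list under a "dataset" key.
    if isinstance(catalog, dict):
        datasets = catalog.get("dataset", [])
    else:
        datasets = catalog
    # Skip malformed records rather than failing on them.
    return [record["title"] for record in datasets
            if isinstance(record, dict) and "title" in record]
```

Running dataset_titles(SAMPLE_CATALOG) returns the two example titles. Pointing the same function at the text of each agency's catalog would make it easy to compare what has actually been published.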

The federal government’s progress on this open data action plan is likely to be similar, much as it has been for the past five years under the Obama administration: variable across agencies, with delays in publishing, issues in quality and carve outs for national security, particularly with respect to defense and intelligence agencies. That said, progress is progress: many of the open data releases detailed in the plan have already occurred.

If the American people, press, Congress and public worldwide wish to see whether the administration is following through on some of its transparency promises, they can do so by visiting agency websites and the federal open data repository, Data.gov, which will celebrate its fifth anniversary next week.

Former New York City mayor Mike Bloomberg is fond of quoting William Edwards Deming: “In God we trust. All others bring data.” Given historic lows in trust in government, the only way the Obama administration will make progress on that front is if they actually release more of it.

[Image Credit: Eric Fischer/Flickr]

From broadband maps to Data.gov, WordPress looks to power more open source government

Posted on May 7, 2014 by Alex

#OSS RT @nickgernert: Inaugural WordPress for Government & Enterprise event kicks off with @photomatt & @digiphile pic.twitter.com/4c7Htlqg5k

— Alex Howard (@digiphile) May 6, 2014

I had a blast interviewing Matt Mullenweg, the co-creator of WordPress and CEO of Automattic, last night at the inaugural WordPress and government meetup in DC. UPDATE: Video of our interview and the Q&A that followed is embedded below:

WordPress code powers some 60 million websites, including 22% of the top 10 million sites on the planet and .gov platforms like Broadbandmap.gov. Mullenweg was, by turns, thoughtful, geeky and honest about open source and giving hundreds of millions of people free tools to express themselves, and quietly principled with respect to the corporate values of an organization spread across 35 countries, government censorship and the ethics of transparency.

60% of Web doesn’t use a CMS, says @photomatt. 78% of top 10MM websites aren’t on @WordPress. Hopes to change both. pic.twitter.com/mMqv15l1ba

— Alex Howard (@digiphile) May 7, 2014

After Mullenweg finished taking questions from the meetup, Data.gov architect Philip Ashlock gave a presentation on how the staff working on the federal government’s open data platform are using open source software to design, build, publish and collaborate, from WordPress to CKAN to Github issue tracking.

We’re supporting a government-wide effort to manage data as an asset, says @philipashlock, of http://t.co/Djm3iKByc7 pic.twitter.com/FeSFiJWf5h

— Alex Howard (@digiphile) May 7, 2014

Making private issue tracking at @usdatagov public: http://t.co/t6uXoImA6R #opengov #oss pic.twitter.com/zP6G6IIGc7

— Alex Howard (@digiphile) May 7, 2014

http://t.co/9LvdNKigf1 provides “WordPress-as-a-Service” to the U.S. federal government, says @philipashlock #oss pic.twitter.com/lQUXD8tQPQ

— Alex Howard (@digiphile) May 7, 2014

United States federal government use of crowdsourcing grows six-fold since 2011

Posted on May 7, 2014 by Alex

Citizensourcing and open innovation can work in the public sector, just as crowdsourcing can in the private sector. Around the world, the use of prizes to spur innovation has been booming for years. The United States of America has been significantly scaling up its use of prizes and challenges to solve grand national challenges since January 2011, when President Obama signed an updated version of the America COMPETES Act into law.

According to the third congressionally mandated report released by the Obama administration today (PDF/Text), the number of prizes and challenges conducted under the America COMPETES Act has increased by 50% since 2012, 85% since 2012, and nearly six-fold overall since 2011. Twenty-five different federal agencies offered prizes under COMPETES in fiscal year 2013, with 87 prize competitions in total. The size of the prize purses has also grown, with 11 challenges over $100,000 in 2013. Nearly half of the prizes conducted in FY 2013 were focused on software, including applications, data visualization tools, and predictive algorithms. Challenge.gov, the award-winning online platform for crowdsourcing national challenges, now has tens of thousands of users who have participated in more than 300 public-sector prize competitions. Beyond the growth in prize numbers and amounts, the Obama administration highlighted four trends in public-sector prize competitions:

New models for public engagement and community building during competitions
Growth in software and information technology challenges, with nearly 50% of the total prizes in this category
More emphasis on sustainability and “creating a post-competition path to success”
Increased focus on identifying novel approaches to solving problems

The growth of open innovation in and by the public sector was directly enabled by Congress and the White House, working together for the common good. Congress reauthorized COMPETES in 2010 with an amendment to Section 105 of the act that added a Section 24 on “Prize Competitions,” providing all agencies with the authority to conduct prizes and challenges that only NASA and DARPA had previously enjoyed. The White House Office of Science and Technology Policy (OSTP) has been guiding its implementation and advising on the use of challenges and prizes to promote open government.

“This progress is due to important steps that the Obama Administration has taken to make prizes a standard tool in every agency’s toolbox,” wrote Cristin Dorgelo, assistant director for grand challenges in OSTP, in a WhiteHouse.gov blog post on engaging citizen solvers with prizes:

In his September 2009 Strategy for American Innovation, President Obama called on all Federal agencies to increase their use of prizes to address some of our Nation’s most pressing challenges. Those efforts have expanded since the signing of the America COMPETES Reauthorization Act of 2010, which provided all agencies with expanded authority to pursue ambitious prizes with robust incentives.

To support these ongoing efforts, OSTP and the General Services Administration have trained over 1,200 agency staff through workshops, online resources, and an active community of practice. And NASA’s Center of Excellence for Collaborative Innovation (COECI) provides a full suite of prize implementation services, allowing agencies to experiment with these new methods before standing up their own capabilities.

Sun Microsystems co-founder Bill Joy famously once said that “No matter who you are, most of the smartest people work for someone else.” This rings true, in and outside of government. The idea of governments using prizes like this to inspire technological innovation, however, is neither reliant on Web services and social media nor born from the fertile mind of a Silicon Valley entrepreneur. As the introduction to the third White House prize report notes:

“One of the most famous scientific achievements in nautical history was spurred by a grand challenge issued in the 18th Century. The issue of safe, long distance sea travel in the Age of Sail was of such great importance that the British government offered a cash award of £20,000 pounds to anyone who could invent a way of precisely determining a ship’s longitude. The Longitude Prize, enacted by the British Parliament in 1714, would be worth some £30 million pounds today, but even by that measure the value of the marine chronometer invented by British clockmaker John Harrison might be a deal.”

Centuries later, the Internet, World Wide Web, mobile devices and social media offer the best platforms in history for this kind of approach to solving grand challenges and catalyzing civic innovation, helping public officials and businesses find new ways to solve old problems. When a new idea, technology or methodology challenges and improves upon existing processes and systems, it can improve the lives of citizens or the function of the society that they live within.

“Open innovation or crowdsourcing or whatever you want to call it is real, and is (slowly) making inroads into mainstream (i.e. non high-tech) corporate America,” said MIT principal research professor Andrew McAfee, in an interview in 2012. “P&G is real. Innocentive is real. Kickstarter is real. Idea solicitations like the ones from Starbucks are real, and lead-user innovation is really real.”

Prizes and competitions all rely upon the same simple idea behind efforts like the X-Prize: tapping into the distributed intelligence of humans using a structured methodology. This might include distributing work, in terms of completing a given task or project, or soliciting information about how to design a process, product or policy.

Over the past decade, experiments with this kind of civic innovation around the world have been driven by tight budgets and increased demands for services, and enabled by the increased availability of inexpensive, lightweight tools for collaborating with connected populations. The report claimed that crowdsourcing can save federal agencies significant taxpayer dollars, citing an example of a challenge where the outcome cost a sixth of the estimated total of a traditional approach.

One example of a cost-effective prize program is the Medicaid Provider Screening Challenge that was offered by the Centers for Medicare & Medicaid Services (CMS) as part of a pilot designed in partnership with states and other stakeholders. This prize program was a series of software development challenges designed to improve capabilities for streamlining operations and screening Medicaid providers to reduce fraud and abuse. With a total prize purse of $500,000, the challenge series is leading to the development of an open source multi-state, multi-program provider screening shared-service software program capable of risk scoring, credential validation, identity authentication, and sanction checks, while lowering the burden on providers and reducing administrative and infrastructure expenses for states and Federal programs. CMS partnered with the NASA Center of Excellence for Collaborative Innovation (COECI), NASA’s contractor Harvard Business School, Harvard’s subcontractor TopCoder, and the State of Minnesota. The State of Minnesota is working on full deployment of the software, and CMS is initiating a campaign to encourage other states to leverage the software. COECI estimates that the cost of designing and building the portal through crowdsourcing was one-sixth of what the effort would have cost using traditional software development methods. Through the success of this and subsequent challenges, CMS is attempting to establish a new paradigm for crowdsourcing state and Federal information technology (IT) systems in a low-cost, agile manner by opening challenges to new players, small companies, and talented individual developers to build solutions which can “plug and play” with existing legacy systems or can operate in a shared, cloud-based environment.

As is always the nature of experiments, many early attempts failed. A few have worked and subsequently grown into sustainable applications, services, data sources, startups, processes and knowledge that can be massively scaled. Years ago, Micah Sifry predicted that the “gains from enabling a culture of open challenges, outsider innovation and public participation” in government were going to be huge. He was right.

Linked below are the administration’s official letters to the House and Senate, reporting the results of last year’s prizes.

COMPETES FY2013PrizesReport HOUSE Letter (PDF)/COMPETES FY2013PrizesReport HOUSE Letter (Text)

COMPETES FY2013 PrizesReport SENATE Letter (PDF)/COMPETES FY2013 PrizesReport SENATE Letter (Text)

[Image Credit: NASA SDO. Context: Solar flare predictive algorithm challenge]

Maryland Governor Martin O’Malley asks Reddit to ‘Ask Me Anything’

Posted on May 5, 2014 by Alex

I generally agree with the assessment of the Washington Post, with respect to how well Maryland Governor Martin O’Malley’s “Ask Me Anything” on Reddit went for him, though I give him far more credit for venturing onto the unruly social news platform than the reporter did. The Post’s report that he only answered 5 questions was just plain incorrect.

O’Malley answered 19 questions this morning, not 5, including a (short) answer to a question on mental health that the Post said went unanswered. That fact could be easily and quickly ascertained by clicking on GovMartinOMalley, the username he used for the AMA. (An editor made multiple corrections and updates to the Post’s story after I pointed that out.)

He subsequently logged back on in the afternoon to answer more questions, rebutting the Post’s assessment and that of a user: “I don’t know, I’m having fun! This is my first AMA. I had to step away to sign a bunch of bills, and I’m glad to be back,” he commented.

He answered at least one tough question (from a questioner who appears to have joined Reddit today) after doing so, although the answer hasn’t been highly rated:

@bmoreprogressive91: Thanks for doing an AMA. Just one question: How does the Maryland healthcare exchange, which cost taxpayers $90 million to implement before your administration found that it would be cheaper (at an additional $40-50 million) to just replace it than to fix it, show that your Administration has been effectively using taxpayer dollars to better the lives of individual citizens?

http://www.washingtonpost.com/local/md-politics/md-spent-90-million-on-health-exchange-technology-according-to-cost-breakdown/2014/04/18/5f2e7600-c722-11e3-8b9a-8e0977a24aeb_story.html

O’Malley: No one was more frustrated than I was about the fact that our health exchange website didn’t work properly when we launched. But our health exchange is more than a web site, and we worked hard to overcome the technical problems. We have enrolled about 329,000 people thus far, exceeding the goal we set of 260,000. I often say that we haven’t always succeeded at first, but we have never given up. We learn from both success and failure.

By the end of the day, Maryland’s governor had answered 36 questions in total. (You can read a cleanly formatted version of O’Malley’s AMA at Interview.ly.) Reddit users rated the quality of some answers much higher than others, with the most popular answer, “Yes,” coming in response to whether he would support a constitutional amendment to reverse the Citizens United decision by the Supreme Court.

To be fair — and reasonable observers should be — Reddit’s utility for extracting answers from a politician isn’t so great, as Alexis Madrigal pointed out after President Barack Obama did an AMA, back in 2012. That said, I’m generally supportive of elected leaders engaging directly with constituents online using the tools and platforms that citizens are active upon themselves.

Popular questions that go unanswered can be instructive and offer some insight into what issues a given politician would rather not talk about in public. As such, they’re fine fodder for media to report upon. The record online, however, also means that when a reporter botches the job or misrepresents an interaction, question or answer, we can all see that, too.

Postscript: Andrew MacRae was critical of the governor and his team’s approach to Reddit and offered a tip for other politicians who venture onto the social news platform for an AMA. More on that in the embedded tweets, below:

@reddit @GovernorOMalley That would undermine the point of an AMA @digiphile . His team should have studied @Schwarzenegger for how to AMA

— Andrew MacRae (@IAmAMacRae) May 5, 2014

@digiphile @reddit recipe is simple. answer the tough q's immediately. bring data. demonstrate personality, it's not a speech it's a convo

— Andrew MacRae (@IAmAMacRae) May 5, 2014

This post was further updated after the Governor went back online in the afternoon.

[Image Credit: Governor O’Malley]

PCAST report on big data and privacy emphasizes value of encryption, need for policy

Posted on May 1, 2014 by Alex

This week, the President’s Council of Advisors on Science and Technology (PCAST) met to discuss and vote to approve a new report on big data and privacy.

UPDATE: The White House published the findings of its review on big data today, including the PCAST review of technologies underpinning big data (PDF), discussed below.

As White House special advisor John Podesta noted in January, the PCAST has been conducting a study “to explore in-depth the technological dimensions of the intersection of big data and privacy.” Earlier this week, the Associated Press interviewed Podesta about the results of the review, reporting that the White House had learned of the potential for discrimination through the use of data aggregation and analysis. These are precisely the privacy concerns that stem from data collection that I wrote about earlier this spring. Here’s the PCAST’s list of “things happening today or very soon” that provide examples of technologies that can have benefits but pose privacy risks:

Pioneered more than a decade ago, devices mounted on utility poles are able to sense the radio stations being listened to by passing drivers, with the results sold to advertisers.[26]
In 2011, automatic license-plate readers were in use by three quarters of local police departments surveyed. Within 5 years, 25% of departments expect to have them installed on all patrol cars, alerting police when a vehicle associated with an outstanding warrant is in view.[27] Meanwhile, civilian uses of license-plate readers are emerging, leveraging cloud platforms and promising multiple ways of using the information collected.[28]
Experts at the Massachusetts Institute of Technology and the Cambridge Police Department have used a machine-learning algorithm to identify which burglaries likely were committed by the same offender, thus aiding police investigators.[29]
Differential pricing (offering different prices to different customers for essentially the same goods) has become familiar in domains such as airline tickets and college costs. Big data may increase the power and prevalence of this practice and may also decrease even further its transparency.[30]
reSpace offers machine-learning algorithms to the gaming industry that may detect early signs of gambling addiction or other aberrant behavior among online players.[31]
Retailers like CVS and AutoZone analyze their customers’ shopping patterns to improve the layout of their stores and stock the products their customers want in a particular location.[32] By tracking cell phones, RetailNext offers bricks-and-mortar retailers the chance to recognize returning customers, just as cookies allow them to be recognized by on-line merchants.[33] Similar WiFi tracking technology could detect how many people are in a closed room (and in some cases their identities).
The retailer Target inferred that a teenage customer was pregnant and, by mailing her coupons intended to be useful, unintentionally disclosed this fact to her father.[34]
The author of an anonymous book, magazine article, or web posting is frequently “outed” by informal crowd sourcing, fueled by the natural curiosity of many unrelated individuals.[35]
Social media and public sources of records make it easy for anyone to infer the network of friends and associates of most people who are active on the web, and many who are not.[36]
Marist College in Poughkeepsie, New York, uses predictive modeling to identify college students who are at risk of dropping out, allowing it to target additional support to those in need.[37]
The Durkheim Project, funded by the U.S. Department of Defense, analyzes social-media behavior to detect early signs of suicidal thoughts among veterans.[38]
LendUp, a California-based startup, sought to use nontraditional data sources such as social media to provide credit to underserved individuals. Because of the challenges in ensuring accuracy and fairness, however, they have been unable to proceed.

The PCAST meeting was open to the public through a teleconference line. I called in and took rough notes on the discussion of the forthcoming report as it progressed. My notes on the comments of professors Susan Graham and Bill Press offer sufficient insight into the forthcoming report that I thought publishing them today was warranted, given the ongoing national debate regarding data collection, analysis, privacy and surveillance. The following should not be considered verbatim or an official transcript; for that, look for the PCAST to make a recording and transcript available online in the future, at its archive of past meetings. The emphases below are mine, as are the words in [brackets].

Susan Graham: Our charge was to look at the confluence of big data and privacy, to summarize current tech and the way technology is moving in the foreseeable future, including its influence on the way we think about privacy.

The first thing that’s very very obvious is that personal data in electronic form is pervasive. Traditional data that was in health and financial [paper] records is now electronic and online. Users provide info about themselves in exchange for various services. They use Web browsers and share their interests. They provide information via social media, Facebook, LinkedIn, Twitter. There is [also] data collected that is invisible, from public cameras, microphones, and sensors.

What is unusual about this environment and big data is the ability to do analysis in huge corpuses of that data. We can learn things from the data that allow us to provide a lot of societal benefits. There is an enormous amount of patient data, data about disease, and data about genetics. By putting it together, we can learn about treatment. With enough data, we can look at rare diseases, and learn what has been effective. We could not have done this otherwise.

We can analyze more online information about education and learning, not only MOOCs but lots of learning environments. [Analysis] can tell teachers how to present material effectively, to do comparisons about whether one presentation of information works better than another, or analyze how well assessments work with learning styles. Certain visual information is comprehensible, certain verbal information is hard to understand. Understanding different learning styles [can enable us to] develop customized teaching.

The reason this all works is the profound nature of analysis. This is the idea of data fusion, where you take multiple sources of information and combine them, which provides a much richer picture of some phenomenon. If you look at patterns of human movements on public transport, or pollution measures, or weather, maybe we can predict dynamics caused by human context.

We can use statistics to do statistics-based pattern recognition on large amounts of data. One of the things that we understand about this statistics-based approach is that it might not be 100% accurate if [you] map down to the individual providing data in these patterns. We have to be very careful not to make mistakes about individuals because we make [an inference] about a population.
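To illustrate the point about moving from a population to an individual, here is a minimal sketch in Python. Every number in it is hypothetical, chosen only to make the arithmetic visible, and the `screening_outcomes` helper is invented for this example rather than drawn from the PCAST report.

```python
def screening_outcomes(population, prevalence, sensitivity, specificity):
    """Expected counts when a population-level pattern is applied to individuals."""
    affected = population * prevalence
    unaffected = population - affected
    true_positives = affected * sensitivity
    false_positives = unaffected * (1 - specificity)
    flagged = true_positives + false_positives
    # Share of flagged individuals who really belong to the affected group.
    precision = true_positives / flagged if flagged else 0.0
    return true_positives, false_positives, precision

# Hypothetical inputs: 1,000,000 people, 1 in 1,000 affected,
# and a pattern that is right 99% of the time in both directions.
tp, fp, precision = screening_outcomes(1_000_000, 0.001, 0.99, 0.99)
print(round(tp), round(fp), round(precision, 3))
```

With these made-up inputs, roughly nine in ten of the individuals flagged would be false alarms, even though the pattern is 99% accurate about the population as a whole.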

How do we think about privacy? We looked at it from the point of view of harms. There are a variety of ways in which results of big data can create harm, including inappropriate disclosures [of personal information], potential discrimination against groups, classes, or individuals, and embarrassment to individuals or groups.

We turned to what tech has to offer in helping to reduce harms. We looked at a number of technologies in use now. We looked at a bunch coming down the pike. We looked at several technologies in use, some of which become less effective because of pervasiveness [of data] and depth of analytics.

We traditionally have controlled [data] collection. We have seen some data collection from cameras and sensors that people don’t know about. If you don’t know, it’s hard to control.

Tech creates many concerns. We have looked at methods coming down the pike. Some are more robust and responsive. We have a number of draft recommendations that we are still working out.

Part of privacy is protecting the data using security methods. That needs to continue. It needs to be used routinely. Security is not the same as privacy, though security helps to protect privacy. There are a number of approaches that are now used by hand that with sufficient research could be automated and used more reliably, so they scale.

There needs to be more research and education about privacy. Professionals need to understand how to treat privacy concerns anytime they deal with personal data. We need to create a large group of professionals who understand privacy, and privacy concerns, in tech.

Technology alone cannot reduce privacy risks. There has to be a policy as well. It was not our role to say what that policy should be. We need to lead by example by using good privacy protecting practices in what the government does and increasingly what the private sector does.

Bill Press: We tried throughout to think of scenarios and examples. There’s a whole chapter [in the report] devoted explicitly to that.

They range from things being done today, present technology, even though they are not all known to people, to our extrapolations to the outer limits, of what might well happen in next ten years. We tried to balance examples by showing both benefits, they’re great, and they raise challenges, they raise the possibility of new privacy issues.

In another aspect, in Chapter 3, we tried to survey technologies from both sides, with both tech going to bring benefits, those that will protect [people], and also those that will raise concerns.

In our technology survey, we were very much helped by the team at the National Science Foundation. They provided a very clear, detailed outline of where they thought that technology was going.

This was part of our outreach to a large number of experts and members of the public. That doesn’t mean that they agree with our conclusions.

Eric Lander: Can you take everybody through analysis of encryption? Are people using much more? What are the limits?

Graham: The idea behind classical encryption is that when data is stored, when it’s sitting around in a database, let’s say, encryption entangles the representation of the data so that it can’t be read without using a mathematical algorithm and a key to convert a seemingly meaningless set of bits into something reasonable.

The same technology, where you convert and change meaningless bits, is used when you send data from one place to another. So, if someone is scanning traffic on internet, you can’t read it. Over the years, we’ve developed pretty robust ways of doing encryption.

The weak link is that to use data, you have to read it, and it becomes unencrypted. Security technologists worry about it being read in the short time.

Encryption technology is vulnerable. The key that unlocks the data is itself vulnerable to theft or getting the wrong user to decrypt.

Both problems of encryption are active topics of research on how to use data without being able to read it. There is research on increasing the robustness of encryption, so if a key is disclosed, you haven’t lost everything and you can protect some of the data or future encryption of new data. This reduces risk a great deal and is important to use. Encryption alone doesn’t protect.
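Graham's description of keys, meaningless bits and limiting the damage of a disclosed key can be sketched in a few lines of Python. This is a toy one-time pad built only on the standard library, not a scheme the PCAST discussed and not something to deploy; real systems rely on vetted ciphers and key management. The sample records and the `xor_bytes` and `encrypt_record` helpers are hypothetical.

```python
import secrets

def xor_bytes(data, key):
    """XOR each data byte with the matching key byte (a one-time pad)."""
    if len(key) != len(data):
        raise ValueError("one-time pad key must match the data length")
    return bytes(d ^ k for d, k in zip(data, key))

def encrypt_record(plaintext):
    """Encrypt one record under its own fresh random key."""
    key = secrets.token_bytes(len(plaintext))
    return key, xor_bytes(plaintext, key)

# Two made-up records of equal length, each stored under a separate key.
records = [b"patient 17: condition A", b"patient 42: condition B"]
encrypted = [encrypt_record(r) for r in records]

# With the right key, the stored bits turn back into the record.
# At this point the data is readable in memory: the weak link noted above.
key0, ciphertext0 = encrypted[0]
recovered = xor_bytes(ciphertext0, key0)

# A disclosed key for record 0 does not unlock record 1.
key1, ciphertext1 = encrypted[1]
wrong = xor_bytes(ciphertext1, key0)

print(recovered == records[0], wrong == records[1])
```

Giving each record its own key is one simple way to picture the research goal mentioned above: losing one key exposes one record, not the whole database.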

Unknown Speaker: People read of breaches derived from security. I see a different set of issues of privacy from big data vs those in security. Can you distinguish them?

Bill Press: Privacy and security are different issues. Security is necessary to have good privacy in the technological sense: if communications are insecure, they clearly can’t be private. [Privacy] goes beyond that, to parties that are authorized, in a security sense, to see the information. Privacy is much closer to values. Security is much closer to protocols.

The interesting thing is that this is less about purely tech elements; everyone can agree on the right protocol, eventually. These are things that go beyond [that] and have to do with values.


E Pluribus Unum

Monthly Archives: May 2014
