If knowledge is power, ignorance is impotence. Citizens, consumers, investors, and patients all need trustworthy information when we vote, make purchasing decisions, buy stocks or other assets, or choose a surgeon, medical device, nursing home, or dialysis center. That’s why …
The White House has launched COVIDTests.gov, which the Biden administration says will enable every home in the U.S. to order 4 free at-home COVID-19 tests through the mail, starting on January 19th — with no shipping costs or credit card required. Ideally, the administration will also allow Americans to request the high-quality masks President Biden said the US government would distribute through COVIDTests.gov as well.
As with Vaccines.gov, there’s a tremendous amount riding on the Biden administration delivering on this new service. Hundreds of millions of Americans REALLY need this administration to deliver on sending free tests and masks through the mail to the people who ask for them right now. This will be a simple but profound interaction.
If the White House can pull it off, it could rebuild public trust during a profoundly uneasy time, delivering the tests and masks that — with vaccines — would enable us all to navigate out of the pandemic together.
If COVIDTests.gov is overwhelmed by demand, flooding attacks, or automated fraud, or if test distribution is botched by the U.S. Postal Service, public trust could erode further.
Last week, Politico reporter Ben Leonard reached out and asked a series of questions about whether the Biden administration would be able to deliver on a website to request rapid tests, raising concerns about whether this would be a repeat of the Healthcare.gov debacle of almost a decade ago. The answers below lay out the case for why this White House is likely to succeed.
What were the key failures of healthcare.gov?
The public and major media focused a lot on the technology angle because healthcare.gov is a website, and it’s true that technical and design decisions caused major problems at the relaunch, when the Department of Health and Human Services moved it from being a glossy brochure to being a marketplace for health insurance. Lack of beta testing. Hosting that couldn’t scale to meet demand. Incomplete integration between federal agency systems. Artificial bottlenecks in the marketplace flow.
That all led to a crisis when Americans began trying to use the site, in no small part because Healthcare.gov wasn’t iteratively built or tested “in the open” using modern software development practices, with the people it was meant to serve.
Much like the endemic IT failures that have bedeviled big state and federal projects for years, however, the fundamental problems stemmed from:
- Poor project management and oversight
- A government procurement system that builds and buys software like buildings and cars
- Outsourcing huge contracts to systems integrators and contractors
- Challenges recruiting and retaining technologists who must be sitting at the table from the beginning of a complex project
The team that rescued the site in the winter of 2013 was able to address many of these failures and then founded the U.S. Digital Service based upon these insights and the inspiration of Gov.uk in the United Kingdom.
- How can the Biden administration avoid them this time around?
The White House can give the U.S. Digital Service and 18F (the software development organization inside of the General Services Administration, which could be involved) the resources and cover to do their best work. (Ideally, they will “show their work” as they go, too.) That means strong product management, iterative development, regular check-ins, designers and technologists in government, and working with best-in-class technology partners who understand how to build and scale modern responsive websites.
It’s also worth noting that building a website to request a COVID-19 test is not the same technical challenge as one that had to tie into the IRS to check eligibility for subsidies and complete a secure transaction. Failure would still be consequential, but the bar is much lower, as are the risks surrounding errors.
- What were the key issues with Trump’s promised national COVID-19 screening website? What lessons can be taken from it?
Former President Trump was unfit to lead a coordinated national response to a pandemic and uninterested in building out the national testing infrastructure that showed his lies about the prevalence of a deadly airborne virus to be false.
This promised website was vaporware, not a serious project that can or should be compared to past or present .gov efforts. The Trump White House never built or delivered anything after the President engaged in misleading hyperbole in the Rose Garden.
Instead of convening technologists, designers, and project managers from relevant agencies and private sector in a Manhattan Project for testing or grand national challenge, however, Trump misled the public by claiming that Google was already working on it. There’s no “there there” to compare.
- Vaccines.gov seemed to roll out more smoothly. Why do you think that happened and what lessons can be drawn?
I’d chalk that up to:
- Competent leadership at the U.S. Digital Service involved from the outset in coordinating and managing the project, top-down cover from the Oval Office
- Subject matter expertise from the medical professionals who built VaccineFinder.org
- Deep technical expertise and capacity to deliver at scale from private sector companies involved
- All-out, mission-driven effort from many patriots inside and outside of government committed to connecting Americans with vaccines. A thread: https://twitter.com/digiphile/status/1388495731422543877
We all would know a lot more about what worked and why if the Biden administration had narrated its work in the open, held a press conference about Vaccines.gov, and taken questions instead of giving the news to Bloomberg on background.
- How big of a technical task do you think putting together this website will be?
If the functionality of COVIDTests.gov is limited to someone requesting a test be delivered to a given address, I don’t think that’s a big task for the U.S. government in 2022, even under heavy demand. If COVIDTests.gov needs to authenticate someone and create accounts to prevent fraud or abuse, that will be a bit harder.
My hope is that we’ll see a lightweight website that uses Login.gov and a shortcode (say, text your zipcode to GETTST) that will enable people to quickly and easily request tests from a smartphone, along with a package of better, medical-grade masks for themselves and their children that President Biden announced today would be made available for free to all Americans.
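To make the "request a test for a given address" case concrete, here is a minimal sketch in Python of how a service might accept one order per address without requiring accounts. Everything in it is an assumption for illustration: the one-order-per-address rule, the `normalize_address` helper, and the in-memory `OrderBook` class are hypothetical and say nothing about how COVIDTests.gov is actually built.

```python
import re


def normalize_address(street: str, zip_code: str) -> str:
    """Collapse case, punctuation, and spacing so trivial variants of one address match.

    Hypothetical helper: a real service would more likely rely on a postal
    address validation service than on string cleanup.
    """
    cleaned = re.sub(r"[^a-z0-9 ]", "", street.lower())
    cleaned = re.sub(r"\s+", " ", cleaned).strip()
    # Keep only the 5-digit ZIP so "20001" and "20001-1234" compare equal.
    return f"{cleaned}|{zip_code.strip()[:5]}"


class OrderBook:
    """In-memory record of which addresses have already ordered (illustration only)."""

    def __init__(self) -> None:
        self._seen = set()

    def request_tests(self, street: str, zip_code: str) -> bool:
        """Return True if the order is accepted, False if the address already ordered."""
        key = normalize_address(street, zip_code)
        if key in self._seen:
            return False
        self._seen.add(key)
        return True
```

A rule like this avoids logins entirely, at the cost of being easy to evade and unfair to multi-unit buildings that share a street address, which is part of why authentication makes the task "a bit harder."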
Last month, I traveled to Moldova to speak at a “smart society” summit hosted by the Moldovan national e-government center and the World Bank. I talked about what I’ve been seeing and reporting on around the world and some broad principles for “smart government.” It was one of the first keynote talks I’ve ever given and, from what I gather, it went well: the Moldovan government asked me to give a reprise to their cabinet and prime minister the next day.
I’ve embedded the entirety of the morning session above, including my talk (which is about half an hour long). I was preceded by professor Beth Noveck, the former deputy CTO for open government at The White House. If you watch the entire program, you’ll hear from:
- Victor Bodiu, General Secretary, Government of the Republic of Moldova, National Coordinator, Governance e-Transformation Agenda
- Dona Scola, Deputy Minister, Ministry of Information Technology and Communication
- Andrew Stott, UK Transparency Board, former UK Government Director for Transparency and Digital Engagement
- Victor Bodiu, General Secretary, Government of the Republic of Moldova
- Arcadie Barbarosie, Executive Director, Institute of Public Policy, Moldova
Without planning on it, I managed to deliver a one-liner that morning that’s worth rephrasing and reiterating here: Smart government should not just serve citizens with smartphones.
I look forward to your thoughts and comments, for those of you who make it through the whole keynote.
On Friday night, a packed room of eager potential entrepreneurs, developers and curious citizens watched US CTO Todd Park and Bill Eggers kick off Startup Weekend DC in Microsoft’s offices in Chevy Chase, Maryland.
— Alex Howard (@digiphile) June 15, 2012
Park brought his customary energy and geeky humor to his short talk, pitching the assembled crowd on using open government data in their ideas.
— Alex Howard (@digiphile) June 15, 2012
Park wants to inject open data as a “fuel” into the economy. After talking about the success of the Health Data Initiative and the Health Datapalooza, he shared a series of websites where aspiring entrepreneurs could find data to use:
Park also made an “ask” of the attendees of Startup Weekend DC that I haven’t heard from many government officials: he requested that they let him know if they A) use the data or B) run into any trouble accessing it.
“If you had a hard time or found a particular restful API moving, let me know,” he said. “It helps us improve our performance.” And then he gave out his email address at the White House Executive Office of the President, as he did at SXSW Interactive in Austin in March of this year. Asking the public for feedback on data quality, particularly entrepreneurs and developers, and providing contact information to do so is, to put it bluntly, something every city and state official that has stood up an open data platform could and should be doing. In this context, the US CTO has set a notable example for the country.
Examples of startups, gap filling and civic innovation
Following Park, author and Deloitte consultant Bill Eggers talked about innovative startups and the public sector. I’ve embedded video of his talk below:
Eggers cited three different startups in his talk: Recycle Bank, Avego and Kaggle.
1) The outcome of Recycle Bank’s influence was a 19-fold increase in recycling in some cities from gamification, said Eggers. The startup has 3 million members and is now setting its sights on New York City.
2) The real-time ridesharing provided by Avego holds the promise to hugely reduce traffic congestion, said Eggers. According to the stats he cited, 80% of people on the road are currently driving in cars by themselves. Avego has raised tens of millions of dollars to try to better optimize transportation.
3) Anthony Goldbloom found a hole in the big data market at Kaggle, said Eggers, where they’re matching data challenges with data scientists. There are now some 19,000 registered data scientists in the Kaggle database.
Eggers cited the success of a competition to map dark matter on Kaggle, a problem that had had millions spent on it. The results of open innovation here were better than science had been able to achieve prior to the competition. Kaggle has created a market out of writing better algorithms.
— Alex Howard (@digiphile) June 15, 2012
After Eggers spoke, the organizers of Startup Weekend explained how the rest of the weekend would proceed and asked attendees to pitch their ideas. One particular idea, for this correspondent, stood out, primarily because of the young fellows pitching it:
— Alex Howard (@digiphile) June 16, 2012
In 2012, making sense of big data through narrative and context, particularly unstructured data, is now a strategic imperative for leaders around the world, whether they serve in Washington, run media companies or trading floors in New York City or guide tech titans in Silicon Valley.
While big data carries the baggage of huge hype, the institutions of federal government are getting serious about its genuine promise. On Thursday morning, the Obama Administration announced a “Big Data Research and Development Initiative,” with more than $200 million in new commitments. (See the fact sheet provided by the White House Office of Science and Technology Policy at the bottom of this post.)
“In the same way that past Federal investments in information-technology R&D led to dramatic advances in supercomputing and the creation of the Internet, the initiative we are launching today promises to transform our ability to use Big Data for scientific discovery, environmental and biomedical research, education, and national security,” said Dr. John P. Holdren, Assistant to the President and Director of the White House Office of Science and Technology Policy, in a prepared statement.
The research and development effort will focus on advancing “state-of-the-art core technologies” needed for big data, harnessing said technologies “to accelerate the pace of discovery in science and engineering, strengthen our national security, and transform teaching and learning,” and “expand the workforce needed to develop and use Big Data technologies.”
In other words, the nation’s major research institutions will focus on improving available technology to collect and use big data, apply them to science and national security, and look for ways to train more data scientists.
“IBM views Big Data as organizations’ most valuable natural resource, and the ability to use technology to understand it holds enormous promise for society at large,” said David McQueeney, vice president of software, IBM Research, in a statement. “The Administration’s work to advance research and funding of big data projects, in partnership with the private sector, will help federal agencies accelerate innovations in science, engineering, education, business and government.”
While $200 million is a relatively small amount of funding, particularly in the context of the federal budget or as compared to investments that are (probably) being made by Google or other major tech players, specific support for training and subsequent application of big data within federal government is important and sorely needed. The job market for data scientists in the private sector is so hot that government may well need to build up its own internal expertise, much in the same way Living Social is training coders at the Hungry Academy.
“Big data is a big deal,” blogged Tom Kalil, deputy director for policy at White House OSTP, at the White House blog this morning.
We also want to challenge industry, research universities, and non-profits to join with the Administration to make the most of the opportunities created by Big Data. Clearly, the government can’t do this on its own. We need what the President calls an “all hands on deck” effort.
Some companies are already sponsoring Big Data-related competitions, and providing funding for university research. Universities are beginning to create new courses—and entire courses of study—to prepare the next generation of “data scientists.” Organizations like Data Without Borders are helping non-profits by providing pro bono data collection, analysis, and visualization. OSTP would be very interested in supporting the creation of a forum to highlight new public-private partnerships related to Big Data.
The White House is hosting a forum today in Washington to explore the challenges and opportunities of big data and discuss the investment. The event will be streamed online in a live webcast from the headquarters of the AAAS in Washington, DC. I’ll be in attendance and sharing what I learn.
“Researchers in a growing number of fields are generating extremely large and complicated data sets, commonly referred to as ‘big data,'” reads the invitation to the event from the White House Office of Science and Technology Policy. “A wealth of information may be found within these sets, with enormous potential to shed light on some of the toughest and most pressing challenges facing the nation. To capitalize on this unprecedented opportunity — to extract insights, discover new patterns and make new connections across disciplines — we need better tools to access, store, search, visualize, and analyze these data.”
- John Holdren, Assistant to the President and Director, White House Office of Science and Technology Policy
- Subra Suresh, Director, National Science Foundation
- Francis Collins, Director, National Institutes of Health
- William Brinkman, Director, Department of Energy Office of Science
- Moderator: Steve Lohr, New York Times, author of “Big Data’s Impact in the World”
- Alex Szalay, Johns Hopkins University
- Lucila Ohno-Machado, UC San Diego
- Daphne Koller, Stanford
- James Manyika, McKinsey
What is big data?
Big data is data that exceeds the processing capacity of conventional database systems. The data is too big, moves too fast, or doesn’t fit the strictures of your database architectures. To gain value from this data, you must choose an alternative way to process it.
The hot IT buzzword of 2012, big data has become viable as cost-effective approaches have emerged to tame the volume, velocity and variability of massive data. Within this data lie valuable patterns and information, previously hidden because of the amount of work required to extract them. To leading corporations, such as Walmart or Google, this power has been in reach for some time, but at fantastic cost. Today’s commodity hardware, cloud architectures and open source software bring big data processing into the reach of the less well-resourced. Big data processing is eminently feasible for even the small garage startups, who can cheaply rent server time in the cloud.
Teams of data scientists are increasingly leveraging a powerful, growing set of common tools, whether they’re employed by government technologists opening cities, developers driving a revolution in healthcare or hacks and hackers defining the practice of data journalism.
To learn more about the growing ecosystem of big data tools, watch my interview with Cloudera architect Doug Cutting, embedded below. Cutting created Lucene and led the Hadoop project at Yahoo before he joined Cloudera. Apache Hadoop is an open source framework that allows distributed applications based upon the MapReduce paradigm to run on immense clusters of commodity hardware, which in turn enables the processing of massive amounts of big data.
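For readers unfamiliar with the MapReduce paradigm mentioned above, the following is a toy, single-machine sketch in plain Python. It does not use Hadoop or any of its APIs; the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are made up for illustration, and the point is only to show the map, group-by-key, and reduce steps that a framework like Hadoop distributes across many machines.

```python
from collections import defaultdict
from itertools import chain


def map_phase(document):
    """Map step: emit a (word, 1) pair for every word in one document."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    """Shuffle step: group all emitted values by their key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce step: collapse the values for one key into a single count."""
    return key, sum(values)


def word_count(documents):
    """Run map, shuffle, and reduce in sequence on a list of documents."""
    mapped = chain.from_iterable(map_phase(doc) for doc in documents)
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())
```

Because each call to `map_phase` depends only on its own document and each call to `reduce_phase` depends only on its own key, both steps can in principle run in parallel on separate machines, which is the property that lets the approach scale to clusters of commodity hardware.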
Details on the administration’s big data investments
A fact sheet released by the White House OSTP follows, verbatim:
“National Science Foundation and the National Institutes of Health – Core Techniques and Technologies for Advancing Big Data Science & Engineering
“Big Data” is a new joint solicitation supported by the National Science Foundation (NSF) and the National Institutes of Health (NIH) that will advance the core scientific and technological means of managing, analyzing, visualizing, and extracting useful information from large and diverse data sets. This will accelerate scientific discovery and lead to new fields of inquiry that would otherwise not be possible. NIH is particularly interested in imaging, molecular, cellular, electrophysiological, chemical, behavioral, epidemiological, clinical, and other data sets related to health and disease.
National Science Foundation: In addition to funding the Big Data solicitation, and keeping with its focus on basic research, NSF is implementing a comprehensive, long-term strategy that includes new methods to derive knowledge from data; infrastructure to manage, curate, and serve data to communities; and new approaches to education and workforce development. Specifically, NSF is:
· Encouraging research universities to develop interdisciplinary graduate programs to prepare the next generation of data scientists and engineers;
· Funding a $10 million Expeditions in Computing project based at the University of California, Berkeley, that will integrate three powerful approaches for turning data into information – machine learning, cloud computing, and crowd sourcing;
· Providing the first round of grants to support “EarthCube” – a system that will allow geoscientists to access, analyze and share information about our planet;
· Issuing a $2 million award for a research training group to support training for undergraduates to use graphical and visualization techniques for complex data.
· Providing $1.4 million in support for a focused research group of statisticians and biologists to determine protein structures and biological pathways.
· Convening researchers across disciplines to determine how Big Data can transform teaching and learning.
Department of Defense – Data to Decisions: The Department of Defense (DoD) is “placing a big bet on big data” investing approximately $250 million annually (with $60 million available for new research projects) across the Military Departments in a series of programs that will:
· Harness and utilize massive data in new ways and bring together sensing, perception and decision support to make truly autonomous systems that can maneuver and make decisions on their own.
· Improve situational awareness to help warfighters and analysts and provide increased support to operations. The Department is seeking a 100-fold increase in the ability of analysts to extract information from texts in any language, and a similar increase in the number of objects, activities, and events that an analyst can observe.
To accelerate innovation in Big Data that meets these and other requirements, DoD will announce a series of open prize competitions over the next several months.
In addition, the Defense Advanced Research Projects Agency (DARPA) is beginning the XDATA program, which intends to invest approximately $25 million annually for four years to develop computational techniques and software tools for analyzing large volumes of data, both semi-structured (e.g., tabular, relational, categorical, meta-data) and unstructured (e.g., text documents, message traffic). Central challenges to be addressed include:
· Developing scalable algorithms for processing imperfect data in distributed data stores; and
· Creating effective human-computer interaction tools for facilitating rapidly customizable visual reasoning for diverse missions.
The XDATA program will support open source software toolkits to enable flexible software development for users to process large volumes of data in timelines commensurate with mission workflows of targeted defense applications.
National Institutes of Health – 1000 Genomes Project Data Available on Cloud: The National Institutes of Health is announcing that the world’s largest set of data on human genetic variation – produced by the international 1000 Genomes Project – is now freely available on the Amazon Web Services (AWS) cloud. At 200 terabytes – the equivalent of 16 million file cabinets filled with text, or more than 30,000 standard DVDs – the current 1000 Genomes Project data set is a prime example of big data, where data sets become so massive that few researchers have the computing power to make best use of them. AWS is storing the 1000 Genomes Project as a publically available data set for free and researchers only will pay for the computing services that they use.
Department of Energy – Scientific Discovery Through Advanced Computing: The Department of Energy will provide $25 million in funding to establish the Scalable Data Management, Analysis and Visualization (SDAV) Institute. Led by the Energy Department’s Lawrence Berkeley National Laboratory, the SDAV Institute will bring together the expertise of six national laboratories and seven universities to develop new tools to help scientists manage and visualize data on the Department’s supercomputers, which will further streamline the processes that lead to discoveries made by scientists using the Department’s research facilities. The need for these new tools has grown as the simulations running on the Department’s supercomputers have increased in size and complexity.
US Geological Survey – Big Data for Earth System Science: USGS is announcing the latest awardees for grants it issues through its John Wesley Powell Center for Analysis and Synthesis. The Center catalyzes innovative thinking in Earth system science by providing scientists a place and time for in-depth analysis, state-of-the-art computing capabilities, and collaborative tools invaluable for making sense of huge data sets. These Big Data projects will improve our understanding of issues such as species response to climate change, earthquake recurrence rates, and the next generation of ecological indicators.”
Further details about each department’s or agency’s commitments can be found at the following websites by 2 pm today:
IBM infographic on big data
This post and headline have been updated as more information on the big data R&D initiative became available.
From healthcare to finance to emergency response, data holds immense potential to help citizens and government. Putting data to work for the public good, however, will require data journalists to apply the powerful emerging tools in the newsroom stack to the explosion of information from government, business and their fellow citizens. The promise of data journalism has been a strong theme throughout the National Institute for Computer-Assisted Reporting’s (NICAR) 2012 conference.
It was in that context that I presented on “Open Data Journalism” this morning, which, to paraphrase Jonathan Stray, I’d define as obtaining, reporting upon, curating and publishing open data in the public interest. My slides, which broadly describe what I’m seeing in the world of open government today, are embedded below.
Comments welcome, as ever.
Update: In the context of fauxpen data, beware “openwashing:” Simply opening up data is not a replacement for a Constitution that enforces a rule of law, free and fair elections, an effective judiciary, decent schools, basic regulatory bodies or civil society — particularly if the data does not relate to meaningful aspects of society. Adopting open data and digital government reforms is not quite the same thing as good government, although they certainly can be and are related, in some cases.
If a country launches an open data platform but deprecates freedom of the press or assembly, questions freedom of information laws or restricts the ability of government scientists to speak to the public, is it adopting “open government” — or doing something else?
If you’re a regular reader of Govfresh or the O’Reilly Radar, you know how the chief technology officer of the U.S. Department of Health and Human Services, Todd Park, is focused on unleashing the power of open data to improve health. If you aren’t familiar with this story, go read Simon Owen’s excellent feature article that explores his work on revolutionizing the healthcare industry. Part of unlocking innovation through open health data has been a relentless promotion and evangelization of the data that HHS has to venture capitalists, the healthcare industry and developers. It was in that context that Park visited New York’s Hacks and Hackers meetup today. The video of the meeting is embedded below, including a lengthy question and answer period at the end.
NYC Hacks and Hackers co-organizer Chrys Wu was kind enough to ask my questions, posed over Twitter. Here were the answers I pulled out from the video above:
How much data has been released? Park: “A ton.” He pointed to HealthData.gov as a scorecard and said that HHS isn’t just releasing brand new data. They’re “also making existing data truly accessible or usable,” he said. They’re taking “stuff that’s in a book or website and turning it into machine readable data or an API.”
What formats? Park: Lots and lots of different formats. “Some people put spreadsheets online, other people actually create open APIs and open services,” he said. “We’re trying to migrate people as much towards open API as possible.”
Impact to date? “The best quantification that I can articulate is the Health Datapalooza,” he said. “50 companies and nonprofits updated and deployed new versions of their platforms and services. The data is already helping millions of Americans in all kinds of ways.”
Park emphasized that it’s still quite early for the project, at only 18 months in. He also emphasized that the work isn’t just about data: it’s about how and where it’s used. “Data by itself isn’t useful. You don’t go and download data and slather data on yourself and get healed,” he said. “Data is useful when it’s integrated with other stuff that does useful jobs for doctors, patients and consumers.”
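Park’s point about moving from spreadsheets toward machine-readable data can be illustrated with a small Python sketch that turns spreadsheet-style CSV text into JSON records, the kind of payload an open API might return. This is an assumption-laden illustration, not a description of how HHS publishes data: the function names and any column names passed in are hypothetical.

```python
import csv
import io
import json


def csv_to_records(csv_text):
    """Parse CSV text with a header row into a list of dictionaries, one per row."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [dict(row) for row in reader]


def records_to_json(records):
    """Serialize the records as JSON, as a hypothetical API response body might."""
    return json.dumps(records, indent=2)
```

Note that `csv.DictReader` leaves every value as a string, so a real publishing pipeline would also need to decide on types, units, and documentation for each field, which is where much of the work of making data "truly accessible or usable" lies.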
What if open health data were to be harnessed to spur better healthcare decisions and catalyze the extension or creation of new businesses? That potential future exists now, in the present.
Todd Park, chief technology officer of the Department of Health and Human Services, has been working to unlock innovation through open health data for over a year now. On many levels, the effort is the best story in federal open data. In the video below, he talks with my publisher, Tim O’Reilly, about collaboration and innovation in the healthcare system.
The next big event in this space is on June 9 at the NIH. If you’re interested in what’s next for open health data, track this event closely.
[Hat tip: PharmFresh]