Since I heard that last year’s Gov 2.0 and Open Government Events Calendar was useful to the broader community, here’s this year’s version. There will be many other places around the globe for people to gather, talk and learn about Gov 2.0 in 2012 — just take a look through the many Govloop event listings. There will be any number of citizen-generated unconferences and hackathons, where the attendees generate the program. They’ll include CityCamps, BarCamps, PodCamps or MobileCamps. Check out the CityCamp calendar to find one near you and keep an eye out for CityCamp meetups in February.
The following listings are by no means comprehensive but should serve as a starting point if you’re wondering what’s happening, when and where. If you know about more Gov 2.0 events that should be listed here, please let me know at alex@oreilly.com or @digiphile.
In 2012, making sense of big data, particularly unstructured data, through narrative and context is a strategic imperative for leaders around the world, whether they serve in Washington, run media companies or trading floors in New York City, or guide tech titans in Silicon Valley.
While big data carries the baggage of huge hype, the institutions of federal government are getting serious about its genuine promise. On Thursday morning, the Obama Administration announced a “Big Data Research and Development Initiative,” with more than $200 million in new commitments. (See the fact sheet provided by the White House Office of Science and Technology Policy at the bottom of this post.)
“In the same way that past Federal investments in information-technology R&D led to dramatic advances in supercomputing and the creation of the Internet, the initiative we are launching today promises to transform our ability to use Big Data for scientific discovery, environmental and biomedical research, education, and national security,” said Dr. John P. Holdren, Assistant to the President and Director of the White House Office of Science and Technology Policy, in a prepared statement.
The research and development effort will focus on advancing the “state-of-the-art core technologies” needed for big data, harnessing those technologies “to accelerate the pace of discovery in science and engineering, strengthen our national security, and transform teaching and learning,” and expanding “the workforce needed to develop and use Big Data technologies.”
In other words, the nation’s major research institutions will focus on improving the available technology to collect and use big data, applying it to science and national security, and finding ways to train more data scientists.
“IBM views Big Data as organizations’ most valuable natural resource, and the ability to use technology to understand it holds enormous promise for society at large,” said David McQueeney, vice president of software, IBM Research, in a statement. “The Administration’s work to advance research and funding of big data projects, in partnership with the private sector, will help federal agencies accelerate innovations in science, engineering, education, business and government.”
While $200 million is a relatively small amount of funding, particularly in the context of the federal budget or compared to the investments that are (probably) being made by Google and other major tech players, specific support for training and the subsequent application of big data within the federal government is important and sorely needed. The job market for data scientists in the private sector is so hot that government may well need to build up its own internal expertise, much as LivingSocial is training coders at its Hungry Academy.
“Big data is a big deal,” blogged Tom Kalil, deputy director for policy at White House OSTP, at the White House blog this morning.
We also want to challenge industry, research universities, and non-profits to join with the Administration to make the most of the opportunities created by Big Data. Clearly, the government can’t do this on its own. We need what the President calls an “all hands on deck” effort.
Some companies are already sponsoring Big Data-related competitions, and providing funding for university research. Universities are beginning to create new courses—and entire courses of study—to prepare the next generation of “data scientists.” Organizations like Data Without Borders are helping non-profits by providing pro bono data collection, analysis, and visualization. OSTP would be very interested in supporting the creation of a forum to highlight new public-private partnerships related to Big Data.
The White House is hosting a forum today in Washington to explore the challenges and opportunities of big data and discuss the investment. The event will be webcast live from the headquarters of the AAAS in Washington, D.C. I’ll be in attendance and sharing what I learn.
“Researchers in a growing number of fields are generating extremely large and complicated data sets, commonly referred to as ‘big data,'” reads the invitation to the event from the White House Office of Science and Technology Policy. “A wealth of information may be found within these sets, with enormous potential to shed light on some of the toughest and most pressing challenges facing the nation. To capitalize on this unprecedented opportunity — to extract insights, discover new patterns and make new connections across disciplines — we need better tools to access, store, search, visualize, and analyze these data.”
Speakers:
John Holdren, Assistant to the President and Director, White House Office of Science and Technology Policy
Subra Suresh, Director, National Science Foundation
Francis Collins, Director, National Institutes of Health
William Brinkman, Director, Department of Energy Office of Science
Big data is data that exceeds the processing capacity of conventional database systems. The data is too big, moves too fast, or doesn’t fit the strictures of your database architectures. To gain value from this data, you must choose an alternative way to process it.
The hot IT buzzword of 2012, big data has become viable as cost-effective approaches have emerged to tame the volume, velocity and variability of massive data. Within this data lie valuable patterns and information, previously hidden because of the amount of work required to extract them. To leading corporations, such as Walmart or Google, this power has been in reach for some time, but at fantastic cost. Today’s commodity hardware, cloud architectures and open source software bring big data processing into the reach of the less well-resourced. Big data processing is eminently feasible even for small garage startups, which can cheaply rent server time in the cloud.
To learn more about the growing ecosystem of big data tools, watch my interview with Cloudera architect Doug Cutting (@cutting), embedded below. Cutting created Lucene and led the Hadoop project at Yahoo before he joined Cloudera. Apache Hadoop is an open source framework that allows distributed applications based upon the MapReduce paradigm to run on immense clusters of commodity hardware, which in turn enables the processing of massive amounts of data.
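To make the MapReduce paradigm concrete, here is a minimal word-count sketch in Python, the canonical MapReduce example. It mimics Hadoop's map, shuffle/sort, and reduce phases in a single script so it runs locally without a cluster; on a real Hadoop deployment, the mapper and reducer would run as separate streaming tasks across many machines.

```python
#!/usr/bin/env python
# Minimal MapReduce-style word count. On Hadoop, the mapper and reducer
# would run as separate streaming tasks; here we chain them in one process.
import sys
from itertools import groupby

def mapper(lines):
    # Map phase: emit a (word, 1) pair for every word seen.
    for line in lines:
        for word in line.split():
            yield word.lower(), 1

def reducer(pairs):
    # Hadoop sorts mapper output by key before reducing; sorted() plays
    # that role here so the same reduce logic works locally.
    for word, group in groupby(sorted(pairs), key=lambda kv: kv[0]):
        yield word, sum(count for _, count in group)

if __name__ == "__main__":
    # Usage: cat some_text_file | python wordcount.py
    for word, total in reducer(mapper(sys.stdin)):
        print("%s\t%d" % (word, total))
```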
Details on the administration’s big data investments
A fact sheet released by the White House OSTP follows, verbatim:
“National Science Foundation and the National Institutes of Health – Core Techniques and Technologies for Advancing Big Data Science & Engineering
“Big Data” is a new joint solicitation supported by the National Science Foundation (NSF) and the National Institutes of Health (NIH) that will advance the core scientific and technological means of managing, analyzing, visualizing, and extracting useful information from large and diverse data sets. This will accelerate scientific discovery and lead to new fields of inquiry that would otherwise not be possible. NIH is particularly interested in imaging, molecular, cellular, electrophysiological, chemical, behavioral, epidemiological, clinical, and other data sets related to health and disease.
National Science Foundation: In addition to funding the Big Data solicitation, and in keeping with its focus on basic research, NSF is implementing a comprehensive, long-term strategy that includes new methods to derive knowledge from data; infrastructure to manage, curate, and serve data to communities; and new approaches to education and workforce development. Specifically, NSF is:
· Encouraging research universities to develop interdisciplinary graduate programs to prepare the next generation of data scientists and engineers;
· Funding a $10 million Expeditions in Computing project based at the University of California, Berkeley, that will integrate three powerful approaches for turning data into information – machine learning, cloud computing, and crowd sourcing;
· Providing the first round of grants to support “EarthCube” – a system that will allow geoscientists to access, analyze and share information about our planet;
· Issuing a $2 million award for a research training group to support training for undergraduates to use graphical and visualization techniques for complex data.
· Providing $1.4 million in support for a focused research group of statisticians and biologists to determine protein structures and biological pathways.
· Convening researchers across disciplines to determine how Big Data can transform teaching and learning.
Department of Defense – Data to Decisions: The Department of Defense (DoD) is “placing a big bet on big data,” investing approximately $250 million annually (with $60 million available for new research projects) across the Military Departments in a series of programs that will:
· Harness and utilize massive data in new ways and bring together sensing, perception and decision support to make truly autonomous systems that can maneuver and make decisions on their own.
· Improve situational awareness to help warfighters and analysts and provide increased support to operations. The Department is seeking a 100-fold increase in the ability of analysts to extract information from texts in any language, and a similar increase in the number of objects, activities, and events that an analyst can observe.
To accelerate innovation in Big Data that meets these and other requirements, DoD will announce a series of open prize competitions over the next several months.
In addition, the Defense Advanced Research Projects Agency (DARPA) is beginning the XDATA program, which intends to invest approximately $25 million annually for four years to develop computational techniques and software tools for analyzing large volumes of data, both semi-structured (e.g., tabular, relational, categorical, meta-data) and unstructured (e.g., text documents, message traffic). Central challenges to be addressed include:
· Developing scalable algorithms for processing imperfect data in distributed data stores; and
· Creating effective human-computer interaction tools for facilitating rapidly customizable visual reasoning for diverse missions.
The XDATA program will support open source software toolkits to enable flexible software development for users to process large volumes of data in timelines commensurate with mission workflows of targeted defense applications.
National Institutes of Health – 1000 Genomes Project Data Available on Cloud: The National Institutes of Health is announcing that the world’s largest set of data on human genetic variation – produced by the international 1000 Genomes Project – is now freely available on the Amazon Web Services (AWS) cloud. At 200 terabytes – the equivalent of 16 million file cabinets filled with text, or more than 30,000 standard DVDs – the current 1000 Genomes Project data set is a prime example of big data, where data sets become so massive that few researchers have the computing power to make best use of them. AWS is storing the 1000 Genomes Project as a publicly available data set for free, and researchers will pay only for the computing services that they use.
Department of Energy – Scientific Discovery Through Advanced Computing: The Department of Energy will provide $25 million in funding to establish the Scalable Data Management, Analysis and Visualization (SDAV) Institute. Led by the Energy Department’s Lawrence Berkeley National Laboratory, the SDAV Institute will bring together the expertise of six national laboratories and seven universities to develop new tools to help scientists manage and visualize data on the Department’s supercomputers, which will further streamline the processes that lead to discoveries made by scientists using the Department’s research facilities. The need for these new tools has grown as the simulations running on the Department’s supercomputers have increased in size and complexity.
US Geological Survey – Big Data for Earth System Science: USGS is announcing the latest awardees for grants it issues through its John Wesley Powell Center for Analysis and Synthesis. The Center catalyzes innovative thinking in Earth system science by providing scientists a place and time for in-depth analysis, state-of-the-art computing capabilities, and collaborative tools invaluable for making sense of huge data sets. These Big Data projects will improve our understanding of issues such as species response to climate change, earthquake recurrence rates, and the next generation of ecological indicators.”
Further details about each department’s and agency’s commitments will be available on their respective websites by 2 pm today.
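As a practical aside on the NIH item in the fact sheet: because the 1000 Genomes data lives in a public AWS bucket, anyone can browse it without an account. Here is a minimal sketch using the boto3 client; the bucket name "1000genomes" is my assumption based on the AWS public data set listing, so verify it before relying on it.

```python
import boto3
from botocore import UNSIGNED
from botocore.config import Config

# Anonymous (unsigned) S3 client: public data sets don't require credentials.
s3 = boto3.client("s3", config=Config(signature_version=UNSIGNED))

# "1000genomes" is an assumed bucket name from the AWS open data listing.
resp = s3.list_objects_v2(Bucket="1000genomes", MaxKeys=5)
for obj in resp.get("Contents", []):
    print(obj["Key"], obj["Size"])
```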
“I think that government is always going to need help, and that’s part of the message that we’re trying to spread… government not only will need help but will become an institution that lets people help, that encourages people to help out, and has a strong connection to the citizens it’s supposed to serve.” - Jennifer Pahlka, speaking in a new interview with CNN on geeks helping open government.
Earlier this winter, Pahlka (aka @pahlkadot) delivered a TED Talk, “Coding a Better Government,” that now has over 300,000 views at TED.com and another 40,000+ at YouTube:
That talk and her SXSWi keynote — which was nearly three times as long and perhaps that much better — aren’t just about Code for America or civic coding or the impact of the Internet on society. They’re about how we think about government and citizenship in the 21st century.
Jen’s voice is bringing the idea of civic coding as another kind of public service to an entire nation. If America’s developer community really wakes up to help, city and state government IT could get better, quickly, as a network effect catalyzed by Code for America takes off.
“To build resilient, peer-to-peer cities these precarious economic times demand, these conversations and collaborations need to be facilitated top-down, ground-up, and between every other decentralized community node that can contribute to weaving a diverse tapestry of a city’s political, cultural, historical, and socioeconomic data. …
To those of us who don’t think of ourselves as hackers but find ourselves applying that ethos to other trades—journalists, community organizers, field researchers, social justice activists, lawyers and policy wonks, and many more groups—let’s join the conversation, contribute our skills to the civic hacker community, and see what we can build together for our cities.”
If millions of non-coders collaborate with the geeks amongst us, learning from one another in the process, it could transform “hacking as a civic duty” from a geeky pursuit into something more existential and powerful:
21st century citizenship in which an ongoing digital relationship with government, services, smarter cities and fellow citizens is improved, negotiated and delivered through mobile devices, social media and open data.
If you’re following the intersection of citizens, technology and cities in the United States in 2012, the story of Chicago is already on your radar, as are the efforts of Code for America. This month, Code for America rolled out its brigades to start coding across America, including the Windy City.
These “brigades” are an effort to empower civic hackers to make apps and services that help their own communities. In Chicago, they’re calling themselves “IdeaHack.”
Below, I’ve embedded a story of their second meeting.
Data standards are the railway gauges of the 21st century. With more adoption of the ‘Green Button,’ are we about to see an explosion of innovation around energy data?
As with the Blue Button for healthcare data, the White House asserts that providing energy consumers with secure access to information about energy usage will increase innovation in the sector and empower citizens with more information.
“This is the kind of innovation that gets me excited,” wrote venture capitalist Fred Wilson earlier this year. “The Green Button is like OAuth for energy data. It is a simple standard that the utilities can implement on one side and web/mobile developers can implement on the other side. And the result is a ton of information sharing about energy consumption and in all likelihood energy savings that result from more informed consumers.”
The thinking here, as with Blue Button, which enables veterans (and soon all federal workers) to download their personal health data, is that broad adoption by utilities and engagement with industry will lead to new opportunities for software developers and civic entrepreneurs to serve a new market of millions of consumers who want better tools to analyze and manage their energy data.
To stimulate app creation, the U.S. Department of Energy announced an Apps for Energy challenge today. This effort is meant to “change the way you think about your utility bill data,” wrote data integration specialist Matthew Loveless at the DoE blog:
With the Energy Department’s new Apps for Energy competition, we’re challenging developers to use the Green Button data access program to bring residential and commercial utility data to life.
The Energy Department – in partnership with Pacific Gas and Electric Company, Itron, and Gridwise Alliance – is offering $100,000 in cash prizes to the software developers and designers that submit the best apps, as judged by a prestigious panel of judges selected from government, the energy industry, and the tech community.
Apps for ENERGY leverages Green Button, an initiative that gives access to energy usage data in a streamlined and easy-to-understand format (learn more about the Green Button open standard here). In addition to leveraging Green Button, app developers are encouraged to combine data from a variety of sources to present a complete picture of the customer’s energy usage.
The competition is all about creating tools and products that help consumers get the most out of their Green Button data – from apps that track personal energy savings goals to software that helps businesses optimize building energy usage. In addition, the 27 million households that will have access to Green Button data by the end of the year represent an untapped market that can serve as a catalyst for an active, energy focused developer community.
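For developers sizing up the challenge, Green Button files are Atom feeds that wrap energy-usage XML defined by the NAESB ESPI standard. Below is a minimal sketch that totals the interval readings in a downloaded Green Button file; the namespace and element names reflect my reading of the ESPI schema, so treat them as assumptions and check them against your utility's actual export.

```python
import xml.etree.ElementTree as ET

ESPI = "{http://naesb.org/espi}"  # assumed ESPI namespace per the spec

def total_usage(path):
    # Sum every IntervalReading value in the file. Values are commonly
    # watt-hours, but the ReadingType in your own data is authoritative.
    tree = ET.parse(path)
    total = 0
    for reading in tree.iter(ESPI + "IntervalReading"):
        value = reading.find(ESPI + "value")
        if value is not None:
            total += int(value.text)
    return total

if __name__ == "__main__":
    print(total_usage("greenbutton.xml"), "Wh (assuming watt-hour units)")
```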
Apps for Energy will join over one hundred other challenges on Challenge.gov next month.
Data from a new study of Twitter use by U.S. Senators and Representatives, conducted by public relations giant Edelman, strongly suggests that the Grand Old Party has opened up a grand old lead in its use of the popular microblogging platform in just about every metric.
On Twitter’s 6th birthday, there’s more political speech flowing through tweets than ever. Twitter data from the study, as provided by Simply Measured, showed that on Twitter, Republican lawmakers are mentioned more, reply more often, are retweeted more, share more links to rich content and webpages, and reference specific bills much more often. Republicans tweet about legislation 3.5 times more than Democrats.
There are also more Republicans on Twitter: while the 89 U.S. Senators who tweet are nearly evenly split, with one more Republican Senator tipping the balance, in the U.S. House there are 67 more Republican Representatives expressing themselves in 140 characters or less.
At this point, it’s worth noting that one of Twitter’s government leads in DC estimated earlier this year that only 15-20% of Congressional Twitter accounts are actually being updated by the Congressmen themselves, but the imbalance stands.
While the way governments use social media cannot be measured by one platform alone, nor by the activity upon it, the data in the study embedded below will be of interest to many, particularly as the window for Congress to pass meaningful legislation narrows and the full election season looms this summer.
In the context of social media and election 2012, how well a Representative or Senator is tweeting could be assessed by whether they can use Twitter to build awareness of political platforms, respond to opposing campaigns or, perhaps most importantly for the purposes of the election, reach potential voters, help get them registered, and bring them to the polls.
Outreach and transparency are both valuable to a healthy democracy, and to some extent, it is reassuring that Twitter use is motivated by both reasons. An interesting counter-factual situation would be if the Republicans were the majority party. We may therefore ask in that situation: Is the desire to reach out to (opposing) voters strongest for “losing” parties? Our study certainly hints that Republicans are not only motivated to use Twitter as a means to reach out to their own followers, but also to Democrats, as they are more likely to use Twitter in cases where their district was overwhelmingly in favor of President Barack Obama.
All in all, it would seem that Twitter is good for the whole Gov 2.0 idea. If Republicans are using Twitter as a means for outreach, then more bills may be passed (note: this has yet to be tested empirically, and still remains an open question for researchers). If Democrats are using Twitter as a means for transparency, then the public benefits from the stronger sense of accountability.
Yesterday, the Office of House Majority Leader Eric Cantor (R-VA) launched a new Facebook application, “Citizen Co-sponsor.” Rep. Cantor introduces it in the video below:
Since its introduction, I’ve been mulling over what to write about the new app. Here’s what I’ve read to date:
The app enables people to use Facebook to track the progress of House legislation as it makes its way through the chamber, but also provides the majority leader’s office with an interesting new grassroots marketing tool for the Republican party’s ideas.
The new app makes use of Facebook’s Open Graph protocol, which means that once installed, updates to legislation that a user has expressed support for can be automatically posted to their Facebook profiles. It also means that these updates show up in users’ timelines, newsfeeds and tickers, giving the legislation more exposure to users’ networks of friends.
For now, the list of legislation that citizens can choose to support is controlled, of course, by Cantor’s office and is listed on a section of his web site. Citizens can click to “co-sponsor” legislation that they support, and see all the other citizen co-sponsors who’ve expressed their support. Each widget for each piece of legislation also shows a visual storyline of that legislation’s progress through the House.
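Under the hood, an Open Graph app like this publishes "actions" on "objects" through Facebook's Graph API. The sketch below shows the general pattern using Python's requests library; the app's real namespace, action, and object names aren't public, so "citizencosponsor," "cosponsor" and "bill" are hypothetical stand-ins, not the app's actual identifiers.

```python
import requests

# Hypothetical Open Graph names: replace with a real app namespace,
# action type, and object URL registered with Facebook.
ACCESS_TOKEN = "USER_ACCESS_TOKEN"  # obtained via Facebook Login
ACTION_URL = "https://graph.facebook.com/me/citizencosponsor:cosponsor"

resp = requests.post(ACTION_URL, data={
    # The object is a web page marked up with Open Graph meta tags.
    "bill": "https://example.com/bills/hr9",
    "access_token": ACCESS_TOKEN,
})
print(resp.json())  # on success, the Graph API returns an action id
```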
Second, a post by Alex Fitzpatrick at Mashable on the Facebook citizen cosponsor app, in which he interviewed Matt Lira, the director of digital for the House Majority Leader.
“We have a startup mentality to it,” says Lira. “When Twitter first started, it was just going to be for cell phones, now it is what it is today. It’s evolutionary, so you want to see how users use it and if the engagement justifies it, we’ll expand it out.”
The new media team at Cantor’s office is drawing inspiration from both sides of the aisle. Lira says he’s a fan of Rep. Issa’s (R-Calif.) Madison Project as well as the White House’s “We the People” online petitions. He talked about online bill markups, hearings and expert roundtables as possibilities for ways to expand the Citizen Cosponsor in the future.
“We want the program to give more to users than it asks of them,” says Lira. “The only way this stuff works is if you have a tolerance for experimentation and a certain level of patience. I’ve been impressed with We the People and that’s very experimental — it’s in the spirit of ‘let’s throw something out there and see if it works.’ Otherwise, there’s the alternative: a conference room of ideas that never happen.”
Over at the Huffington Post, POPVOX founder Marci Harris published a long post with substantive concerns about the Citizen Co-sponsor app. (Disclosure: Tim O’Reilly was an early angel investor in POPVOX.) Harris wanted to know more about who the sponsors of the app are (it’s funded by the Office of the Majority Leader), whether feedback will go to a citizen’s Member of Congress, whether “updates” will be neutral or partisan, who will have access to the list of constituents the app generates, why citizens can express only support for a bill and not opposition, and what the privacy policy is.
In late 2007 when I, as a staffer, shopped an idea around within Congress to create a public platform for constituent engagement, I discovered that it was nearly impossible to build something like that within the institution of Congress outside of the partisan caucus system. You could either build a Democratic-sponsored tool or a Republican-sponsored tool, but there was no structure for building a nonpartisan CONGRESSIONAL tool (and don’t even get me started on how impossible integration between House and Senate was/is.)* My experience does not mean that nonpartisan strides are impossible — just challenging, and that any effort should be viewed with a critical eye.
…why not use the publicly available data on all pending legislation and allow citizens to “co-sponsor” any bill currently being weighed by the legislature?
No matter how we feel about Facebook’s privacy provisions, we’ll be the first to admit that it is the default way to connect with people these days. We’re not pooh-poohing any initiative that harnesses social media to make it easier for people to get involved in the political process, and we’re not bashing this from a partisan point of view. We’re bashing it from a point of view that cares about transparency.
Cantor’s ploy reeks of partisanship disguised as bipartisanship (nowhere on the main page of the site are the words “Democrat” or “Republican” used). And while the Cosponsor Project may be more participatory, it’s certainly not the “open, visible” platform he promises in his introduction.
That all adds up to a strong critique. As the app stands, however, it’s an important first step toward integrating Facebook’s social graph with the legislative process.
That said, there are some flaws, from an unclear Terms of Service to permissive data usage to a quite limited selection of bills that citizens can follow or support.
In addition, as a commenter on Mashable notes, “Unless there’s a way to show how many people are *against* proposed bills, this will not provide a clear picture as to the support they actually have. You might have a significant number of citizen cosponsors (say 25k), but that number loses its significance if the number of people against is, say 125k. You need both measures in order to get an idea as to whether or not a proposed bill is truly supported.”
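On the “publicly available data” point raised in the critique above, pending-legislation data is indeed open. As an illustrative sketch (not anything the Citizen Co-sponsor app itself uses), here is how one might pull recent bills from GovTrack’s API; the endpoint and field names are assumptions drawn from GovTrack’s public documentation, so verify them before building on this.

```python
import requests

# GovTrack republishes congressional bill data; the endpoint and field
# names below are assumptions from its public API docs.
resp = requests.get(
    "https://www.govtrack.us/api/v2/bill",
    params={"congress": 112, "order_by": "-current_status_date", "limit": 5},
)
for bill in resp.json().get("objects", []):
    print(bill.get("display_number"), "-", bill.get("title"))
```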
I’ve asked Lira a number of followup questions and will file something for Radar if he responds. In the meantime, what do you think of the app and the initiative? Please let us know in the comments, keeping the following perspective from Harris in mind:
As with any startup, the first iteration is never perfect. Reid Hoffman, the founder of LinkedIn, famously said, “if you are not embarrassed by your first release, you’ve launched too late.” In that sense, maybe the Majority Leader is learning from the startup world. In an email response to my questions, Matt Lira, Director of New Media for Majority Leader Cantor, seemed to indicate that there were iterations to come: “As was the case when I publicly defended We the People, this is an evolutionary step – there will be continual progress, as with all these things, towards the desired end of a modernized Congress.”
Update: “We’ve always characterized both MADISON and Citizen CoSponsors as digital experiments that we are both admittedly excited about and that I personally believe have great potential to grow,” responded Matt Lira, director of digital for the House Majority Leader’s office, via email.
“These are the type of projects that will modernize our country’s legislative institutions for the social media age,” he wrote. “We are trying really new things like MADISON and Citizens. We are successfully driving institutional reforms on a structural basis. We are the same people who created docs.House.gov, require a public posting period for legislation, and established a machine-readable document standard. In short, people who have done more to open the House of Representatives than anyone in history.”
With respect to “e-partisanship,” Lira noted that “from the moment it launched, the app included a bill sponsored by a Democratic Representative. Some of the other bills – like the JOBS Act – have widespread support on both sides. I launched with six bills, because I wanted to see how the app works in the field, before making any choices about its wider deployment, should that even be justified.”
This post has been updated to include a disclosure about Tim O’Reilly’s early investment in POPVOX.