On data journalism, accountability and society in the Second Machine Age

On Monday, I delivered a short talk on data journalism, networked transparency, algorithmic transparency and the public interest at the Data & Society Research Institute’s workshop on the social, cultural & ethical dimensions of “big data”. The forum was convened by the institute and hosted at New York University’s Information Law Institute with the White House Office of Science and Technology Policy, as part of an ongoing review of big data and privacy ordered by President Barack Obama.

Video of the talk is below, along with the slides I used. You can view all of the videos from the workshop, along with the public plenary on Monday evening, on YouTube or at the workshop page.

Here’s the presentation, with embedded hyperlinks to the organizations, projects and examples discussed:

For more on the “Second Machine Age” referenced in the title, read the new book by Erik Brynjolfsson and Andrew McAfee.

Berkman Center maps networked public sphere’s role in SOPA/PIPA debate

A new paper from Yochai Benkler and co-authors at the Berkman Center maps how the networked public sphere led to the Stop Online Piracy Act and Protect IP Act being defeated in the U.S. Congress.

Abstract: “This paper uses a new set of online research tools to develop a detailed study of the public debate over proposed legislation in the United States designed to give prosecutors and copyright holders new tools to pursue suspected online copyright violations.”

Key insight: “We find that the fourth estate function was fulfilled by a network of small-scale commercial tech media, standing non-media NGOs, and individuals, whose work was then amplified by traditional media. Mobilization was effective, and involved substantial experimentation and rapid development. We observe the rise to public awareness of an agenda originating in the networked public sphere and its framing in the teeth of substantial sums of money spent to shape the mass media narrative in favor of the legislation. Moreover, we witness what we call an attention backbone, in which more trafficked sites amplify less-visible individual voices on specific subjects. Some aspects of the events suggest that they may be particularly susceptible to these kinds of democratic features, and may not be generalizable. Nonetheless, the data suggest that, at least in this case, the networked public sphere enabled a dynamic public discourse that involved both individual and organizational participants and offered substantive discussion of complex issues contributing to affirmative political action.”

One data set, however, was missing from the paper: the role of social media, in particular Twitter, in reporting, amplifying and discussing the bills. The microblogging platform connected many information nodes mapped out by Berkman, from hearings to activism, and notably did not shut down when much of the Internet “blacked out” in protest.

The paper extends Benkler’s comments on a networked public commons from last year.

As I wrote then, we’re in unexplored territory. We may have seen the dawn of a new era of networked activism and participatory democracy, borne upon the tidal wave of hundreds of millions of citizens connected by mobile technology, social media platforms and open data.

As I also observed, all too presciently, that era will also include pervasive electronic surveillance, whether you’re online or offline, with commensurate threats to privacy, security, human rights and civil liberties, and the use of these technologies by autocratic governments to suppress dissent or track down dissidents.

Finding a way forward will not be easy, but it’s clearly necessary.

12 lessons about social media, politics and networked journalism

In 2011, I was a visiting faculty member at the Poynter Institute, where I talked with a workshop full of journalists about working within a networked environment for news. As I put together my talk, I distilled the lessons I’d learned from my experiences covering tech and the open government initiative that would affect the success of any audience relationship and posted them onto Google+. Following is an adapted and updated version of those insights. The Prezi from the presentation is online here.

1) We have to change our idea of “audience.”

People are no longer relegated to being the passive recipients of journalists’ work. They are often creators of content and have become important nodes for information themselves, sometimes becoming even more influential within their topical or regional communities than journalists are. That means we have to treat them differently. Yes, people are reading, watching or listening to the work of journalists, but they’re much more than an “audience.”

In the 21st century, the intersection of government, politics and media is increasingly a participatory, reciprocal and hypersocial experience due to the explosion in adoption of connected smartphones that turn citizens into publishers, broadcasters and human sensors – or censors, depending upon the context. As of 2013, more than half of American adults have a smartphone. The role of editors online now includes identifying and debunking misinformation and sifting truth from fiction, frequently in real time. The best “social media editors” are creating works of journalism from a raw draft of history contributed by the reports of the many.

2) Good conversations involve talking and listening.

Communicating effectively in networked environments increasingly involves asking good questions that elicit quality responses — the more specific the question, the better the chance for a quality response. The Obama administration’s open government initiative’s initial use of the Internet in 2009, at Change.gov, did not ask highly structured questions, which led to a less effective public consultation.

3) The success of any conversation depends upon how well we listen.

Organizations that invite comments and then don’t respond to audience comments or questions send a clear message — “we’re not listening.” There are now many ways to listen and a proliferation of channels, going far beyond calls and email.

Comments have become distributed across the Internet and social Web. People are not just responding on a given article or post: they’re on Twitter, Facebook and potentially other outposts. Find where people are talking about your beat, organization or region: that’s your community. Some organizations are using metrics to determine not only how often sentiments are expressed but the strength of that conviction and the expertise behind it.

4) No matter how good the conversation, its hosts must close the loop.

When the host of a conversation, be it someone from government, school, business or media, asks someone’s opinion, but doesn’t acknowledge it, much less act upon it, the audience loses trust.

If we seek audience expertise but don’t subsequently let it inform our work, the audience loses trust. Increasingly, to gain and hold that trust you must demonstrate the evidence behind your assertions by citation, with research tied via footnotes or hyperlinks, source code or supporting data.

It’s better not to ask than to ask and not act upon the answer. It’s similarly better not to engage in social media at all than to perpetuate the same old one-way communication streams with legacy broadcast behaviors. There are also new risks posed by the combination of ubiquitous connected mobile devices and the global reach of social media networks. To paraphrase Mark Twain, it is better to be thought a fool than to tweet and prove it.

5) You must know who your audience is and where, why, when and how they’re searching for information to engage them effectively.

TechTarget, one of my former employers, successfully segmented its traditional IT audience into niches that cared passionately about specific technologies and/or issues. The company then developed integrated media products around highly specific topical areas, a successful business model, albeit one that has specialized applicability to the news business. Politico’s approach, which now includes live online video, paid subscriber content for “pros,” policy segmentation, email newsletters and events, is the most apt comparison in the political space, although there are many other trade publications that cater to niche audiences.

Here’s the key for both specific audiences: IT buyers have decision-making ability over thousands, if not millions, of dollars in budgets. Policy makers in DC have similar authority over appropriations, legislation or regulation.

Most general readers have neither budget authority nor policy clout and therefore will not sustain an effective business model. If you can create content that is of interest to people with buying power, then sponsors and advertisers will bite. The model, in other words, is not a panacea.

6) Your audience should be able to find and hear from YOU.

It matters whether the person whose name is on a social media account actually engages in it. For instance, President Obama doesn’t directly use social media, with a few notable exceptions. His White House and campaign staff do, at @WhiteHouse and @BarackObama. Some GOP candidates and incumbents actually maintain their own accounts. If you take away the president, the GOP is ahead on social media in both houses of Congress. They have attracted huge followings.

Why does a personal account to complement the masthead matter? It stays with the reporter or editor from job to job. While many networks or papers have adopted naming conventions that immediately identify a journalist’s affiliation (@NameCBS or @NYT_Name) that practice does create a gray area in terms of who “owns” the account. @OctaviaNasrCNN was able to drop the CNN and keep her account. @CAmanpour was able to transfer from CNN to ABC. Even within networks, there is a lack of standardization: Compare @DavidGregory or @JakeTapper to @BetsyMTP.

7) People respond differently to personal accounts than mastheads.

Andy Carvin taught me about this dynamic years ago, and I’ve since seen it borne out in practice. He compared the results he’d get from asking questions on his personal account (@acarvin) to those from a primary NPR account (@NPRNews) and found that people responded to him more. They followed and viewed the news account more as a feed for information. The White House deputy CTO for open government explored the same question alongside the @OpenGov account by creating her own account, @BethNoveck, and found similar results. Incidentally, she was then able to keep that account after she left public service.

8) Better engagement with the audience requires the media to change established traditions and behaviors.

How many reporters still do not RT their competition’s stories, whether they beat them to a story or not? The best bloggers tend to be immense linkers and sharers. This is much like the decades-old question of whether a given newsroom’s website links to stories done by competitors or not. This behavior now has increasing consequences for algorithmic authority in both search engines (SEO) and social networks (SMO). If we aspire to hosting the conversation around an issue, do we now have a responsibility to point our audience at all the perspectives, data, sources and analysis that would contribute to an understanding of that issue? What happens if competitors or new media enterprises, like the Huffington Post, create an expectation for that behavior?

A good aspirational goal is to be a hub for a given beat, which means linking, RT’ing or sharing relevant information in a source-agnostic manner, whether the beat is a given campaign, statehouse, policy area or geography.

9) Data-driven campaigns create more of a need for data-driven journalism.

Social media is important. In Election 2012, social, location, mobile and campaign data — and how we use it — proved to be an equally important factor. Nate Silver pulled immense audiences to his 538 blog at the New York Times. Online spreadsheets, visualizations, predictive models, sentiment analysis, and mobile and/or Web apps are all part of the new ‘data journalism’ lexicon, as well as an emerging ‘newsroom stack.’

Why? President Obama’s reelection campaign invested heavily in data collection, science and analysis for 2012. Others will follow in the years ahead. Republicans are investing in data but appear to be behind in terms of their capacity for data science. This may change in future cycles.

Government social media use continues to grow. More than 75% of Congress is using social media now. Freshman members of the House start their terms in office with a standard palette of platforms: Drupal for their website, and Twitter, Facebook, Flickr and YouTube for constituent communications. By mid-2010, 22 of 24 Federal agencies were on Facebook. This trend will only continue at the state and local level.

10) What are governments learning from their attempts?

They’re behind but learning. From applying broadcast models to adopting new platforms, tools for listening, archiving, campaigning vs governing, personal use versus staffers, linking or sharing behaviors, targeted consultations, constituent identity, privacy and security policies, states and cities are moving forward into the 21st century. Slowly.

11) Know your platforms, their utility, demographics and conventions.

Facebook is gigantic. You cannot ignore it if you’re looking for the places people congregate online. That said, if you’re covering politics and breaking news, Twitter remains the new wire for news. It’s still the backchannel for events. It’s not an ideal place to host conversations because of issues with threaded conversations, although third party tools and conventions have evolved that make regular discussions around #hashtags possible. Google+ is much better for hosting hangouts and discussions, as are modern blog comment platforms like Disqus. Facebook fits somewhere in between the two for conversation: you can’t upvote comments and it requires readers to have a Facebook account – but the audience is obviously immense.

12) Keep an eye out for what’s next and who’s there.

Journalists should be thinking about Google+ in terms of both their own ‘findability’ and that of their stories in search results. The same is true for Facebook and Bing integration. Watch stats from LinkedIn as a source or forum for social news. Reddit has evolved into a powerful platform for media and public figures to host conversations. StumbleUpon can send a lot of traffic to you.

The odds are good that there are influential blogs with many readers that are covering your beat. Know the most important ones and their writers, link to them, RT their work and comment upon them. More services will evolve, like communities around open data, regional hubs for communities themselves, games and hybrids of location-based networks. Have fun exploring them!