Facebook’s new opaque political ads transparency site shows self-regulation isn’t enough

[Screenshot of a tweet by Alex Howard: “So… Facebook has a new political ad transparency site. You can’t get to it unless you’re logged into Facebook.”]

This past week, Facebook launched a new political ad transparency website. Facebook believes that “shining a light on ads” will increase transparency, which in turn “will lead to increased accountability & responsibility over time – not just for Facebook but advertisers as well.”

I think they’re right — which should be no surprise given my focus on advocating for more political transparency in Washington over the two years I spent at the Sunlight Foundation — but reviewing reports of unlabeled political ads is going to be hard.

Overall, this site is a welcome step towards more transparency, but it misses the mark. The site only “exceeds expectations” if you think a search interface that exposes no underlying data is sufficient to inform the public and regulators.

In my initial assessment, I concur with journalists who found that Facebook’s new political ad system is missing a lot, as ProPublica reported. (Please install ProPublica’s political ad collector so they can inform the public about how well Facebook’s tool actually works.)


On the one hand, it was easy to use Facebook’s new archive of “ads with political content” – essentially a simple search tool for paid political ads that have run since May 7, 2018 – at least once I got on my laptop and logged into Facebook. I found recent ads that matched Trump, Clinton, gun control and corruption.

If you click on “see ad performance,” you can learn more about each ad.

If you click on the username, you arrive at the Page behind the ads. Unfortunately, there’s no tab for political ads or link to this archive. It’s hard to see how folks will find the ads without it.

As I noted on Twitter, however, there’s one more critical wrinkle: you can’t get to the page unless you’re logged into Facebook!

This would be hilariously ironic, if it weren’t for the context of Russian interference and how Facebook handled it. Self-regulation is not enough.

As sociology professor Zeynep Tufekci noted, no one — whether a member of the public, the press, a watchdog, an academic, a regulator or a legislator — should have to agree to Facebook’s Terms of Service and become a user to access political data.

To Facebook’s credit, the director of product at Facebook, Rob Leathern, responded publicly to Tufekci on Twitter, stating that this page is a first step:

“More ways are coming to make the ads with political content and information more accessible to people. One of those is an API, another is exploring opening the archive to people not on Facebook. We started with the Facebook community to see how they use the tool and gain feedback from third parties, including our newly-formed Election Commission. We’ll continue to update on our progress.”

If Facebook had started with open data and no log-in, it could have gotten feedback from third parties like the Center for Responsive Politics or from the public. No one should have to be part of Facebook’s “community” to understand who is buying electioneering on the platform, for whom, and what’s being shown.

As I commented to Leathern, if Facebook is only “exploring” making this archive open to people not on Facebook, then it is not implementing the Honest Ads Act, as its staff has claimed to Congress and the public. I asked Facebook to post a public ad file as bulk open data on the open Web.

Leathern told me that “we have prioritized getting the archive in the hands of people to use (as of today) + will follow up soon with an archive API. Thank you for the feedback, we are definitely listening.”

That’s good news, but not good enough.

Real transparency at Facebook will look like a public file of all paid political ads that are disclosed on a public website with bulk open data downloads and an API, none of which require the public to log into the site.

The good news is that I think Facebook understands this page as a start, not an end. In a post that closely matches what he told me, Leathern wrote that they’re “working closely” with a new “Election Commission” to launch an API for the archives.

It’s good news, but no deadline was cited.

It’s hard for me not to be happy that Facebook is finally explicitly embracing political ad transparency in words and (some) deeds, including public soul searching about what constitutes a political ad and what its policy should be.

That’s progress.

It’s just long overdue. Ultimately, elected representatives should be the ones to enact standards for transparency for political ads online after debate, not tech company executives.

Until Congress and other legislatures around the world empower regulators like the Federal Election Commission by updating electioneering rules and enacting standards for disclaimers and disclosures, however, I’m glad to see positive actions.

I hope Facebook, its founder and its staff deliver on their most recent promises and public obligations. Given past, current or predictable interference, opacity is unpatriotic.

RankAndFiled.com is like the SEC’s EDGAR database, but for humans

A new website, Rank and Filed, gathers data from the Securities and Exchange Commission’s EDGAR database, indexes it, and publishes it online in open formats that investors can use to research and discover companies. I’ve included a screenshot of Tesla’s SEC filings below.

[Screenshot: Tesla’s SEC filings as displayed on Rank and Filed]

The site currently has over 25 million files indexed.

I heard about the new website directly from its creator, Maris Jensen, a former SEC analyst who built the site independently. According to Maris, she proposed the project internally in March 2013 but was immediately turned down.

A month later, after she was terminated for threatening the Commission’s mission with a “lack of respect for senior management” — an issue she maintains was unrelated to the proposal — Maris decided to make the idea real on her own and started building. She has since offered to give the site and its code to the SEC but has not yet heard back.

Our interview, lightly edited for content and clarity, follows.


Where did the idea for this originate?

The breaking point was realizing that the guy in the cubicle across from me had spent a week writing the same parser as me — a Python program to parse the EDGAR FTP index for specific filings. This is nearly two decades after Carl Malamud set everything up; the FTP index is exactly as he left it. We were in the division responsible for the SEC’s data analytics and interactive data initiatives. The division literally rewrites this program each time they need SEC filings data. There’s no version control. There’s just no excuse!  Hilariously, that guy also left the SEC and built an SEC filings website, though his is for-profit: http://legalai.com/
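
To make that concrete, here is a minimal sketch of the kind of throwaway parser she is describing (my illustration, not Rank and Filed’s code). It assumes the current HTTPS mirror of the old FTP layout under sec.gov/Archives/edgar/full-index/ and the pipe-delimited master.idx format:

```python
# Minimal sketch: fetch one quarter's EDGAR master index and filter it
# for a given form type. The URL and pipe-delimited layout are assumptions
# about the HTTPS successor to the old FTP index.
import urllib.request

INDEX_URL = "https://www.sec.gov/Archives/edgar/full-index/2014/QTR1/master.idx"

def filings_of_type(form_type="10-K"):
    req = urllib.request.Request(
        INDEX_URL, headers={"User-Agent": "example research script"})
    with urllib.request.urlopen(req) as resp:
        lines = resp.read().decode("latin-1").splitlines()

    results = []
    for line in lines:
        parts = line.split("|")
        # Data rows look like: CIK|Company Name|Form Type|Date Filed|Filename
        if len(parts) == 5 and parts[2] == form_type:
            cik, company, form, date_filed, path = parts
            results.append((date_filed, company.strip(),
                            "https://www.sec.gov/Archives/" + path))
    return results

if __name__ == "__main__":
    for date_filed, company, url in filings_of_type()[:5]:
        print(date_filed, company, url)
```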

What does this do that the SEC needed?

In 2008, the SEC set up a task force (the ‘21st Century Disclosure Initiative’) to rethink the way they were making data available to the public. A year later, they published this report, with their conclusion and proposal for a new, modernized disclosure system. I basically just tried to build the system they described. I also did lots of googling — ‘SEC EDGAR tool terrible’, ‘how to find SEC data’, etc — and then tried to address the problems people were having.

The problems have been the same for decades. In 1994, people wanted an SEC CIK-to-ticker mapping. 20 years later, this question still pops up on forums monthly.

There are over 600 different forms on EDGAR but the SEC’s form lists are basically no help at all. I went through and googled each form individually. I tried to group them into understandable categories.

The comment at the bottom of this post describes the SEC’s current problem better than I ever could:

Has anyone out there ever tried to use SEC.GOV to search for information about a company? The problem is very easy to articulate. If you search for something, you get 5000 results. At about 10 results per page, you have 500 pages to sift through to find what you want. Once you find what you want, there is ZERO ability to navigate from what you found into related documents!

What if you want to research a particular company’s board of directors? What other companies is each director associated with? Have there been any problems in any of those companies? You can’t investigate these types of things using the technology sec.gov has fielded. You want a needle. The SEC gives you a haystack.

Why not allow for better discovery of all of the SEC data and let investors perform their own investigations of markets & companies?

So instead of focusing on this obvious improvement to the public service the SEC provides, the emphasis apparently is on improving investigative actions. Great. Why not just shut off the sec.gov website completely and let the SEC do all of the investigating and researching of SEC data?

How does RankAndFiled.com compare to other sources of SEC data online?

I unfortunately haven’t added that much ‘value’ yet. I’m a total amateur. I’m just trying to make the data available and understandable! The website doesn’t do any analysis: it just collects, links and presents data from different SEC filings.

Looks like you got some great help from the folks you thanked. Did you build this all yourself with these tools?

Yes, open source tools these days are amazing!!  I started this project with no web or software development experience at all.

I actually feel really lucky to have fallen into all of this. Everything I know I learned on google, mostly through tutorials written by the developers listed there.

I also didn’t know anyone in the dataviz or open source community, so I reached out to some of them with stuff like etiquette questions. Their response and support was just incredible — especially the D3 community, they’re just wonderful.

Can you tell me more about where the data on this site comes from and what you’ve done to it?

Basically, the system watches the SEC’s RSS feeds. It reads and indexes data from SEC filings as they come in. Not all the filings show up on the feeds — I’m not sure why — so it also scans the FTP index for any missed filings.
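
For a sense of what that watching might look like in practice, here is a rough sketch of such a polling loop (my illustration, not her code). The “latest filings” Atom feed URL and its fields are assumptions about EDGAR’s public feed:

```python
# Rough sketch: poll EDGAR's "latest filings" Atom feed and report entries
# we haven't seen before. Feed URL and structure are assumptions.
import time
import urllib.request
import xml.etree.ElementTree as ET

ATOM = "{http://www.w3.org/2005/Atom}"
FEED_URL = ("https://www.sec.gov/cgi-bin/browse-edgar"
            "?action=getcurrent&output=atom&count=100")

seen = set()

def poll_once():
    req = urllib.request.Request(
        FEED_URL, headers={"User-Agent": "example research script"})
    with urllib.request.urlopen(req) as resp:
        root = ET.fromstring(resp.read())
    new_filings = []
    for entry in root.findall(ATOM + "entry"):
        title = entry.find(ATOM + "title").text
        link = entry.find(ATOM + "link").attrib["href"]
        if link not in seen:
            seen.add(link)
            new_filings.append((title, link))
    return new_filings

if __name__ == "__main__":
    while True:
        for title, link in poll_once():
            print("new filing:", title, link)
        time.sleep(600)  # keep the request rate modest
```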

About 25 million SEC documents have been parsed and incorporated so far, which is everything that’s publicly available on EDGAR.  So companies and people are tracked and connected over time — who’s raising money where, who owns whom, who moved companies or got promoted, who sold a ton of shares.  I also realign all the financial data from quarterly and annual reports so you can see a company’s financial history and so the data is comparable between companies.

It actually feels silly even talking about it, because it’s just so basic. This is stuff the SEC should have been doing years and years and years ago.

But it’s not a perfect science, for two reasons: one, only a few SEC forms are machine-readable, and two, the SEC doesn’t even try to standardize names. SEC registrants are given distinct identifiers, but anything goes when companies or names are listed inside a filing. Middle names, middle initials, nicknames, suffixes, titles…
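
That lack of standardization forces exactly the kind of crude heuristics you would expect. A toy illustration of the problem (my own, not Rank and Filed’s code):

```python
# Illustrative only: collapse the many ways a filer's name can be written
# so that "Smith, John Q., Jr." and "JOHN SMITH JR" land on the same key.
import re

SUFFIXES = {"jr", "sr", "ii", "iii", "iv", "md", "phd", "esq"}

def normalize_name(raw: str) -> str:
    name = raw.lower().replace(".", " ").replace(",", " ")
    tokens = [t for t in re.split(r"\s+", name) if t and t not in SUFFIXES]
    # Drop single-letter middle initials when there are more than two tokens
    tokens = [t for t in tokens if len(t) > 1 or len(tokens) <= 2]
    return " ".join(sorted(tokens))

assert normalize_name("Smith, John Q., Jr.") == normalize_name("JOHN SMITH JR")
```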

What’s next?

I spent November and December trying to give all my code to the SEC. I received no response, not even a polite no. That’s still the goal — I want them to take over and open source it, or at the very least host the underlying API.  It’s their job to make this data available and accessible. They NEED a team over there doing hands-on work with SEC filings, a team struggling to make sense of this data with just the tools available to retail investors, especially now that they’re talking about disclosure reform.  Right now, they have almost no incentive to change things over to structured data — they buy all the structured EDGAR data they need.

The SEC keeps saying that it’s the private sector’s job to build tools like this, not theirs, but in the past 20 years nobody has come up with a really great, really affordable option.  It doesn’t make sense for any of us to even try — I’ve heard that Bloomberg and Thomson Reuters hire legions of Indian professionals to go through each SEC filing by hand.  We just can’t compete.

The SEC will have to make a lot more of their data machine-readable before any ‘disruptive’ innovation can happen, but they won’t do that until they’re forced to (by Congress), unless they have people there who realize how unfair the situation has become.

There are actually a heartbreaking number of SEC employees who also want this to happen, self-described worker bees who’ve reached out to me from personal email to say they’ve been trying to convince their bosses to give this thing a chance.  So far, no luck! I would open source it myself, but unfortunately I can’t afford to host the project indefinitely.