Internet Caucus to host forum in DC on open data with Zillow CEO


Tomorrow, the Internet Caucus is hosting a forum on open data in the United States Congress that will feature a conversation between Zillow CEO Spencer Rascoff and yours truly.

Open government data powers Zillow’s ability to give consumers more insight into the real estate market. They are a clear winner in the open data economy, an early beneficiary of federal government releases of data that could one day add trillions of dollars in economic value, better services, resilience against climate change, accountability, and social justice. Tomorrow, we’ll talk about the potential and challenges of opening up data about housing and making the real estate market more transparent.

If you have questions about Zillow, open data, startups, real estate or other counts, please let me know. 

[Image Credit: Zillow]

On data journalism, accountability and society in the Second Machine Age

On Monday, I delivered a short talk on data journalism, networked transparency, algorithmic transparency and the public interest at the Data & Society Research Institute’s workshop on the social, cultural & ethical dimensions of “big data”. The forum was hosted at New York University’s Information Law Institute, in partnership with the White House Office of Science and Technology Policy, as part of an ongoing review of big data and privacy ordered by President Barack Obama.

Video of the talk is below, along with the slides I used. You can view all of the videos from the workshop, along with the public plenary on Monday evening, on YouTube or at the workshop page.

Here’s the presentation, with embedded hyperlinks to the organizations, projects and examples discussed:

For more on the “Second Machine Age” referenced in the title, read the new book by Erik Brynjolfsson and Andrew McAfee.

Representative Quigley introduces updated Transparency in Government Act (TGA)

Earlier today, Congressman Mike Quigley (D-IL) introduced a comprehensive open government transparency bill on the floor of the United States House of Representatives. The aptly titled “Transparency in Government Act” (PDF) (summary) coincides with Sunshine Week, the annual effort to stimulate a national dialogue about open government and freedom of information.

“The public’s trust in government has reached historic lows, causing many Americans to simply give up on Washington,” said Representative Quigley. “But the mission of government matters, and we can’t lead in the face of this deficit of trust. The Transparency in Government Act shines a light on every branch of the federal government, strengthening our democracy and promoting an efficient, effective and open government.”

As it has in its previous two iterations, the transparency bill has received strong support from most of the major government watchdog and transparency groups in Washington, including Citizens for Responsibility and Ethics in Washington (CREW), the Sunlight Foundation, Data Transparency Coalition, the Center for Responsive Politics, the Center for Effective Government, the Project on Government Oversight (POGO) and the Electronic Privacy Information Center (EPIC).

As Matt Rumsey noted at the Sunlight Foundation blog, this iteration of TGA is the third version to be introduced since 2010:

As we noted at the time, the original bill was inspired in part by model transparency legislation put together on PublicMarkup.org, a project of the Sunlight Foundation.

The 2014 version of the TGA includes a number of Sunlight Foundation priorities including, but not limited to, enhanced access to the work of congressional committees and Congressional Research Service reports, improvements to the current lobbying disclosure regime as well as increased transparency in federal contracting, grants and loans.

The prospects for TGA to pass through the entire House don’t appear to be much better than the prior two versions. That said, as CREW policy director Daniel Schuman wrote today, the bill is a deep reservoir of transparency ideas that Congress can draw upon to amend other legislation or introduce as stand-alone bills:

  • Greater congressional accountability through improved disclosure of foreign travel reports, gift reports, how members of Congress spend their official budgets, and greater disclosure of personal financial information.
  • Greater congressional transparency through improved access to the work of committees (including meeting schedules and transcripts) and greater contextualization of floor votes.
  • Empowering public understanding of congressional work through public access to Congressional Research Service reports.
  • Better tracking of lobbying by broadening the definition of lobbyist, improving the tracking of lobbying activity (in part through the use of unique entity identifiers), and more frequent disclosures by lobbyists of political contributions; improved access to information on lobbying on behalf of foreign entities; and public access to statements by grantees and contractors certifying that they have not used money awarded by the federal government to lobby (the SF-LLLs).
  • Enhancing transparency for contracts, grants, and loans through improved data quality, better disclosure (including electronic) and improved compliance.
  • Making the executive branch more transparent by requiring online access to White House and executive branch agency visitor logs, providing centralized access to agency budget justifications, and allowing the public to see how the Office of Management and Budget OIRA changes draft agency regulations.
  • Improving transparency of non-profit organizations by requiring non-profit tax forms (990s) to be available online in a central location (replacing the current ad hoc disclosure system).
  • Improving the Freedom of Information Act by publishing completed requests online in a searchable database and requiring notice of efforts to carve out exemptions to FOIA. (Our recommendations go even further.)
  • Opening up federal courts by requiring live audio of Supreme Court hearings, publishing federal judicial financial disclosures online, requiring a Government Accountability Office study on the impact of live video-streaming Supreme Court proceedings, and requiring a GAO audit of PACER.
  • Require annual openness audits by GAO that look at whether data made available by the government meets the eight open data principles.

In aggregate, this is a bright beam of sunshine from Congress that everyone should stand behind, from citizens to legislators to advocates. The Project on Government Oversight is strongly supportive of its provisions, writing that “there is a lot to like in this bill, including more transparency for Congress, lobbying, the executive branch, and federal spending on contractors and grantees.”

Taken one by one, the individual provisions in the bill are well worth considering, from bringing the Supreme Court into the 21st century to FOIA reform.

If Representative Quigley’s bill can attract the attention of Congressional leaders and legislators across the aisle who have professed support for open government and transparency, maybe some more of these provisions will move forward to enter the Senate, though that body has shown little appetite for moving legislation forward in the 113th Congress to date.

Federal government agencies receive .91 GPA in FOIA compliance from Center for Effective Government

Today, the Center for Effective Government released a scorecard for access to information from the 15 United States federal government agencies that received the most Freedom of Information Act (FOIA) requests, focusing upon an analysis of their performance in 2013.

The results of the report (PDF) for the agencies weren’t pretty: if you computed a grade point average from this open government report card (and I did), the federal government would receive a D for its performance. Seven agencies outright failed, with the State Department receiving the worst grade (37%).
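For readers who want to repeat that arithmetic, here is a minimal sketch of a grade point average calculation. The letter-to-point mapping is an assumed standard 4.0 scale, and the grades in the example are made up for illustration; they are not the report's actual grades.

```python
from statistics import mean

# Assumed 4.0-scale mapping; the scale used for the scorecard may differ
# (for example, it may include plus and minus grades).
GRADE_POINTS = {"A": 4.0, "B": 3.0, "C": 2.0, "D": 1.0, "F": 0.0}


def grade_point_average(grades):
    """Average the point values of a list of letter grades."""
    return mean(GRADE_POINTS[grade] for grade in grades)


# Hypothetical grades for five made-up agencies, NOT the report's data.
hypothetical_grades = ["B", "C", "D", "F", "F"]
gpa = grade_point_average(hypothetical_grades)
print(f"GPA: {gpa:.2f}")  # prints "GPA: 1.20"
```

Swapping in the fifteen published letter grades would reproduce the calculation described above, provided the same point scale is used.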

The grades were based upon:

  1. How well agencies processed FOIA requests, including the rate of disclosure, fullness of information provided, and timeliness of the response
  2. How well the agencies established rules of information access, including the effectiveness of agency policies on withholding information and communications with requestors
  3. Creating user-friendly websites, including features that facilitate the flow of information to citizens, associated online services, and up-to-date reading rooms

The report is released at an interesting historic moment for the United States, with Sunshine Week just around the corner. The United States House of Representatives just unanimously passed a FOIA Reform Act that is substantially modeled upon the Obama administration’s proposals for FOIA reforms, advanced as part of the second National Open Government Action Plan. If the Senate takes up that bill and passes it, it would be one of the most important, substantive achievements in institutionalizing open government beyond this administration.

The Citizens for Responsibility and Ethics in Washington have disputed the accuracy of this scorecard, based upon the high rating for the Department of Justice. CREW counsel Anne Weismann:

It is appropriate and fair to recognize agencies that are fulfilling their obligations under the FOIA. But CEG’s latest report does a huge disservice to all requesters by falsely inflating DOJ’s performance, and ignoring the myriad ways in which that agency — a supposed leader on the FOIA front — ignores, if not flouts, its obligations under the statute.

Last Friday, I spoke with Sean Moulton, the director of open government policy at the Center for Effective Government, about the contents of the report and the state of FOIA in the federal government, from the status quo to what needs to be done. Our interview, lightly edited for content and clarity, follows.

What was the methodology behind the report?

Moulton: Our goal was to keep this very quantifiable, very exact, and to try and lay out some specifics. We thought about what components were necessary for a successful FOIA program. The processing numbers that come out each year are a very rich area for data. They’re extremely important: if you’re not processing quickly and releasing information, you can’t be successful, regardless of other components.

We did think that there are two other areas that are important. First, online services. Let’s face it, the majority of us live online in a big way. It’s a requirement now for agencies to be living there as well. Then, the rules. They’re explained to the agencies and the public, in how they’re going to do things when they get a request. A lot of the agencies have outdated rules. Their current practices may be different, and they may be doing things that the rules don’t say they have to, but without them, they may stop. Consistent rules are essential for consistent long term performance.

A few months back, we released a report that laid out what we felt were best practices for FOIA regulations. We went through a review of dozens of agencies, in terms of their FOIA regulations, and identified key issues, such as communicating with the requester, how you manage confidential business information, how you handle appeals, and how you handle timelines. Then we found inside existing regulations the best ways this was being handled. It really helped us here, when we got to the rules. We used that as our roadmap. We knew agencies were already doing these things, and making that commitment. The main thing we measured under the rules was the items from that best practices report that were common already. If things were universal, we didn’t want to call it a best practice, but a normal practice.

Is FOIA compliance better under the Obama administration, more than 4 years after the Open Government Directive?

Moulton: In general, I think FOIA is improving in this administration. Certainly, the administration itself is investing a great deal of energy and resources in trying to make greater improvements in FOIA, but it’s challenging. None of this has penetrated into national security issues.

I think it’s more of a challenge than the administration thought it would be. It’s different from other things, like open data or better websites. The FOIA process has become entrenched. The biggest open government wins were in areas where they were breaking new ground. There wasn’t a culture or way of doing this or problems that were inherited. They were building from the beginning. With FOIA, there was a long history. Some agencies may see FOIA as some sort of burden, and not part of their mission. They may think of it as a distraction from their mission, in fact. When the Department of Transportation puts out information, it usually gets used in the service of their mission. Many agencies haven’t internalized that.

There’s also the issue of backlogs, bureaucracy, lack of technology or technology that doesn’t work that well — but they’re locked into it.

What about redaction issues? Can you be FOIA compliant without actually honoring the intent of the request?

Moulton: We’re very aware of this as well. The data is just not there to evaluate that. We wish it was. The most you get right now is “fully granted” or “partly granted.” That’s incredibly vague. You can redact 99% or 1% and claim it’s partially redacted, either way. We have no indicator and no data on how much is being released. It’s frustrating, because something like that would help us get a better sense of whether agencies would benefit from new policies.

We do know that the percentage of full grants has dropped every year, for 12 years, from the Clinton administration all the way through the Bush administration to today. It’s such a gray area. It’s hard to say whether it’s a terrible thing or a modest change.

Has the Obama administration’s focus on open government made any difference?

Moulton: I think it has. There were a couple of agencies that got together on FOIA reform. The EPA led the team, with the U.S. National Archives and the Commerce Department, to build a new FOIA tool. The outward-facing part of the tool enables a user to go to a single spot, request and track it. Other people could come and search FOIA’ed documents. Behind the scenes, federal workers could use the tool to forward requests back and forth. This fits into what the administration has been trying to do, using technology better in government.

Another example, again at the EPA, is where they’ve put together a proactive disclosure website. They got a lot of requests, like inquiries about properties and their environmental history, like leaks and spills, and set up a site where you could look up real estate. They did this because they went to FOIA requests and saw what people wanted. That has cut down their requests by a certain percentage.

Has there been increasing FOIA demand in recent years, affecting compliance?

Moulton: I do think FOIA requests have been increasing. We’ll see what this next year of data shows. We have seen a pretty significant increase, after a significant decrease in the Bush administration. That may be because this administration keeps speaking about open government, which leads to more hopeful requestors. We fully expect that in 2013, there will be more requests than the prior year.

DHS gets the biggest number of all, but that’s not surprising when we look at the size of it. It’s the second biggest agency, after Defense, and the biggest domestic-facing agency. When you start talking about things like immigration and FEMA, which go deep into communities and people’s lives, in ways that have a lot of impact, that makes sense.

What about the Department of Justice’s record?

Moulton: Well, DoJ got the second highest rating, but we know they have a mixed record. There are things you can’t measure and quantify, in terms of culture and attitude. I do know there were concerns about the online portal, in terms of the turf war between agencies. There were concerns about whether the tech was flexible, in terms of meeting all agency needs. If you want to build a government-wide tool, it needs to have real flexibility. The portal changed the dialogue entirely.

Is FOIA performance a sufficient metric to analyze any administration’s performance on open government?

Moulton: We should step back further and look at the broader picture, if we’re going to talk about open government. This administration has done things, outside of FOIA, to try to open up records and data. They’ve built better online tools for people to get information. You have to consider all of those things.

Does that include efforts like the Intelligence Community Tumblr?

Moulton: That’s a good example. One thing this administration did early on is to identify social media outlets. We should be going there. We can’t make citizens come to us. We should go to where people are. The administration pushed early on that agencies should be able to use Tumblr and Twitter and Facebook and Flickr and so on.

Is this social media use “propaganda,” as some members of the media have suggested?

Moulton: That’s really hard to decide. I think it can result in that. It has the potential to be misused to sidestep the media, and not have good interaction with the media, which is another important outlet. People get a lot of their information from the media. Government needs to have a good relationship with it.

I don’t think that’s the intention, though, just as under Clinton, when they started setting up websites for the first time. That’s what the Internet is for: sharing information. That’s what social media can be used for, so let’s use what’s there.

Presidential Innovation Fellows show (some) government technology can work, after all

The last six months haven’t been kind to the public’s perception of the Obama administration’s ability to apply technology to government. The administration’s first term featured fitful but genuine progress in modernizing the federal government’s use of technology, from embracing online video and social media to adopting cloud computing, virtualization, mobile devices and open source software. The Consumer Financial Protection Bureau earned praise from The Washington Post, Bloomberg View, and The New York Times for getting government technology right.

Last fall, however, the White House fell into a sinkhole of its own creation when the troubled launch of Healthcare.gov led to the novel scene of a President of the United States standing in the Rose Garden, apologizing for the performance of a website. After the big fix to Healthcare.gov by a quickly assembled trauma team got the site working, the administration has quietly moved towards information technology reforms, with the hopes of avoiding the next Healthcare.gov, considering potential shifts in hiring rules and forming a new development unit within the U.S. General Services Administration.

Without improved results, however, those reforms won’t be sufficient to shift the opinion of millions of angry Americans. The White House and agencies will have to deliver on better digital government, from services to public engagement.

This week, the administration showed evidence that it has done so: The projects from the second round of the White House’s Presidential Innovation Fellows program are online, and they’re impressive. US CTO Todd Park and US GSA Administrator Dan Tangherlini proudly described their accomplishments today:

Since the initiative launched two years ago, Presidential Innovation Fellows, along with their government teammates, have been delivering impressive results—at start-up velocity. Fellows have unleashed the power of open government data to spur the creation of new products and jobs; improved the ability of the Federal government to respond effectively to natural disasters; designed pilot projects that make it easier for new economy companies to do business with the Federal Government; and much more. Their impact is enormous.

These projects show that a relatively small number of talented fellows can work with and within huge institutions to rapidly design and launch platforms, Web applications and open data initiatives. The ambition and, in some cases, successful deployment of projects like RFPEZ, Blue Button Connect, OpenFDA, a GI Bill tool, Green Button, and a transcription tool at the Smithsonian Institution are a testament to the ability of public servants in the federal government to accomplish their missions using modern Web technologies and standards. (It’s also an answer to some of the harsh partisan criticism that the program faced at launch.)

In a blog post and YouTube video from deputy U.S. chief technology officer Jennifer Pahlka, the White House announced today they had started taking applications for a third round of fellows that would focus on 14 projects within three broad areas: veterans, open data and crowdsourcing:

  • “Making Digital the Default: Building a 21st Century Veterans Experience: The U.S. Department of Veterans Affairs is embarking on a bold new initiative to create a “digital by default” experience for our Nation’s veterans that provides better, faster access to services and complements the Department’s work to eliminate the disability claims backlog.
  • Data Innovation: Unleashing the Power of Data Resources to Improve Americans’ Lives: This initiative aims to accelerate and expand the Federal Government’s efforts to liberate government data by making these information resources more accessible to the public and useable in computer readable forms, and to spur the use of those data by companies, entrepreneurs, citizens, and others to fuel the creation of new products, services, and jobs.
  • By the People, for the People: Crowdsourcing to Improve Government: Crowdsourcing is a powerful way to organize people, mobilize resources, and gather information. This initiative will leverage technology and innovation to engage the American public as a strategic partner in solving difficult challenges and improving the way government works—from helping NASA find asteroid threats to human populations to improving the quality of U.S. patents to unlocking information contained in government records.”

Up until today, the fruits of the second class of fellows have been a bit harder to ascertain from the outside, as compared to the first round of five projects, like RFPEZ, where more iterative development was happening out in the open on GitHub. Now, the public can go see what has been developed on their behalf and judge for themselves whether it works or not, much as they have with Healthcare.gov.

I’m particularly fond of the new Web application at the Smithsonian Institution, which enables the public to transcribe handwritten historic documents and records. It’s live at Transcription.si.edu; if you’d like to pitch in, you can join more than three thousand volunteers who have already transcribed and reviewed more than 13,000 historic and scientific records. It’s a complement to the citizen archivist platform that the U.S. National Archives announced in 2011 and subsequently launched. Both make exceptional use of the Internet’s ability to distribute and scale a huge project around the country, enabling public participation in the creation of a digital commons in a way that was not possible before.

U.S. House unanimously votes in favor of FOIA reform and a more open government

Earlier tonight, the United States House of Representatives voted 410-0 to pass the FOIA Oversight and Implementation Act. If the FOIA Act passes through the Senate, the bill would represent the most important update to United States access to information laws in generations.

“Transparency in government is a critical part of restoring trust and the House will continue to work to make government more transparent and accessible to all Americans,” said House Majority Leader Eric Cantor (R-VA). “By expanding the FOIA process online, the FOIA Oversight and Implementation Act creates greater transparency and continues our open government efforts in the House.”

The FOIA Oversight and Implementation Act (FOIA), H.R. 1211, is one of the best opportunities to institutionalize open government in the 113th Congress, along with the DATA Act, which passed the House of Representatives 388-1 last November.

The FOIA reform bill now moves to the Senate, which unanimously passed FOIA reform legislation in the last Congress.

As Nate Jones detailed at the National Security Archive, the Senate’s own legislative effort to reform FOIA, the so-called “Faster FOIA Act” (S. 627, S. 1466), was not picked up by the House: the open government bill was hijacked in service of a 2011 budget deal, where the FOIA provisions in it ultimately met an untimely end. Chairman Darrell Issa (R-CA), Ranking Member Elijah Cummings (D-MD), and Representative Mike Quigley (D-IL) chose to draft their own bill instead of taking that bill up again.

Open government advocates applauded the unanimous passage of the FOIA Act, although there are some caveats about its provisions for the Senate to consider.

“This vote shows strong congressional support for government transparency and the Freedom of Information Act,” said Sean Moulton, Director of Open Government Policy at the Center for Effective Government, in a statement:

Since its original passage nearly 50 years ago, FOIA has been a cornerstone of the public’s right to know. By modernizing FOIA, H.R. 1211 would improve Americans’ ability to access public information and strengthen our democracy.

We thank the chair and ranking member of the House Committee on Oversight and Government Reform, Reps. Darrell Issa (R-CA) and Elijah Cummings (D-MD), who worked with the open government community to develop this legislation in a bipartisan fashion. We urge the Senate to advance legislation addressing these issues and other pressing FOIA reforms, including the need to rein in secrecy claims under Exemption 5, which restrict access to important information about government operations.

Access to public information is crucial to our democracy and the government’s effectiveness. It allows Americans to actively engage in policymaking in a thoughtful, informed manner and to hold public officials accountable for decisions that impact us all.

The bill represents important, incremental improvements to the FOIA process, but “it doesn’t address some fundamental shortfalls in the way that the FOIA is implemented and viewed within the Federal government,” wrote Matt Rumsey, policy analyst at the Sunlight Foundation:

… A “presumption of openness” and improved online infrastructure are important, but the bigger challenge will be getting agencies to change their posture away from one of non-disclosure and often aggressive litigation that is opposed to openness. … It clearly shows that ensuring public access to government information is not a partisan issue, or even one that should divide the branches of government. We hope to see the Senate take up legislation in the near future so that both chambers can work together to send a strong FOIA reform bill to President Obama’s desk for him to sign.

Passage of the House bill is a good first step but only a first step, wrote Anne Weismann, chief counsel of Citizens for Responsibility and Ethics in Washington:

Without a doubt these are needed reforms. As CREW has long advocated, however, meaningful FOIA reform must include changes in the FOIA’s exemptions to make the statute work as Congress intended.  All too often agencies hide behind Exemption 5 and its protection for privileged material to bar public access to documents that would reveal the rationale behind key government decisions.  For example, the Department of Justice denies every request for a legal opinion issued by DOJ’s Office of Legal Counsel that determines what a law means and what conduct it permits, claiming to reveal these opinions would harm the agency’s deliberative process.  This has led to the creation of a body of secret law — precisely what Congress sought to prevent when it enacted the FOIA.

To address this serious problem, CREW has advocated adding a balancing test to Exemption 5 that would require the agency and any reviewing court to balance the government’s claimed need for secrecy against the public interest in disclosure.  Other needed reforms include a requirement that agencies post online all documents disclosed under the FOIA.  The House bill, however, does not incorporate any of these reforms.

This post has been updated with additional statements over time.

FOIA bill in the U.S. House is one of the best opportunities to institutionalize open government

U.S. House unanimously voted 410-0 in favor of FOIA reform.

Unless Congress passes legislation to codify reforms and policies proposed or promulgated under a given administration, the next President of the United States can simply revoke the executive orders and memoranda issued by his or her predecessor.

Today, almost a year after its introduction, the FOIA Oversight and Implementation Act (FOIA), H.R. 1211, will go before the U.S. House for a vote. If enacted, it would write into law the reforms to the Freedom of Information Act that the Obama administration has proposed and go further, placing the burden on agencies to justify withholding information from requestors, codifying the creation of a pilot to enable requestors to submit requests in one place, creating a FOIA Council, and directing federal agencies to automatically publish records responsive to requests online.

While these actions were proposed by the administration in its National Open Government Action Plan, Congressional action would make them permanent.

If it passes both houses of Congress and is signed into law, the FOIA Reform Act would carry into law the spirit of President Barack Obama’s Open Government Memorandum of January 21, 2009 and subsequent Open Government Directive, along with Attorney General Eric Holder’s FOIA memorandum: “The Freedom of Information Act should be administered with a clear presumption: In the face of doubt, openness prevails.”

The bipartisan bill, cosponsored by House Oversight and Government Reform Chairman Darrell Issa (R-CA), Ranking Member Elijah Cummings (D-MD), and Representative Mike Quigley (D-IL), has received support from every major open government advocacy group in Washington, DC. A letter to Congress released this week urged the passage of the FOIA Reform Act. The Sunshine in Government Initiative and the Small Business and Entrepreneurship Council also published letters in support of the bill. It has not, however, picked up a sponsor in the Senate yet.

Currently, 97% of POPVOX users support H.R. 1211. While the bill may not be perfect, very few pieces of legislation are.

“Requests through the Freedom of Information Act remain the principal vehicle through which the American people can access information generated by their government,” said Issa, in a statement last March. “The draft bill is designed to strengthen transparency by ensuring that legislative and executive action to improve FOIA over the past two decades is fully implemented by federal agencies.”

“This bill strengthens FOIA, our most important open government law, and makes clear that the government should operate with a presumption of openness and not one of secrecy,” said Cummings, in a statement.

Given the continued importance of the Freedom of Information Act to journalists and its relevance to holding the federal government accountable, I would urge readers to find their Representative in Congress and ask him or her to vote for passage of the bill. Improving open government oversight through FOIA reform has been a long time coming, but change should come.

[Image Credit: CREW]

RankAndFiled.com is like the SEC’s EDGAR database, but for humans

A new website, Rank and Filed, gathers data from the Securities and Exchange Commission’s EDGAR database, indexes it, and publishes it online in open formats that investors can use to research and discover companies. I’ve included a screenshot of Tesla’s SEC filings below.

[Image: Tesla’s SEC filings on Rank and Filed]

The site currently has over 25 million files indexed.

I heard about the new website directly from its creator, Maris Jensen, a former SEC analyst who built the site independently. According to Maris, she proposed the project internally in March 2013 but was immediately turned down.

A month later, after she was terminated for threatening the Commission’s mission with a “lack of respect for senior management” (an issue she holds was unrelated to the proposal), Maris decided to make the idea real on her own and started building. She has since offered to give the site and its code to the SEC but has not heard back from them yet.

Our interview, lightly edited for content and clarity, follows.


Where did the idea for this originate?

The breaking point was realizing that the guy in the cubicle across from me had spent a week writing the same parser as me — a Python program to parse the EDGAR FTP index for specific filings. This is nearly two decades after Carl Malamud set everything up; the FTP index is exactly as he left it. We were in the division responsible for the SEC’s data analytics and interactive data initiatives. The division literally rewrites this program each time they need SEC filings data. There’s no version control. There’s just no excuse!  Hilariously, that guy also left the SEC and built an SEC filings website, though his is for-profit: http://legalai.com/
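For readers who have not seen one, a parser of the kind Maris describes can be quite short. The sketch below is illustrative and is not her code: it assumes a pipe-delimited index in the style of EDGAR’s master index (CIK, company name, form type, date filed, file name), and the sample rows, company names and helper names are all made up.

```python
from typing import Iterator, List, NamedTuple


class IndexEntry(NamedTuple):
    cik: str
    company: str
    form_type: str
    date_filed: str
    filename: str


def parse_index(text: str) -> Iterator[IndexEntry]:
    """Yield one entry per data row, skipping header and separator lines."""
    for line in text.splitlines():
        parts = [p.strip() for p in line.split("|")]
        # Data rows have exactly five fields and a numeric CIK;
        # header and separator lines fail one of those checks.
        if len(parts) == 5 and parts[0].isdigit():
            yield IndexEntry(*parts)


def filings_of_type(text: str, form_type: str) -> List[IndexEntry]:
    """Return only the entries whose form type matches exactly."""
    return [e for e in parse_index(text) if e.form_type == form_type]


# Hypothetical sample in the assumed layout; not real filings.
SAMPLE = """\
Description: made-up sample index
CIK|Company Name|Form Type|Date Filed|Filename
--------------------------------------------------
1001|Example Corp|10-K|2014-02-03|edgar/data/1001/a.txt
1002|Sample Holdings|8-K|2014-02-04|edgar/data/1002/b.txt
1001|Example Corp|8-K|2014-02-05|edgar/data/1001/c.txt
"""

if __name__ == "__main__":
    for entry in filings_of_type(SAMPLE, "8-K"):
        print(entry.date_filed, entry.company, entry.filename)
```

The point is less the code than the duplication: anything this small that gets rewritten every time someone needs filings data is a good candidate for a shared, version-controlled library.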

What does this do that the SEC needed?

In 2008, the SEC set up a task force (the ‘21st Century Disclosure Initiative‘) to rethink the way they were making data available to the public. A year later, they published this report, with their conclusion and proposal for a new, modernized disclosure system.  I basically just tried to build the system they described. I also did lots of googling — ‘SEC EDGAR tool terrible‘, ‘how to find SEC data‘, etc — and then tried to address the problems people were having.

The problems have been the same for decades. In 1994, people wanted a SEC CIK-to-ticker mapping. 20 years later, this question still pops up on forums monthly.

There are over 600 different forms on EDGAR but the SEC’s form lists are basically no help at all. I went through and googled each form individually. I tried to group them into understandable categories.

The comment at the bottom of this post describes the SEC’s current problem better than I ever could:

Has anyone out there ever tried to use SEC.GOV to search for information about a company? The problem is very easy to articulate. If you search for something, you get 5000 results. At about 10 results per page, you have 500 pages to sift through to find what you want. Once you find what you want, there is ZERO ability to navigate from what you found into related documents!

What if you want to research a particular company’s board of directors? What other companies is each director associated with? Have there been any problems in any of those companies? You can’t investigate these types of things using the technology sec.gov has fielded. You want a needle. The SEC gives you a haystack.

Why not allow for better discovery of all of the SEC data and let investors perform their own investigations of markets & companies?

So instead of focusing on this obvious improvement to the public service the SEC provides, the emphasis apparently is on improving investigative actions. Great. Why not just shut off the sec.gov website completely and let the SEC do all of the investigating and researching of SEC data?

How does RankAndFiled.com compare to other sources of SEC data online?

I unfortunately haven’t added that much ‘value’ yet. I’m a total amateur. I’m just trying to make the data available and understandable! The website doesn’t do any analysis: it just collects, links and presents data from different SEC filings.

Looks like you got some great help from the folks you thanked. Did you build this all yourself with these tools?

Yes, open source tools these days are amazing!!  I started this project with no web or software development experience at all.

I actually feel really lucky to have fallen into all of this. Everything I know I learned on google, mostly through tutorials written by the developers listed there.

I also didn’t know anyone in the dataviz or open source community, so I reached out to some of them with stuff like etiquette questions. Their response and support was just incredible — especially the D3 community, they’re just wonderful.

Can you tell me more about where the data on this site comes from and what you’ve done to it?

Basically, the system watches the SEC’s RSS feeds. It reads and indexes data from SEC filings as they come in. Not all the filings show up on the feeds — I’m not sure why — so it also scans the FTP index for any missed filings.
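The feed-plus-backfill approach described above boils down to a set difference. Here is a minimal sketch of that reconciliation step, assuming each filing carries some unique identifier; the function name and the identifiers in the example are hypothetical, and this is not the site’s actual code.

```python
from typing import Iterable, List


def missed_filings(feed_ids: Iterable[str], index_ids: Iterable[str]) -> List[str]:
    """Return IDs that appear in the index but never showed up on the feed.

    Order follows the index, and each missing ID is reported once.
    """
    seen = set(feed_ids)
    missing: List[str] = []
    for filing_id in index_ids:
        if filing_id not in seen:
            missing.append(filing_id)
            seen.add(filing_id)  # avoid reporting a duplicate index row twice
    return missing


if __name__ == "__main__":
    feed = ["f-001", "f-002", "f-004"]           # IDs seen on the live feed
    index = ["f-001", "f-002", "f-003", "f-004", "f-005", "f-003"]
    print(missed_filings(feed, index))
```

Whatever comes back from the backfill pass can then go through the same parsing and indexing path as filings that arrived on the feed.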

About 25 million SEC documents have been parsed and incorporated so far, which is everything that’s publicly available on EDGAR.  So companies and people are tracked and connected over time — who’s raising money where, who owns whom, who moved companies or got promoted, who sold a ton of shares.  I also realign all the financial data from quarterly and annual reports so you can see a company’s financial history and so the data is comparable between companies.

It actually feels silly even talking about it, because it’s just so basic. This is stuff the SEC should have been doing years and years and years ago.

But it’s not a perfect science because one, only a few SEC forms are machine-readable and two, the SEC doesn’t even try to standardize names. SEC registrants are given distinct identifiers but anything goes when companies or names are listed inside a filing. Middle names, middle initials, nicknames, suffixes, titles…
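To give a sense of why the name problem is hard, here is a deliberately crude matching key, offered as an illustration rather than as the method the site uses. It assumes names are written given-name-first, and the title and suffix lists are my own short, incomplete guesses. It does nothing for nicknames or for “Last, First” ordering.

```python
import re
from typing import Tuple

# Assumed, incomplete lists for illustration only.
TITLES = {"mr", "mrs", "ms", "dr", "prof"}
SUFFIXES = {"jr", "sr", "ii", "iii", "iv", "phd", "md", "esq"}


def name_key(raw: str) -> Tuple[str, str]:
    """Reduce a personal name to a (first, last) key for rough matching."""
    # Lowercase and turn punctuation into spaces before splitting.
    tokens = re.sub(r"[^\w\s]", " ", raw.lower()).split()
    tokens = [t for t in tokens if t not in TITLES and t not in SUFFIXES]
    if not tokens:
        return ("", "")
    # Keep only the first and last remaining tokens, dropping middle names.
    return (tokens[0], tokens[-1])


if __name__ == "__main__":
    for name in ["Dr. John Q. Smith Jr.", "John Smith", "JOHN QUINCY SMITH"]:
        print(name, "->", name_key(name))
```

A key this loose will also merge different people who share a first and last name, which is exactly why distinct identifiers inside filings would help.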

What’s next?

I spent November and December trying to give all my code to the SEC. I received no response, not even a polite no. That’s still the goal — I want them to take over and open source it, or at the very least host the underlying API.  It’s their job to make this data available and accessible. They NEED a team over there doing hands-on work with SEC filings, a team struggling to make sense of this data with just the tools available to retail investors, especially now that they’re talking about disclosure reform.  Right now, they have almost no incentive to change things over to structured data — they buy all the structured EDGAR data they need.

The SEC keeps saying that it’s the private sector’s job to build tools like this, not theirs, but in the past 20 years nobody has come up with a really great, really affordable option.  It doesn’t make sense for any of us to even try — I’ve heard that Bloomberg and Thomson Reuters hire legions of Indian professionals to go through each SEC filing by hand.  We just can’t compete.

The SEC will have to make a lot more of their data machine-readable before any ‘disruptive’ innovation can happen, but they won’t do that until they’re forced to (by Congress), unless they have people there who realize how unfair the situation has become.

There are actually a heartbreaking number of SEC employees who also want this to happen, self-described worker bees who’ve reached out to me from personal email to say they’ve been trying to convince their bosses to give this thing a chance.  So far, no luck! I would open source it myself, but unfortunately I can’t afford to host the project indefinitely.

AskThem.io launches to enable citizens to ask public officials anything

Today, the Participatory Politics Foundation launched AskThem.io, a new online tool focused on structured questions and answers with elected officials.

As David Moore, founder of PPF, put it, “AskThem is like a version of the White House’s ‘We The People’ petition platform, but for over 142,000 elected officials nationwide.”

The platform is an evolution from earlier attempts to ask questions of candidates for public office, like “10 Questions” from Personal Democracy Media, or the myriad online town halls that governors and the White House have been holding for years. 

AskThem enables anyone to pose a question to any elected official or verified Twitter account. Notably, the cleanly designed Web app uses geolocation to enable users to learn who represents them, in and of itself a valuable service.

As with e-petitions, AskThem users can then sign questions they support, voting them up and sharing the questions with their social networks. When a given question hits a preset threshold, the platform delivers the question to the public figure and “encourages a public response.”

That last bit is key: there’s no requirement for someone to respond, for the response itself to be substantive, nor for the public figure to act. There’s only the network effect of public pressure to make any of that happen.

After a year of development, Moore was excited to see the platform go live today, noting a number of precedents set in the process.

“I believe we’re the first open-source web app to support geolocation of elected officials, down to the municipal level, from street address,” he said, via email. “And I believe we’re the first to offer access to over 142,000 elected officials through our combined data sources. And I believe we’re the first to incorporate open government data for informed questions of elected officials at every level of government.”

Moore referred to AskThem’s use of the Google Civic Information API, which provides the data for the platform.

AskThem goes online just in time for tomorrow’s day of action against mass surveillance, where over 5,000 websites will try to activate their users to contact their elected representatives in Washington. Whether it gets much use or not will depend on awareness of the new tool.

That could come through use by high-profile early adopters like Chris Hayes (@chrislhayes), of MSNBC’s “All In with Chris Hayes,” or OK Go, the popular band.


At launch, 66 elected officials nationwide have signed on to participate, though more may join if it catches on. In the meantime, you can use AskThem’s handy map to find local elected officials and see a listing of all of the questions to date across the USA, or pose your own.