No, Really, Show Us The Data
March 25th, 2009 | Published in FOIA, Fed Data | 9 Comments
When it first appeared I was really excited to see Show Us The Data, which gave visitors a chance to list and vote for their “Most Requested Documents” that should be more readily available from the federal government. Sure enough, there were plenty of strong choices for the top 10 list. And then people started voting, and the results were not quite what I had hoped to see. Yes, the items that comprise the Top 10 List (irony alert! it’s a PDF) are worthy documents, but some of them (the Supreme Court website?) reflect a lack of familiarity with the government information that’s truly buried.
What follows is my entirely subjective, data-heavy and document-light version. It’s Congress-heavy, because the executive branch has done much, much better in many ways. No, really, show us the data:
- Congressional committee votes. As far as I know, only commercial companies like CQ possess this information in the aggregate. Most committees publish their votes in committee reports (House Judiciary is one of the better ones) without a standard format, which makes gathering them prohibitively expensive. And yet these are some of the most telling public actions lawmakers take.
- Earmarks. If you don’t think the Appropriations Committees have a database of earmarks, you’re naive. Of course they do – it’s valuable information. Now, about sharing it in anything but an image PDF format… well, let’s just say that Keith Ashdown and the folks at Taxpayers for Common Sense probably aren’t going away soon.
- Foreign Travel Reports (Codels). The House publishes PDFs and text files of this data, but they are formatted for reading, not analysis. It would not be hard to change this.
- Legal Defense Funds. It’s utterly ridiculous that while House members now file their campaign reports electronically, legal defense fund reports are still filed on paper. This is a no-brainer.
- Senate Votes in XML. Go ahead, view source on this page. See where the HTML comment says “****** vote_111_1_00110.xml …”? They already generate these files; but the public can’t have them. They’re only for the use of Senators. There’s absolutely no reason the Senate cannot join the House in doing this, so why won’t they? Update: they have!
- Senior Executive Service. This one is particularly egregious, in that the information on senior-level political appointees in the executive branch previously was made available in database-friendly formats, but now is only available via PDF. So OPM chose to make the information less useful.
- High-Level Diplomatic Visits. Another “I can’t believe it’s not a database” entry. The State Department offers a list of visits by foreign leaders and lists of visits by the president and secretary.
- The CIA World Factbook. Oh, you can download the PDF, but (and I am not making this up): “the search software resides on our server and cannot be distributed with the World Factbook.” Thanks!
That’s eight, and I can already think of some more. What’s on your list? Actual federal data, please, as opposed to documents that are valuable for their full-text content. I’m sure I’m missing some that should be on here.
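To make the data-versus-documents distinction concrete, here is a minimal Python sketch of what a machine-readable vote file makes possible: flattening it into rows that a spreadsheet or database can use. The XML sample, its element names, and the helper functions `vote_rows` and `to_csv` are all invented for illustration; they are assumptions, not the actual schema of any House or Senate file.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical sample: element names and values are invented for illustration
# and are not taken from any real Senate or House vote file.
SAMPLE_XML = """<vote>
  <congress>111</congress>
  <session>1</session>
  <number>110</number>
  <members>
    <member><name>Member A</name><state>NY</state><cast>Yea</cast></member>
    <member><name>Member B</name><state>TX</state><cast>Nay</cast></member>
    <member><name>Member C</name><state>CA</state><cast>Yea</cast></member>
  </members>
</vote>"""

FIELDS = ["vote_id", "name", "state", "cast"]


def vote_rows(xml_text):
    """Flatten one vote document into a list of dicts, one per member."""
    root = ET.fromstring(xml_text)
    # Build an identifier from the (assumed) congress/session/number elements.
    vote_id = "-".join(root.findtext(tag) for tag in ("congress", "session", "number"))
    return [
        {
            "vote_id": vote_id,
            "name": member.findtext("name"),
            "state": member.findtext("state"),
            "cast": member.findtext("cast"),
        }
        for member in root.iter("member")
    ]


def to_csv(rows):
    """Serialize the rows as CSV text with a header line."""
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()


rows = vote_rows(SAMPLE_XML)
print(to_csv(rows))
```

Once every vote is one row per member, questions like "how often did these two lawmakers vote together?" become a query rather than a data-entry project. A PDF or a report formatted for reading offers no equivalent shortcut.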
March 25th, 2009 at 11:20 pm (#)
[...] Willis, with whom I was lucky enough to work with at the Center for Public Integrity, critiques the recent Show Us The Data effort (full disclosure: Sunlight was involved in building the site, [...]
April 3rd, 2009 at 11:37 am (#)
[...] Why isn’t this data public: Earmarks. If you don’t think the Appropriations Committees have a database of earmarks, you’re [...]
April 3rd, 2009 at 12:30 pm (#)
How about fixing all the noise on the Senate Office of Public Records site that is supposed to work but doesn’t? Amendments!!! My freaking kingdom for amendments.
April 4th, 2009 at 9:28 pm (#)
I wrote some quick and dirty code to grab the diplomatic visit lists off the State Department’s website and dump them to CSV files: http://github.com/mindleak/state_visits/tree/master
This led to the interesting (but not surprising) chart at http://mindleak.com/visits.png
April 4th, 2009 at 9:38 pm (#)
Michael,
That’s awesome, thanks! And yeah, I bet there are good insights from this information.
May 5th, 2009 at 4:55 pm (#)
Great list. I’d love to be able to get Supreme Court decisions programmatically too, or at least an RSS feed for notifications when new decisions are posted.
May 11th, 2009 at 11:06 am (#)
[...] remains to be opened? In the United States, you can take a look at a list of eight government data sources to open without delay, according to Willis of the New York Times. In France, everything remains to be done and the situation has changed little [...]
May 13th, 2009 at 4:59 pm (#)
[...] No, Really, Show Us The Data :: The Scoop – A list of data, which the US government should make more easily available. [...]
February 2nd, 2010 at 2:38 pm (#)
Re the CIA World Factbook.
“The Factbook Web site now features Country Comparison pages for selected Factbook entries. All of the Country Comparison pages can be downloaded as tab-delimited data files that can be opened in other applications such as spreadsheets and databases.”
Also, you can scrape the data off the site for things not on the comparison pages with your favorite scraping-to-data tool.