The Scoop

  • Home
  • Projects
  • About The Scoop
  • Fixing Journalism
  • Medill Links
  • Departments
    • API
    • Apple
    • Asides
    • Broadcast
    • Campaign Finance
    • Car Tools
    • Code
    • Data
    • DIY
    • django
    • Fed Data
    • FOIA
    • General
    • IRE
    • Journalism
    • Local Data
    • Mapping
    • Miscellany
    • NonGov Data
    • Online
    • Paper Trail
    • Presentations
    • Public Records
    • Python
    • Rails
    • Ruby
    • SLA
    • Social Network Analysis
    • Sports
    • State Data
    • Teaching
    • Work
    • XML
  • Subscribe via RSS

Using the NYT Congress API with … Excel?

May 11th, 2010  |  Published in Car Tools, XML  |  5 Comments

It’s true that Excel has been a decreasing part of my toolkit for several years now, and that I never quite had the love for it that I do for various database managers. But I’m guessing that’s the exception, not the rule, in the broader journalism community. So when it came time to propose a lightning talk for the 2010 CAR Conference last week, I chose to pull out the ol’ spreadsheet and show how you could get started with the NYT’s Congress API with a familiar tool.

To do this, I had to not only drag out Excel but also do it on Windows, since Excel’s Web Query feature isn’t available on the Mac. (You could also do this, albeit in a slightly different manner, using OpenOffice and Google Spreadsheets. In the comments, Chris Amico shows you how using Google Spreadsheets.) Here’s how it works using Excel.

First, you’ll need an API key. To get one, go to The Times Developer Network and register (note: you’ll need to be a registered user of nytimes.com first).

You’re registering an “application”, and then you can add specific API keys to that account. Let’s add one for the Congress API. The key itself is a longish string of letters and numbers that gets appended to every API request URL, including the ones we’ll make from Excel. Let’s copy the API key so we can easily grab it (note that this particular key has been disabled, so using it won’t work).

Let’s find an API call that we can use be looking at the Congress API’s documentation. Let’s pick the “members leaving office” response, otherwise known as the casualty list. All that’s required is the chamber (‘house’ or ‘senate’) and the congress (currently only the 111th is supported). If we choose the House, the URL will look like this, except that you’ll need to specify your Congress API Key.

The version number should be “v3″ and you don’t need to specify a format after leaving (xml is the default). You should quickly get an xml file that looks roughly like this:

To get that xml into Excel, we’re going to use Excel’s Import Data feature. I’m not one of those cool kids who has Excel 2007 at their fingertips, so I’m going to use Excel 2002. Import Data can be found at Data -> Get External Data -> Import Data.

Then change the file type to xml and paste the full API url into the box just above the file type.

It works for local files and Web urls. Then click on “Open” to start the process. The import process consists of Excel asking you where to put the file. Just click “OK” and you should soon see something like this:

The header row in row 2 isn’t perfect, but it should suffice. You probably don’t need the copyright statement in column A. But now you’ve got a way to pull data into Excel from an API! If you have questions or comments, please don’t hesitate to post them below. If you’re having issues with the API, the forum is the best place to head.

Responses

Feed Trackback Address
  1. Anthony DeBarros says:

    May 11th, 2010 at 9:39 am (#)

    No matter which new skills I learn, Excel still seems to be a permanent and significant part of my data landscape. I remember this exercise from CAR 2010 (was it just last week? :-) ) and am glad you’re showing it here, because Excel remains a key point of entry for beginning data journalists. I recently dove a little deeper into Excel 2007, and it pushes this retrieval option as a prominent menu item. I suspect that will lead more people to use it to get data to analyze. Nice stuff!

  2. Chris Amico says:

    May 11th, 2010 at 11:11 am (#)

    After playing around with this for a few minutes, I have good news: You can do this without Windows.

    Google Docs will parse XML, though it’s a little trickier. The function is almost the same, =ImportXML(uri, query). The one thing that took some meddling is the query, which should be XPath. But if you have Firebug, you can Inspect Element > Copy XPath. I had to try a couple queries before I got it right. In this case, “/result_set/results/members/member”. It also doesn’t add in column headers, so I did that by hand.

    Here’s the result, using the same data as above: http://spreadsheets.google.com/pub?key=txksAGjkiApZ3s2gV6Y-7Kg&output=html

  3. How to be a data journalist | SCOT NETWORK says:

    October 2nd, 2010 at 7:05 am (#)

    [...] if you want to get serious about mashing up, you will need to explore the world of programming and APIs. At that point you may sit back and think: “Data journalism is [...]

  4. Making the structured usable: Transform JSON into a CSV « Michelle Minkoff says:

    February 1st, 2011 at 2:24 am (#)

    [...] and if you want to avoid the programming part altogether, my mentor Derek Willis can help you bring it directly into Excel. (How many entries do we think I got through without citing Derek? Not many. I’ll try again [...]

  5. Christopher says:

    October 15th, 2011 at 10:47 pm (#)

    I am one that still loves Excel, even though I have mostly moves on to the web world, I do love to get back into Excel when I can and to write some code. Hopefully I can do more of that in a few years when things slow down. Great comments and a great post.

    Christopher

Leave a Response

Recent Comments

  • Seth Lewis on Lost in the Weeds
  • Reporters' Lab // News algorithms already exist – and that’s good on The Programmer-Reporter
  • Eric Mill on On Legislative Data Transparency
  • (19:19 06-02-2012) Noticias más populares de #opengov en las ultimas 24 horas | Tuits de Software Libre on On Legislative Data Transparency
  • (15:05 06-02-2012) Noticias más populares de #opengov en las ultimas 24 horas | Tuits de Software Libre on On Legislative Data Transparency

Recent Posts

  • Lost in the Weeds
  • Our Mark Knoller Problem
  • The Programmer-Reporter
  • Investigating House Freshmen Voting Patterns
  • On Legislative Data Transparency

Linking Out

  • Mapping America — Census Bureau 2005-9 American Community Survey - NYTimes.com
    holy crap
  • Backbone.js and Django | joshbohde.com
  • ProPublica
  • Geoff: GeoJSON Feature Functions for JavaScript
  • Introducing Spanner: From Documents to Linked Data Apps—Clark & Parsia: Thinking Clearly
  • A performance lesson on Django QuerySets | Seek Nuance
  • http://www.post-gazette.com/pg/03001/1108747-209.stm
  • CBC News - Canada - Database: Canadian cables in WikiLeaks
  • Federal prosecutors likely to keep jobs after cases collapse - USATODAY.com
  • Strata Gems: Explore and visualize graphs with Gephi - O'Reilly Radar


©2012 The Scoop
Powered by WordPress using the Gridline Lite theme by Graph Paper Press.