Using the NYT Congress API with … Excel?
May 11th, 2010 | Published in Car Tools, XML | 5 Comments
It’s true that Excel has been a decreasing part of my toolkit for several years now, and that I never quite had the love for it that I do for various database managers. But I’m guessing that’s the exception, not the rule, in the broader journalism community. So when it came time to propose a lightning talk for the 2010 CAR Conference last week, I chose to pull out the ol’ spreadsheet and show how you could get started with the NYT’s Congress API with a familiar tool.
To do this, I had to not only drag out Excel but also do it on Windows, since Excel’s Web Query feature isn’t available on the Mac. (You could also do this, albeit in a slightly different manner, using OpenOffice and Google Spreadsheets. In the comments, Chris Amico shows you how using Google Spreadsheets.) Here’s how it works using Excel.
First, you’ll need an API key. To get one, go to The Times Developer Network and register (note: you’ll need to be a registered user of nytimes.com first).
You’re registering an “application”, and then you can add specific API keys to that account. Let’s add one for the Congress API. The key itself is a longish string of letters and numbers that gets appended to every API request URL, including the ones we’ll make from Excel. Let’s copy the API key so we can easily grab it (note that this particular key has been disabled, so using it won’t work).
Let’s find an API call that we can use be looking at the Congress API’s documentation. Let’s pick the “members leaving office” response, otherwise known as the casualty list. All that’s required is the chamber (‘house’ or ‘senate’) and the congress (currently only the 111th is supported). If we choose the House, the URL will look like this, except that you’ll need to specify your Congress API Key.
The version number should be “v3″ and you don’t need to specify a format after leaving (xml is the default). You should quickly get an xml file that looks roughly like this:
To get that xml into Excel, we’re going to use Excel’s Import Data feature. I’m not one of those cool kids who has Excel 2007 at their fingertips, so I’m going to use Excel 2002. Import Data can be found at Data -> Get External Data -> Import Data.
Then change the file type to xml and paste the full API url into the box just above the file type.
It works for local files and Web urls. Then click on “Open” to start the process. The import process consists of Excel asking you where to put the file. Just click “OK” and you should soon see something like this:
The header row in row 2 isn’t perfect, but it should suffice. You probably don’t need the copyright statement in column A. But now you’ve got a way to pull data into Excel from an API! If you have questions or comments, please don’t hesitate to post them below. If you’re having issues with the API, the forum is the best place to head.






May 11th, 2010 at 9:39 am (#)
No matter which new skills I learn, Excel still seems to be a permanent and significant part of my data landscape. I remember this exercise from CAR 2010 (was it just last week?
) and am glad you’re showing it here, because Excel remains a key point of entry for beginning data journalists. I recently dove a little deeper into Excel 2007, and it pushes this retrieval option as a prominent menu item. I suspect that will lead more people to use it to get data to analyze. Nice stuff!
May 11th, 2010 at 11:11 am (#)
After playing around with this for a few minutes, I have good news: You can do this without Windows.
Google Docs will parse XML, though it’s a little trickier. The function is almost the same,
=ImportXML(uri, query). The one thing that took some meddling is the query, which should be XPath. But if you have Firebug, you can Inspect Element > Copy XPath. I had to try a couple queries before I got it right. In this case, “/result_set/results/members/member”. It also doesn’t add in column headers, so I did that by hand.Here’s the result, using the same data as above: http://spreadsheets.google.com/pub?key=txksAGjkiApZ3s2gV6Y-7Kg&output=html
October 2nd, 2010 at 7:05 am (#)
[...] if you want to get serious about mashing up, you will need to explore the world of programming and APIs. At that point you may sit back and think: “Data journalism is [...]
February 1st, 2011 at 2:24 am (#)
[...] and if you want to avoid the programming part altogether, my mentor Derek Willis can help you bring it directly into Excel. (How many entries do we think I got through without citing Derek? Not many. I’ll try again [...]
October 15th, 2011 at 10:47 pm (#)
I am one that still loves Excel, even though I have mostly moves on to the web world, I do love to get back into Excel when I can and to write some code. Hopefully I can do more of that in a few years when things slow down. Great comments and a great post.
Christopher