The Scoop

The Birth of Quadruplets, or Understanding the Process

July 22nd, 2008  |  Published in Journalism | Comments (0)

My friend Dave Gulliver had a fascinating piece in his paper on Sunday about the birth of quadruplets in a Sarasota hospital. It’s a great story, but what makes it greater is that it was written by somebody with a certain amount of expertise on the subject of difficult premature multiple births. I hope Dave doesn’t mind, but I’d like to use that story as an example of why understanding the use of data is increasingly important for large swaths of journalism.

There’s a tendency among some folks in the industry to see CAR and other technological tools as just that - blunt instruments. Helpful, sure, but not ultimately necessary to the task of creating journalism. And for a segment of what journalism does, that’s probably ok. When we report on people and institutions that aren’t using technology to guide their decisions or actions, an understanding of how data or certain technologies are used isn’t a necessity.

I suppose a music critic needn’t understand much about databases, for example, but reporters covering government, business, college or professional sports, to name a few, should be able to assess their subjects the way that people inside those sectors do. And increasingly, that means understanding the use of data. Many local governments base their police staffing - who covers where - on a non-stop flow of crime data. Sports teams pore over tape, logging their opponents’ tendencies in preparation for upcoming games. Businesses are all about the numbers, too.

And then there’s politics. Winning elections these days is very often about putting together enough voters to crack 50%. There’s microtargeting based on consumer data and door-to-door canvassing so that volunteers can input demographic data into centralized servers. They’re not doing that just for fun - it’s valuable information. But if journalists can’t really grasp how organizations are using data, we’re liable to miss the effects, and thus miss some fuller explanations of events. Yes, we can rely on people to tell us what’s happening - and we should - but if data plays a big part in the life of an organization, the reporter covering it should have some basis to evaluate that role.

So how does that relate to Dave’s story about the quads? Well, after reading it, I noticed that there were some subtle bits of detail that I never would have thought to include or been able to describe as well - about how the NICU operates, the details of the births. That’s because Dave has been there with his twin boys. A single person, or a parent of a child born without complications, would have been hard-pressed to write as good a story. I sure wouldn’t have been able to do so.

It’s the same idea when it comes to understanding the basis for decisions that come, at least in part, from the collection and consumption of data. It can mean the difference between telling a story and telling a better story. I’m sure plenty of organizations that we cover would be happy to have reporters who are in the dark about these things. But that doesn’t help our readers any.

So, technology and data as a tool? Yes. But when the tools become a crucial part of the world we cover, understanding how they work and being able to use them makes us better journalists.

DjangoCon

July 20th, 2008  |  Published in django | Comments (0)

The first-ever DjangoCon will be held Sept. 6-7 at the Googleplex in Mountain View, Calif. The preliminary program looks incredible, and I’m sad to be missing it. My summer travels have been plenty and another West Coast trip, especially over a weekend, is a bit too much (there’s also the nagging point that I’d have to pay for it myself!). Matt Waite will be there, on a panel discussing Django in journalism, just one of the really strong sessions. If you’re a West Coast CAR person dabbling in frameworks, it’s worth checking out.

But I’m trying to do my part, beginning with a donation to the Django Software Foundation. Doing so will help pay for conferences like DjangoCon, sprints and other activities that help improve the framework, and it’s such a small thing to do considering the benefits I’ve realized from using Django. If you feel the same way, please think about supporting Django.

Caspio’s Lessons

June 29th, 2008  |  Published in Car Tools, Journalism | Comments (6)

It’s been a while since I wrote about Caspio, and since then they’ve only gained more media clients, which I suppose could be a lesson for me. But I think not. Rather, I hope what we’ll see in the months and years to come are the lessons that Matt Wynn offers from his experiences using Caspio. Here’s your nutgraf: “My conclusion on Caspio is that they do one thing very well. But other, cheaper alternatives do it just as well. Further, to learn to make it do otherwise seems pointless, especially seeing as we would be paying for the luxury of learning to hack it.” (The emphasis is mine.)

Caspio’s David Milliron spoke at this year’s Special Libraries Association conference at a panel organized by SLA’s News Division, which includes many newspaper and broadcast librarians. It’s easy to see why: a lot of these folks are being asked to do new things, to be more involved with their organization’s Web sites, and to do it with fewer people. Seems like a pretty good opportunity for Caspio, and I don’t fault them for recognizing that. The problem I have is that the promise of Caspio is in the short-term; no matter how many features they add (my personal favorite being the Data Sheet Find and Replace one: “You no longer need to export your table outside of Caspio Bridge for this type data modification.”), you’ll never get the flexibility and control over your apps that you do when you build your own stuff. Despite what Milliron says, there are very real and serious differences between Caspio and Web application frameworks.

Maybe that’s the real lesson that journalism folks need to heed: that the costs of learning Caspio go beyond the monthly fees and the potential cost of switching to another tool (having to re-do your existing apps). Caspio is, as Matt says, good at doing some pretty basic stuff when it comes to putting data online. But if you want to go beyond Ye Olde Data Ghettoe, you’ll have to learn some programming anyway. So why learn something that can only be used on a closed system that you have to rent? Matt’s alternative happens to be PHP/MySQL based, but he’s not going to be paying for using either of those. And if suddenly MySQL decides to charge corporate users or something equally far-fetched, he can switch to Postgres or SQLite without starting from scratch.
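To make the portability point concrete, here is a minimal sketch using Python’s built-in sqlite3 module. The table, column names and rows are all made up for illustration, and Matt’s own setup is PHP/MySQL, so treat this as an analogy rather than his code. The idea is that the query itself is plain SQL; moving to MySQL or Postgres would mostly mean swapping the driver (and, depending on the driver, the placeholder style), not rewriting the app.

```python
import sqlite3

# Plain SQL: nothing here is specific to SQLite except the "?" placeholder,
# which some MySQL and Postgres drivers spell "%s" instead.
QUERY = "SELECT name, salary FROM salaries WHERE agency = ? ORDER BY salary DESC"

# Hypothetical rows standing in for a public-records dataset.
SAMPLE_ROWS = [
    ("A. Example", "Parks", 41000),
    ("B. Example", "Parks", 52000),
    ("C. Example", "Police", 61000),
]

# An in-memory database keeps the sketch self-contained.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE salaries (name TEXT, agency TEXT, salary INTEGER)")
conn.executemany("INSERT INTO salaries VALUES (?, ?, ?)", SAMPLE_ROWS)

# Look up one agency's employees, highest salary first.
rows = conn.execute(QUERY, ("Parks",)).fetchall()
for name, salary in rows:
    print(name, salary)

conn.close()
```

What you learn writing that query goes with you to the next database, which is the part you can’t say about a rented, closed system.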

I realize many, many folks in newsrooms can say, “Um, pardon me, but we don’t have a Matt Wynn.” Or maybe you do, but he’s insanely busy all the time. That’s a very common situation. But the real long-term issue is this: if your organization is never going to want to do anything more than put up isolated search pages serving up content that no search engine can reliably find, you’re still gonna pay every month for that privilege by using Caspio. And if you hope and plan on doing more someday, even if that’s not today, then you’ll have almost nothing to transfer to that effort by using Caspio, since one of their chief claims is that you don’t have to learn any programming to use it.

So if learning more is a part of your plan, why not spend the time learning a system that doesn’t charge you for that time? By adding Caspio experience to your resume, what real skills have you gained aside from the ability to point and click?

The Future of News Libraries

June 19th, 2008  |  Published in Journalism, SLA | Comments (2)

At the recently completed SLA conference in Seattle, Nora Paul led a session on the “future of news libraries” that asked the attendees to imagine 2012, when librarians (or news researchers, or whatever you want to call them) are recognized as leaders of innovation in newsrooms, and then to explain how that came to pass. It was an ambitious and worthy session, and I’d like to see more of them among the News Division crowd. But to be honest, some of the answers worried me. I didn’t see the future unfolding the way we’d all like when I heard some of the responses to Nora’s questions.

The problem isn’t the people. I go to the SLA conference even though it has a tangential relationship (at best) to my current job because that’s where I find a group of really smart people that span nearly every field: news, law, government, engineering, technology, health, you name it. The session topics are eclectic, interesting and well-attended. People don’t wander off much.

But the problem isn’t the business climate for news, either. At least not totally. It’s a complex situation, in which a combination of factors keeps a lot of news librarians anxious about their jobs and their futures. There’s a whole new set of content to archive, dwindling staff resources to deal with and a main set of consumers - reporters and editors - who remain by and large too ill-informed about the best way to find and manage needed information.

So there’s a tendency to think that if news libraries continue to provide what newsrooms want, and do it well, things will be ok. Yes, duties will change, and people will adapt, but fundamentally it’s pretty much the same goal: finding information and turning it into knowledge.

Except that now we as news organizations are competing with, well, pretty much anybody who wants to be a guide to information, and on multiple fronts. And while most of these new competitors should also be our consumers, they are not bound by some of the ideas that have shaped how libraries have worked.

The first step, and the most important in my mind, is freeing ourselves from near-total dependence on vendors. Vendors will always have a role, as there are some non-core tasks that we should rightly hand to them, and some core resources that only they can provide. But I cannot imagine how news libraries will become engines of innovation if they do not control, or seek to control, the tools of their trade. They can no more outsource the future than the rest of the paper can, and it’s time to consider how news libraries can produce better tools that can lead to better products.

During a session that Jessica Baumgart and I had the privilege of speaking at on Monday, I got a question from a woman in the audience who wanted to know how it was a good thing for the industry if news librarians picked up programming skills and then got better, higher-paying jobs in IT. I answered that such an occurrence was the industry’s problem, but this is what I should have said: I encourage librarians to develop these skills precisely so that they will be better librarians, better researchers. So that they can better and more efficiently manage the ever-increasing flow of information. So that they can take control of their own futures in a way that they will not be able to do without those changes. Some will leave the news library; I did. But if we stay in the news business, we’ll still be interested in solving the problems of the newsroom. And we’ll be able to contribute in even more meaningful ways (and fail sometimes along the way).

I wrote in November 2005 about ways that news libraries could actually become innovation centers, but I didn’t go far enough. Newsrooms desperately need people who can generate and then execute new ideas - for improvements in the news process and for products, to name two areas. I heard at SLA about papers forming employee committees to discuss and propose new ideas, which is great, but what if the library was the place where you could prototype and even build out some of those ideas that involve the existing content of the paper? Those committees aren’t going to be permanent, and most news organizations can’t afford to hire an R&D department.

News librarians regularly solve problems that vex 30-year veterans of the newsroom, and they often are the source of last resort for the people who need good, accurate information. But until they can be less reactive and until they start developing their own tools, getting to Nora’s 2012 scenario will be tough.

So let’s get started. SLA has created an innovation section on its site, and it’s worth taking the time to explore. Or try this effort by a fellow librarian and developer, Daniel Chudnov of the Library of Congress. Try Python, or Ruby or another programming language. Think of a concrete task - maybe some repetitive work you have to do that you’d like to automate - and see if you can’t solve it. Maybe not the first time out, or the second, but failure is a part of innovation. Not trying is the only real failure.
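For a sense of scale, here is a minimal sketch in Python of the kind of small, repetitive task I mean: tallying how many stories ran in each section from an archive export. The CSV layout, column names and headlines are invented for illustration; your own export will look different, and in practice you’d read a file from disk rather than a string.

```python
import csv
import io
from collections import Counter

# Hypothetical archive export, inlined so the sketch is self-contained.
SAMPLE_EXPORT = """date,section,headline
2008-06-01,Metro,Council approves budget
2008-06-01,Sports,Home team wins opener
2008-06-02,Metro,School board delays vote
2008-06-03,Business,Local bank expands
"""


def count_by_section(csv_file):
    """Return a Counter of how many stories appear in each section."""
    reader = csv.DictReader(csv_file)
    return Counter(row["section"] for row in reader)


section_counts = count_by_section(io.StringIO(SAMPLE_EXPORT))

# Print the busiest sections first.
for section, total in section_counts.most_common():
    print(f"{section}: {total}")
```

It’s not much, but it replaces a count you’d otherwise do by hand, and the same dozen lines can be bent to fit the next export that lands on your desk.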

SLA Wrap-Up

June 18th, 2008  |  Published in SLA | Comments (1)

This year’s Special Libraries Association conference was, as usual, a great experience. Lots of good sessions from the News Division and other divisions. Some of my highlights, in dump mode:

  • A session on controlled vocabularies in art museums featuring folks from the Getty in Los Angeles. Turns out they have several datasets that might be interesting to check out.
  • Lots of emphasis on digital preservation and archiving, including a session on the MetaArchive Cooperative. Seems like a similar effort should be underway for smaller newspapers that may not be fully capable of providing a failsafe repository for their content. Maybe something that an institution like Missouri or Stanford (which develops the software used for this) could take the lead on.
  • A good intro on visualization techniques from Dennie Heye, and I hope that more such sessions are on the table for next year’s conference.
  • Things wrapped up with an interesting session on the future of news libraries, which deserves a lot more discussion and dissection from the News Division. I’ll have more to say on this later, rest assured.

Previously


Jun 11, 2008
Eight Years and Counting

by Derek | Read | 4 Comments

Eight years ago today I turned on the blog portion of this site. The changes have been many, from platforms (Blogger-GreyMatter-MovableType-Wordpress) to look (various styles borrowed from people actually competent at design) and content. I’ve tried to narrow the focus, but turns out my interests keep changing. So it goes. Thanks for coming along for [...]


Jun 9, 2008
The Choice(s)

by Derek | Read | 3 Comments

A couple of folks have asked me lately whether, having first worked with Django and now with Rails, I would recommend one over the other. I resisted the impulse because, first, I’m an expert in neither framework and second, because the answer really is: it depends. So, that about wraps it up, eh?
Actually, I think [...]

About The Scoop

Derek Willis’ weblog on investigative and computer-assisted reporting.

Contributors

  • Derek
  • Matt

  • ©2008 The Scoop