On Trials, Software and Otherwise
September 12th, 2007 | Published in Car Tools, Data | 9 Comments
So in response to several commenters on my previous post, I went to caspio.com to see about a free 14-day trial in order to test things out. Then I read the Terms of Service, which contain this sentence: “In addition, you may not access the Service for purposes of monitoring its availability, performance or functionality, or for any other benchmarking or competitive purposes.”
So I’m not sure whether I can get that trial, as I have no intention of becoming a customer. I mean, I’m pretty sure I could, given the kind invitation by David Milliron, but the ToS seem to indicate otherwise. Anyway, there are two larger points that deserve more treatment than I gave them in the first post.
The first is that part of the appeal of a service like Caspio is, as several folks have pointed out, the ability to bypass a reluctant or otherwise clueless IT department. This is not insignificant, and in this sense Caspio is providing a way out for some organizations that might otherwise have little or no option when it comes to publishing data online. That is, compared to the alternative, not a bad thing. The potentially troubling aspect is that, having avoided a fight with IT over server access and other issues, newsroom managers may be content to leave the situation at that. Some of the folks who had kind words to say about Caspio also said that it was not a complete solution; hopefully their managers see it that way. I don’t blame people for using Caspio or any other product if they have limited options. It’s the limited options situation that’s a big problem for many news organizations.
The second point, however, is all about a Caspio limitation, one that I mentioned in the comments. For a Web database app, Caspio has a significant problem: many of its DataPages cannot be found using a search engine. So while an Indianapolis Star database of school suspensions has a page for the Carroll Jr-Sr High School, a Google search for “‘Carroll Jr-Sr High School’ suspension” doesn’t find any pages on IndyStar.com in its results. Now maybe you’ve got the kind of audience loyalty that means you don’t have to worry about search engines; if so, count yourself extremely lucky. (The Indianapolis example, btw, is not the only one where that happens, so this isn’t to bash the folks there.)
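A minimal sketch of why a page can exist and still be invisible to search, assuming the crawler reads only the raw HTML and does not run JavaScript, and assuming the data is pulled in by a script embed rather than written into the page. Both sample pages, the embed URL and the helper names are hypothetical; this is an illustration, not a description of how Caspio actually delivers its DataPages.

```python
from html.parser import HTMLParser


class VisibleText(HTMLParser):
    """Collect the text a crawler that does not run JavaScript could read."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0  # > 0 while inside <script> or <style>
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text that sits outside script/style blocks.
        if not self._skip_depth and data.strip():
            self.chunks.append(data.strip())


def crawlable_text(html):
    """Return the plain text found in the raw HTML, joined by spaces."""
    parser = VisibleText()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


# Hypothetical page where the server writes the record into the HTML.
SERVER_RENDERED = """
<html><body>
<h1>School suspensions</h1>
<p>Carroll Jr-Sr High School</p>
</body></html>
"""

# Hypothetical page where a script would fetch and insert the record later.
SCRIPT_EMBEDDED = """
<html><body>
<h1>School suspensions</h1>
<script src="https://example.com/hypothetical-embed.js"></script>
</body></html>
"""

if __name__ == "__main__":
    print(crawlable_text(SERVER_RENDERED))
    print(crawlable_text(SCRIPT_EMBEDDED))
```

Under those assumptions, the school's name is part of the text extracted from the first page and absent from the second, so a search for it has nothing to match on the second page.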
Some of the comments seemed to suggest that other solutions were only within the reach of larger news organizations. If that were true, then the folks at the Lawrence Journal-World, where Django originated, would be surprised to hear it. Other newspapers have built their own tools, too.
Every framework is going to have its problems or limitations; this isn’t to suggest that there is a holy grail. What it is meant to suggest is that when you promise ease of use and a point-and-click Web database publishing experience, there usually are some tradeoffs. Whether those tradeoffs are worth the experience is up to the users to decide, but it’s to the better that people know there are real options out there, and many of them aren’t beyond our capabilities.
September 13th, 2007 at 12:46 am (#)
This is great stuff, Derek. I started to write a comment here riffing on your exploration of Caspio, but it kinda blew up into something large so I posted it over on my blog where it can have more breathing room:
http://www.jacobian.org/writing/2007/sep/12/db-journalism/
September 13th, 2007 at 12:13 pm (#)
Search engines have trouble with dynamic data and widget content, but with the accelerating popularity of both, they have no choice but to find ways to index such content.
September 13th, 2007 at 1:13 pm (#)
Sam,
I have to disagree with you there. We use lots of dynamic data in several apps at washingtonpost.com, and Google and other search engines index them just fine. It’s not the dynamic nature of data per se, it’s how you handle it within a Web page that matters. I’m not saying that search engines won’t eventually learn how to deal with content loaded from JavaScript calls, but right now those kinds of pages don’t seem to be getting indexed.
September 14th, 2007 at 8:59 am (#)
I actually think Sam is right about this… I think it depends on how you actually post your dynamic data. The Post’s data isn’t really dynamic. It’s cached up the ying-yang, so search engines actually have something to crawl.
For example, do a Google advanced search on washingtonpost.com for “Payne Memorial AME Church.” You’ll find the John Edwards event. But if you do a search for “wallsmith” filtering just on “site: nytimes.com,” you will not find his entry in our Flash graphic, Faces of the Dead.
If you take advantage of client-side technologies (AJAX, Flash), I do think search engines have more trouble. This is the complaint I have heard from our online guys, and that might explain why Caspio sites don’t rank higher in Google. That’s just a guess though, because I have no idea how they do data. I will also admit right up front (or almost up front) that I am no expert in this regard, and may be 100 percent wrong.
OK, now to my real point:
To add to what Derek said, there’s a bigger downside I see to outsourcing your database-driven content (for those who have the choice): You’re also outsourcing the process of discovery as well.
I remember when I learned Arc, suddenly I was thinking about data in an entirely new way. Stories occurred to me that never would have otherwise. Same deal with Perl. Once I learned to script, and then scrape the web, all kinds of projects occurred to me that I couldn’t have imagined before.
Sarah always says, “Give someone a hammer, the world looks like nothing but nails.” (Or something like that…) By outsourcing, you’re giving Caspio the hammer. Personally? I want that hammer.
So, we can talk about whether you could have done PolitiFact in Caspio, and whether it would have cost the organization more or less money. But that’s missing the point. The point is, without learning Python and Django, would PolitiFact even have occurred to Matt? Maybe, maybe not.
September 16th, 2007 at 11:05 am (#)
I’m pretty platform agnostic. I’ve been doing online databases for years in ASP, but when people would ask me at CAR conferences what to use to get started, I would always tell them: Use whatever is most practical for your situation (based on how much time they had, how much skill they had, what their IT departments were willing to tolerate, etc.)
Moreover, I speak from experience of having tried Caspio.
I have nothing against Caspio. But I am strongly against outsourcing.
Just as staff-produced content distinguishes your news organization from wire content, you can do much, much, much more with online data DIY than using a template tool.
A template tool allows you to provide simple look-ups. But your users will expect more — they’ll want to rank, sort, link results to other databases, etc.
In other words, if you know how to do it yourself, you can bring a degree of intelligence to the project that you just can’t get with simple look-ups built by a template.
So to me, if online databases are worth doing — and they are, and we’re competing with non-journalism sites already in this field — news organizations need to commit to a database platform and commit to training somebody who can build a bridge between databases and all the other content.
September 28th, 2007 at 5:38 pm (#)
Hear, hear. Preaching to the choir, perhaps, but the real solution is simple: Hire developers. Or pay to turn one of your geek-minded reporters into one.
What? We can’t do that in an era of diminishing resources?
Hmm. When TV got big in the ’60s, we hired TV critics. When business news became important in the ’80s, we ponied up for larger (and better trained) business staffs.
No insults intended to any past or current colleagues - but the utility of most local TV, book and movie critics is gone (unless they’re named “Ebert.”) Travel editors? “National” sports-beat writers who don’t cover the local team anymore? Same story, however sad it may be.
Yes, diminished resources are real - but there are still many, many jobs in our newsrooms whose original purpose has been overwhelmed by the digital revolution.
Bosses, that’s where you get the FTEs to hire your coders.
October 17th, 2007 at 1:55 am (#)
As someone who’s still pretty new to database development, I have to agree with Derek’s observation that one of the most valuable aspects of building database applications in-house is the learning process.
I used both a trial version of Caspio, and my own custom database and interface, for my still-under-development Ultimate Sacrifice database gallery. Although Caspio allowed me to quickly publish something online, it did not allow me to really learn anything more about database programming. Yes, it’s very time-consuming to build my own homegrown application (because I’m still so new at it), but I am gaining such a great understanding of database and web development — which will make future projects go more quickly.