Peter Dawson posted a comment on my previous post noting that Digg has documentation for a pending API.

The Digg API (at the time this article was written) has not yet been released. This API allows developers to tap into the core power of Digg.com and integrate their own services with it.
This information is provided without warranty. That means, if it doesn’t work, don’t email me about it.

I hope they’re still open to feedback at this point because I have some concerns (which I’m sure Dave will mirror).

Specifically, why didn’t they just use RSS via REST? They’re using their own timestamp format (granted, it’s a de facto Unix timestamp, but still). What timezone, guys? GMT? UTC? PST? EST? The timezone the server just happens to be running in?
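To make the timezone question concrete, here is a minimal sketch of how such a value would be read if it really is a Unix timestamp, i.e. seconds counted from 1970-01-01 00:00:00 UTC. The sample value is the `min_date` from an activity URL quoted in the comments below. The helper name `unix_to_utc` is made up for illustration, and whether Digg's values follow this convention is an assumption.

```python
from datetime import datetime, timezone


def unix_to_utc(ts):
    """Interpret ts as seconds since the Unix epoch and return an aware UTC datetime."""
    return datetime.fromtimestamp(ts, tz=timezone.utc)


# Sample value taken from the min_date parameter of the activity URL
# quoted in the comments; assumed here to be a plain Unix timestamp.
sample = 1153803600
print(unix_to_utc(sample).isoformat())  # 2006-07-25T05:00:00+00:00
```

Passing `tz=timezone.utc` explicitly matters: without it, `fromtimestamp` converts to the local timezone of whatever machine runs the code, which is exactly the ambiguity being complained about.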

They’re basically reinventing RSS, which just shouldn’t be done. The problem isn’t that Digg can’t do a good job inventing a new format; it’s that people already understand RSS. There are already tools that parse RSS, and there are cool tools like the RSS validator which don’t work for Digg’s API (but would if it used RSS).

Granted, Digg might want to add a few proprietary extensions, like the total number of diggs, but this can be accomplished by adding another namespace to RSS. Apple did this with iTunes and Feedburner does it with a few of their extensions.
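As a rough sketch of what that could look like, here is an RSS 2.0 item carrying one extra namespaced element, read with Python's standard `xml.etree.ElementTree`. The namespace URI, the `digg:diggCount` element name, and the sample feed are all invented for illustration; none of them come from Digg.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace URI, made up for this example only.
DIGG_NS = "http://example.com/ns/digg"

# A made-up RSS 2.0 feed with one extension element in that namespace.
SAMPLE_FEED = """<rss version="2.0" xmlns:digg="http://example.com/ns/digg">
  <channel>
    <title>Example stories</title>
    <link>http://example.com/</link>
    <description>Sample feed with one extension element</description>
    <item>
      <title>Sample story</title>
      <link>http://example.com/story/1</link>
      <pubDate>Tue, 25 Jul 2006 05:00:00 GMT</pubDate>
      <digg:diggCount>42</digg:diggCount>
    </item>
  </channel>
</rss>
"""


def read_items(feed_xml):
    """Return title, link and the (hypothetical) digg count for each item."""
    root = ET.fromstring(feed_xml)
    items = []
    for item in root.iter("item"):
        # Namespaced elements are addressed as "{namespace-uri}localname".
        count = item.findtext("{%s}diggCount" % DIGG_NS)
        items.append({
            "title": item.findtext("title"),
            "link": item.findtext("link"),
            "diggs": int(count) if count is not None else None,
        })
    return items


for story in read_items(SAMPLE_FEED):
    print(story["title"], story["diggs"])
```

A reader that knows nothing about the extra namespace still gets the title, link, and date from the standard elements, and one that does know about it can pick up the additional field.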

Update:

Want to see a good RSS API implementation? Check out Tailrank. Every Tailrank page has an associated RSS feed. Need to build an app on top of our search? Just make a REST call to the correct RSS feed. Here’s a Tailrank search for ‘Linux’ via RSS.


  1. A thought-provoking post, thanks!

    As you know, we have not documented a digg API or the XML format it returns. It certainly will change. Whether it’s practical to use RSS with namespaces for everything, I don’t know. But I’ll sure consider it.

    Personally, I’d like to make sure we also support XML-RPC and SOAP, as well as JSON via REST, to enable the widest range of developers. But I’m just a lowly programmer, so I can’t make any promises.

  2. Unix timestamps are by definition UTC, which is what makes them a sane choice in a world gone mad.

    Regarding RSS, I think you’re wrong. This API definitively does not reinvent RSS – yes, there are stories in there. Yes, they are sometimes sorted by date. But that’s just one part of a larger picture, and cramming it into a thin veneer over RSS would just increase the overall verbosity of a response, with no tangible benefit I can see. Why are the semantics of RSS appropriate here? Namespaces are a lot of complexity overhead just to get validation compliance with an unrelated tool.

    Regarding REST, I’m totally in favor, but I can’t see anything unRESTful in this API.

  3. Michal.

    Nearly the entire industry would disagree with you. People have standardized on RSS because if every site like Digg were to invent their own format the entire world would be a lot less sane (since you’re a fan of sanity).

    The Digg format here is nearly a one-to-one mapping with RSS. Even with namespaces, the file size would probably grow by less than 10%.

    There are parsers which ARE namespace aware and can easily expose the information to callers.

  4. Meh, I dunno. The Flickr API is a really nice piece of work, and it’s not RSS either. What about XOXO or Atom? Help me out, post some links to the entire industry so I can understand why plain old XML isn’t a good enough format for API responses.

  5. I haven’t fully evaluated the Flickr API…. I can run over and take a look.

    XOXO or Atom might work of course…. the issue I have is that the format is SOOOOO close to RSS and would really only require changing the names of some elements and POOF you’d have RSS!

    :)

    There was a similar controversy a few years back where news.com released their own RSS feed format. It was the same situation where it was an XML feed with non-standard date formats and so forth.

    They eventually adopted RSS…

    I can’t find the link though :-/ …

  6. Michal,

    If Digg adopts RSS, they’ll immediately have countless tools and services that can distribute/present Digg content.

    If they don’t, they’ll maybe have a handful in a month, most if not all of which will be wallowing in experimental status.

    Frankly, I think this is a great self-sacrificing opportunity for Digg to show the rest of the world why it’s not a good idea to choose a proprietary format over RSS.

  7. The utility of standard APIs is just so high that developers *should* look for appropriate standards before they do just about anything on the network if it involves a public interface. In this case, it looks like Digg should either settle on RSS or Atom (Atom is preferred) and use the defined extension mechanisms in order to add whatever service specific stuff they might have. Network services are best when they fit well within the ecosystem of existing and projected applications. Services that refuse to follow or respect standards should be condemned by those who care about the quality of the network.

    bob wyman

  8. Kevin, Don, Bob – you all make good points. I’m still not entirely convinced that everything “list-like” needs to be shoehorned into RSS or Atom. Digg has provided RSS feeds pretty much since it launched, but an API is something else. Clarity of purpose and internal consistency ought to win out over wire-format advocacy, and there’s more there now than just delivering lists of stories. If you watch what goes over the wire from the Labs applications you’ll find responses that are a poor fit for RSS, like aggregated activity (e.g. http://services.digg.com/story/488562/activity?min_date=1153803600&max_date=1153976400&period=hour).

    That said, there’s clearly a lot of format overlap: titles, descriptions, links would be the main ones, good candidates for RSS or Atom. I think the number of extensions necessary (diggs, comments, two kinds of links, containers, and topics) would make for a real namespace mess. My own experience with parsers for syndication formats has been heavily influenced by Python’s Universal Feed Parser, which exposes only namespaces from a predefined list, hardly appropriate for parsing RSS+Digg. PHP’s Magpie does a little better, but it still does what most feed consumers want: extract a common logical structure out of a variety of input formats. These kinds of social expectations around syndication formats are just not a good match for the range of data Digg is publishing.

  9. The way it is right now, if you wanted to use Digg’s API with Python you’d have to write your own parser. If it were in RSS you could use Universal Feed Parser.

    If it were to use proprietary namespaces you’d just be able to extend UFP instead of having to write your own parser from scratch.

    Kevin

  10. I agree entirely that RSS would be preferable to the current format (assuming care was taken over identifiers, escaping etc). But it’s worth remembering that RSS only covers GETs.

    Presumably at some point it will be desirable to send info off to Digg. Going the RSS route this would effectively mean a shift to a whole different stack (either XML-RPC, Atom Protocol or something home-made).

    I think there’s a strong case for using Atom Format/Publishing Protocol. In the immediate term the format is supported everywhere RSS is. Longer term the APP offers a cleanly extensible approach without reengineering. Check GData.
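The activity URL quoted in comment 8 gives a feel for the kind of request being debated in this thread. The sketch below only takes that URL apart with the standard library; it does not call the service. That `period=hour` means one aggregate per hour between `min_date` and `max_date` is an assumption, and the `PERIOD_SECONDS` table and function name are hypothetical.

```python
from urllib.parse import urlparse, parse_qs

# URL copied from comment 8 above.
ACTIVITY_URL = (
    "http://services.digg.com/story/488562/activity"
    "?min_date=1153803600&max_date=1153976400&period=hour"
)

# Assumed meaning of the period parameter; only "hour" appears in the URL.
PERIOD_SECONDS = {"hour": 3600}


def describe_activity_request(url):
    """Split the URL into its path and query, and count the implied time buckets."""
    parts = urlparse(url)
    query = {key: values[0] for key, values in parse_qs(parts.query).items()}
    start = int(query["min_date"])
    end = int(query["max_date"])
    step = PERIOD_SECONDS[query["period"]]
    return {
        "path": parts.path,
        "span_seconds": end - start,
        "buckets": (end - start) // step,
    }


info = describe_activity_request(ACTIVITY_URL)
print(info["path"], info["buckets"])  # /story/488562/activity 48
```

Under that assumption the response would be 48 hourly aggregates for a single story rather than a list of stories, which is the sort of payload comment 8 describes as a poor fit for RSS.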

  1. Classyfeeds

    REST and RSS are all we need for APIs

    The title makes me drowsy.
    Kevin Burton of Tailrank has some good advice for Digg:
    They’re basically reinventing RSS which just shouldn’t be done. The problem isn’t that Digg can’t do a good job inventing a new format – people …