
XML is a great format for exchanging information because it balances readability, extensibility, and compatibility across heterogeneous environments. However, its flexibility is also a disadvantage because it is far too easy to create a proprietary XML schema, resulting in lots of custom code to interface with various systems. Lots of custom code leads to brittleness, and brittleness leads to frustration. The key to salvation lies in standardization.
Enter the Atom standard: a standards-track schema that defines a generic collection/item container format in XML. Most people think of Atom as an RSS competitor, which is true, but that only covers half of what it does. The Atom Publishing Protocol is a well-defined protocol for performing CRUD (Create, Read, Update, Delete) operations on items over HTTP. The Atom Syndication Format, which is the most commonly used portion, defines the XML schema used to deliver data during a Read operation. Atom was spearheaded by Sam Ruby, is now backed by people like Brad Fitzpatrick, Tim Bray, Jeremy Zawodny, and Mark Pilgrim, and is heavily implemented by Google.
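To make the CRUD-over-HTTP idea concrete, here is a minimal Python sketch of how the four operations usually map onto HTTP verbs in the Atom Publishing Protocol: POST to a collection to create, GET to read, PUT to a member URI to update, and DELETE to remove. The function name `atompub_request` and the `splunk.example` URIs are made up for illustration and are not Splunk's actual endpoints. The sketch only builds request objects and never sends anything.

```python
import urllib.request

# Media type the Atom Publishing Protocol uses for a single entry document.
ATOM_ENTRY_TYPE = "application/atom+xml;type=entry"


def atompub_request(operation, collection_uri, member_uri=None, entry_xml=None):
    """Build (but do not send) the HTTP request for one CRUD operation.

    Hypothetical helper: the URIs passed in are assumptions, not real endpoints.
    """
    headers = {"Content-Type": ATOM_ENTRY_TYPE}
    if operation == "create":
        # Create: POST a new entry to the collection URI.
        return urllib.request.Request(
            collection_uri, data=entry_xml.encode("utf-8"),
            headers=headers, method="POST")
    if operation == "read":
        # Read: GET either the whole collection (a feed) or a single member.
        return urllib.request.Request(member_uri or collection_uri, method="GET")
    if operation in ("update", "delete"):
        if member_uri is None:
            raise ValueError(f"{operation} needs the URI of an existing member")
        if operation == "update":
            # Update: PUT the revised entry to the member's own URI.
            return urllib.request.Request(
                member_uri, data=entry_xml.encode("utf-8"),
                headers=headers, method="PUT")
        # Delete: DELETE the member's URI.
        return urllib.request.Request(member_uri, method="DELETE")
    raise ValueError(f"unknown operation: {operation}")
```

Passing the returned object to `urllib.request.urlopen` would perform the call; authentication is deliberately left out of the sketch.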
Like most software systems, the majority of Splunk’s internal entities can be loosely viewed as collections of similar items. Requested searches, configuration information, saved searches, users, roles: all just collections. So instead of creating a separate XML schema for each of these five collections that perfectly describes its contents, I chose Atom to serve as a single generic container to describe all of the entities. This kind of reuse is echoed by Pat Helland of Amazon, who gives a great talk relating the rise of the industrial age to standardization, and Tim Bray (Mr. XML himself), who advocates against creating your own XML unless absolutely necessary.
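As a rough illustration of the "one generic container" idea, the Python sketch below wraps any named collection in the same Atom feed structure using only the standard library. It is not Splunk's implementation: the function `build_feed`, the `splunk.example` URIs, and the sample items are all assumptions made up for the example.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone

ATOM_NS = "http://www.w3.org/2005/Atom"
# Serialize Atom elements in the default namespace (no prefix).
ET.register_namespace("", ATOM_NS)


def _atom(tag):
    """Return the namespace-qualified name ElementTree expects."""
    return f"{{{ATOM_NS}}}{tag}"


def build_feed(collection_name, base_uri, items):
    """Wrap (name, summary) pairs in an Atom feed; hypothetical helper."""
    now = datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    feed = ET.Element(_atom("feed"))
    # Atom requires title, id, and updated on the feed and on every entry.
    ET.SubElement(feed, _atom("title")).text = collection_name
    ET.SubElement(feed, _atom("id")).text = base_uri
    ET.SubElement(feed, _atom("updated")).text = now
    for name, summary in items:
        entry = ET.SubElement(feed, _atom("entry"))
        ET.SubElement(entry, _atom("title")).text = name
        ET.SubElement(entry, _atom("id")).text = f"{base_uri}/{name}"
        ET.SubElement(entry, _atom("updated")).text = now
        ET.SubElement(entry, _atom("summary")).text = summary
    return ET.tostring(feed, encoding="unicode")


# The same function serves users, roles, saved searches, and so on.
users_xml = build_feed("users", "https://splunk.example/users",
                       [("admin", "Administrator account"),
                        ("alice", "Regular user")])
roles_xml = build_feed("roles", "https://splunk.example/roles",
                       [("power", "Can edit saved searches")])
```

The point of the design is that a client only has to understand one feed/entry shape, no matter which collection it asks for.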
The benefit of sticking to a standard is that there is a much greater chance that external developers already know exactly how to consume your data with very little work. Not only are language-level Atom parsers available everywhere, but entire applications have been specifically built to consume Atom. For instance, here’s a screenshot of the NewsFire feed reader displaying all of the searches that exist on my local Splunk server:
All I had to do was supply NewsFire with a URI and login, and it took care of the rest. No XSLT, XPath, or custom DOM iteration necessary; it just works. As far as I know, Splunk is one of only a handful of enterprise companies that have integrated Atom at such a core level. Hopefully, for you it means that there is one less bucket of tag soup you have to deal with, and one better product that you enjoy using.
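For developers who would rather consume the data in code than in a feed reader, here is a small Python sketch that reads an Atom feed with nothing but the standard library. The sample feed is invented for illustration (the search names and `splunk.example` URIs are assumptions, not real Splunk output), and `list_entries` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Prefix-to-namespace map used when searching the parsed tree.
ATOM = {"atom": "http://www.w3.org/2005/Atom"}

# Made-up feed standing in for what a server might return.
SAMPLE_FEED = """
<feed xmlns="http://www.w3.org/2005/Atom">
  <title>searches</title>
  <id>https://splunk.example/searches</id>
  <updated>2008-01-01T00:00:00Z</updated>
  <entry>
    <title>errors last 24h</title>
    <id>https://splunk.example/searches/1</id>
    <updated>2008-01-01T00:00:00Z</updated>
  </entry>
  <entry>
    <title>failed logins</title>
    <id>https://splunk.example/searches/2</id>
    <updated>2008-01-01T00:00:00Z</updated>
  </entry>
</feed>
"""


def list_entries(feed_xml):
    """Return (title, id) for every entry in an Atom feed document."""
    root = ET.fromstring(feed_xml)
    return [
        (entry.findtext("atom:title", namespaces=ATOM),
         entry.findtext("atom:id", namespaces=ATOM))
        for entry in root.findall("atom:entry", ATOM)
    ]


for title, entry_id in list_entries(SAMPLE_FEED):
    print(title, entry_id)
```

Because every collection uses the same container, the same few lines would work for users, roles, or saved searches as well as for searches.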