
(This post was written by Dan Woods, CTO and Editor of CITOResearch.com.)
Last week, while at Splunk .conf 2011, I ran a research experiment and asked conference attendees to explain Splunk in one sentence. (See “Explaining Splunk in One Sentence.”)
I did my experiment on the first day of the conference before hearing the vision for the product at the keynote sessions. The question I will answer in this blog is: “What are people likely to say Splunk is in 2012 after the company has spent 12 months executing toward its vision?”
To set the stage for my predictions, I would like to explain what I learned about the challenging task of explaining Splunk during my three days at Splunk .conf 2011.
The first problem that Splunk has, and has always had, in explaining itself is the ambition of its founders. Rob Das and Erik Swan set out not to build a product but to build a platform, and platforms are hard to explain because they do so many things. The most common one-sentence explanation of Splunk, “Google for IT data,” gets one aspect of the platform across but leaves many more out, especially the uses of Splunk outside of IT for operational intelligence purposes. What about capturing best practices and other knowledge for operational analysis and support in an application for enterprise security or PCI compliance? What about being able to quickly boil down massive amounts of data into a summary index that can support further analysis using data visualization tools like Tableau, QlikView, or TIBCO Spotfire, or business intelligence tools like Pentaho, Jaspersoft, SAP BusinessObjects, Oracle Hyperion, or IBM Cognos? What about replacing network monitoring tools with Splunk? What about replacing Google App Engine with Splunk and Amazon Web Services? What about arming a sales force with tools powered by Splunk that can prove the operational quality of your product? The term “Google for IT data” is the perfect start, but it cannot contain all the value that Splunk is starting to create.
One of Splunk’s partners from Amazon put it this way: “If Splunk had chosen one of the problems it could solve and made a business out of that, they would have been bought up already.” But noooooooooo. Rob and Erik did not want to solve just one problem, but dozens, and that is why explaining Splunk is hard: it is a powerful and multifaceted platform.
The second problem in explaining Splunk, one that Godfrey Sullivan, Splunk’s CEO, brought to my attention, is that the nature and potential value of machine data is challenging to get across. He’s right. The words “machine data” are like a shibboleth, separating the insiders from the uninitiated.
I’ve been thinking about the most practical way of explaining the value of machine data, and I think I’ve hit on a good, simple example. Machine data usually tracks the tiny movements that can reveal larger patterns. One of the easiest sources of machine data to understand is the call detail records kept by phone companies. We could create a very accurate picture of a person’s actual social network by building a graph of who they called and who called them. If we added detail records for texting, IM, and email, the social network would become even more accurate. Of course, we would be talking about a privacy nightmare if this example ever came to life without our consent, but that’s not the point. Machine data is about taking a micro level of detail and using it to tell a bigger story.
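To make the call detail record example concrete, here is a minimal Python sketch of how a handful of records could be turned into a social graph. It is an illustration only, not anything Splunk ships: the record layout (caller, callee, duration), the sample names, and the functions build_social_graph and closest_contacts are all assumptions invented for this example.

```python
from collections import defaultdict

# Hypothetical call detail records: (caller, callee, duration_seconds).
# Real records carry many more fields; this layout is an assumption.
CALL_RECORDS = [
    ("alice", "bob", 120),
    ("bob", "alice", 45),
    ("alice", "carol", 300),
    ("dave", "alice", 15),
    ("bob", "carol", 60),
]


def build_social_graph(records):
    """Return {person: {contact: call_count}}, counting calls in both directions."""
    graph = defaultdict(lambda: defaultdict(int))
    for caller, callee, _duration in records:
        graph[caller][callee] += 1
        graph[callee][caller] += 1
    return {person: dict(contacts) for person, contacts in graph.items()}


def closest_contacts(graph, person):
    """List a person's contacts, most frequent first, ties broken by name."""
    contacts = graph.get(person, {})
    return sorted(contacts, key=lambda name: (-contacts[name], name))


graph = build_social_graph(CALL_RECORDS)
print(closest_contacts(graph, "alice"))
```

Each record on its own says almost nothing. The story only appears once the tiny details are aggregated, which is the point about machine data.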
The third problem in explaining Splunk is the fact that almost any cool usage of Splunk emerges out of a combination of three things:
- An understanding of the relationships expressed in the data being analyzed.
- An understanding of the core indexing and search pipeline used by Splunk.
- An understanding of how the Splunk search language controls what happens in the Splunk data pipeline.
In other words, to describe a nifty thing Splunk can do, there is a lot of context that must be explained, much of it complex, some of it subtle.
I compare using Splunk to playing chess. There is an opening, a middle game, and an end game. In Splunk, like in chess, there are endless possibilities in each phase of the game. Like chess, there are specific skills to master to rise in competence. Like chess, there are many different ways to achieve your goals and win the game.
The opening of Splunk is the indexing of the data and the identification of certain basic fields such as the time stamp, the source, the source type, and the host. For well-understood source types, many more fields are identified. Once the data has been indexed, the middle game begins, in which the data is transformed, cleaned, summarized, correlated, enhanced, and so on. The Splunk Search Language controls this stage. (I am helping David Carasso, chief mind of Splunk, write a book about this; you can find out more in this post: Explaining Splunk through Recipes.) The end game is the creation of the result, which is usually a visualization such as a report or a graphic, or it could be some other form of summary such as a lookup table, a summary index, or some other special-purpose form of output.
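The three phases can be sketched in plain Python as an analogy. This is not the Splunk Search Language and not how Splunk is implemented. The log format, the source and source type names, and the functions index_line, errors_by_host, and render_report are hypothetical, chosen only to show an opening (extract basic fields), a middle game (filter and summarize), and an end game (produce a report).

```python
from collections import Counter
from datetime import datetime

# Hypothetical raw web log lines: timestamp, host, method, path, status.
RAW_LINES = [
    "2011-08-15T10:00:01 web01 GET /cart 200",
    "2011-08-15T10:00:02 web02 GET /checkout 500",
    "2011-08-15T10:00:05 web01 POST /checkout 500",
    "2011-08-15T10:00:09 web01 GET /home 200",
]


def index_line(line, source, sourcetype):
    """Opening: turn a raw line into an event with basic fields."""
    timestamp, host, method, path, status = line.split()
    return {
        "time": datetime.fromisoformat(timestamp),
        "host": host,
        "source": source,
        "sourcetype": sourcetype,
        "method": method,
        "path": path,
        "status": int(status),
    }


def errors_by_host(events):
    """Middle game: keep server errors and count them per host."""
    return Counter(e["host"] for e in events if e["status"] >= 500)


def render_report(counts):
    """End game: produce a small text report, one line per host."""
    return [f"{host}: {count}" for host, count in sorted(counts.items())]


events = [index_line(line, "access.log", "example_access") for line in RAW_LINES]
report = render_report(errors_by_host(events))
print("\n".join(report))
```

Even in this toy version, swapping the middle function for a different filter or grouping gives a different answer from the same indexed events, which hints at why the space of possible moves is so large.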
The problem is that in playing chess or using Splunk it is a sequence of moves that gets you to where you want to go. Given that the number of potential moves is vast, explaining what is possible is difficult. That is one reason we are focusing on recipes in the book currently underway.
The good news for the Splunk community and for future users of Splunk is that Splunk’s vision is to focus more precisely on ways Splunk will create value for specific purposes and use cases. The number of Splunk applications will rise dramatically. Splunk will get easier to use. Splunk users will make use of Splunk Storm, a special version designed for use in the cloud that makes it easy to monitor and analyze cloud-based IT assets.
So, if Splunk keeps executing as it has, I suspect that the one-sentence explanations I harvest next year may include the following:
“Splunk is the way I manage and optimize Microsoft Exchange.” (or F5, or Cisco Iron Port, or VMware, or Double-Take, or many other products.)
“Splunk is my platform for operational intelligence applications.”
“Splunk is a better way to implement a star schema to summarize data for reporting.”
“Splunk is how I run my cloud-based applications.”
“Splunk is the easiest way to evaluate the potential value of large data sets.”
“Splunk gives me a real time view of the marketing impact of promotions on my web site.”
“Splunk lets me get value out of big data at a fraction of the cost of massive business intelligence suites.”
“Splunk is the fastest way to synthesize and make sense of data from many diverse sources.”
“Splunk is the glue that connects all our information assets.”
“Splunk is my solution for ad hoc operational dashboards.”
“Splunk cuts development of advanced real time analytic dashboards from months to weeks or days.”
“Splunk is my one console for monitoring and optimization of on-premise, private cloud, and public cloud IT assets.”
“Splunk replaced 7 purpose-built IT products and provided more capability than all of them combined.”
“Splunk makes getting value from big data as cheap and easy as possible.”
Some people may regard Splunk in some of these ways already. I suspect at Splunk .conf 2012, people will describe Splunk in these ways and many more as well.
Please send along your predictions for how Splunk will be described to dwoods@CITOResearch.com