While catching up on The Long Tail, Chris Anderson’s blog exploring his thesis about the impact of digital distribution on mass media products, I realized that most IT people take what Chris calls a blockbuster attitude when deciding which log sources to centralize.
(The basic long tail idea: in the past, when the cost of distributing each movie title or album was fairly high, only the most popular items could be profitable. With digital distribution, the aggregate profit from all the niche stuff can suddenly exceed that of the hits. You profit by having everything anyone might want in stock.)
When it comes to building some sort of central log host, sysadmins focus on getting the most-used logs in – syslog, Apache, and maybe appservers. Within those logs, they take the same blockbuster attitude to deciding which messages to classify, largely because of the overhead of manually interpreting each message format.
Chris’ point about the long tail holds here, too. There can be more value, in aggregate, in the obscure logs and uncommon events. It’s the rare things, the messages you didn’t expect or that never happened before, that may be the best clue to an operations problem.
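To make the point concrete, here is a minimal sketch of surfacing the long tail of a log stream. Everything here is illustrative, not how Splunk works: it collapses each line into a rough template (replacing digit runs with a placeholder) and then lists the least-common templates first, since a message seen once is often more interesting than one seen a thousand times.

```python
import re
from collections import Counter

def template(line: str) -> str:
    # Collapse digit runs (IPs, ports, sector numbers, PIDs) so that
    # variants of the same message share one template. This is a crude,
    # hypothetical heuristic for illustration only.
    return re.sub(r"\d+", "<num>", line.strip())

def rarest(lines, n=3):
    # Count templates and return the n least common: the "long tail".
    counts = Counter(template(line) for line in lines)
    return sorted(counts.items(), key=lambda kv: kv[1])[:n]

# Hypothetical sample log lines:
logs = [
    "connection from 10.0.0.5 port 22",
    "connection from 10.0.0.6 port 22",
    "connection from 10.0.0.7 port 22",
    "disk error on /dev/sda sector 88123",
]

for tmpl, count in rarest(logs):
    print(count, tmpl)
```

On the sample above, the one-off disk error sorts to the top while the three routine connection lines collapse into a single common template – exactly the inversion of the blockbuster attitude.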
Splunk is to log parsing what digital distribution is to mass media. It drives the marginal cost of making sense of another log message or format to nil, so that the huge aggregate value of the rare stuff can be realized.
It ties into what we learned at Interop New York in December. We should have turned logging way up and dumped every event possible into Splunk. But just wait for Interop Vegas late this spring!
(Sorry, the econ TA in me took over for this post 😉)