Chaos & Insanity


Last week Splunk sponsored ComputerWorld’s Infrastructure World conference along with HP and IBM. I needed to come up with a talk and I wanted to do something new.

I’ve been thinking about how to describe the challenges we have managing all this changing technology and innovation. Note that this is seriously a work in progress. I’m developing a theory that there are three fundamental drivers of data center chaos:

  • expectations,
  • complexity and
  • accountability

Any new business or consumer technology that becomes successful is quickly met with significant expectations. Our dependence on everything from wireless email and online travel reservation systems to hosted software as a service dramatically increases the expectation that these technologies will always be available, be fast and do everything we want. Examples of failed expectations are everywhere. Here are a few. On June 20th, United Airlines canceled 24 flights and delayed another 286 due to a “computer gremlin.” Research in Motion recently experienced yet another 24-hour email outage that left more than 2.5M users in North America without service. Even the pioneers of Software as a Service (SaaS), supposedly a more reliable alternative to running it yourself, continue to have outages as well.

Rising expectations, success and dependency force increased complexity in both scope and scale to meet demand. Scope complexity abounds as more and more features and capabilities are added to the services we depend on. I used the example of Citigroup’s internal SOA architecture, which has five federated ESBs, one of every technology flavor. Scale complexity occurs as infrastructures grow so large they begin to strain under their own weight. One such service, for example, is now processing more than 90M transactions a day through its web interface and AppExchange platform. At a meager 10 messages per transaction, that’s almost a billion messages a day going through the infrastructure. Wow. Imagine finding a needle in that haystack.
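
For anyone who wants to check the math, here is a quick back-of-the-envelope sketch. The 10 messages per transaction is just my rough guess from above, not a measured value, and the per-second average assumes load is spread evenly across the day, which real traffic never is.

```python
# Back-of-the-envelope estimate using the rough figures quoted in the post.
TRANSACTIONS_PER_DAY = 90_000_000   # "more than 90M transactions a day"
MESSAGES_PER_TRANSACTION = 10       # a guess for illustration, not a measurement
SECONDS_PER_DAY = 24 * 60 * 60

messages_per_day = TRANSACTIONS_PER_DAY * MESSAGES_PER_TRANSACTION

# Assumes perfectly even load over 24 hours; real peaks would be much higher.
messages_per_second = messages_per_day / SECONDS_PER_DAY

print(f"{messages_per_day:,} messages per day")
print(f"about {messages_per_second:,.0f} messages per second on average")
```

That works out to 900 million messages a day, or roughly ten thousand a second even before accounting for peak hours.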

Finally, once popularity rises and the technology becomes established, accountability arrives. Now we have to worry about how safe the technology is and, in many cases, monitor what people are doing with it. Everyone by now knows of the TJX situation, where 45.7M credit and debit card numbers were stolen by hackers who somehow infiltrated its processing systems. The first card numbers were stolen three years ago and there is still no definitive explanation. Everything from cracked WEP keys and tampered kiosk software to an inside job has been offered as a possible cause. More recently, TD Ameritrade and others have experienced similar breaches of user and account information totaling into the millions. And compliance is everywhere. SOX, PCI, ITIL, HIPAA, FFIEC, FISMA, ISO, CoBIT, COSO and other mandates mean IT staff have reduced access to and visibility into the systems they’re trying to manage and keep running.

expectations + complexity + accountability = chaos

I’m interested in your thoughts on the direction this is taking. I’ll be sure to blog more later as the ideas develop.
