Splunk at Financial Services Company

End-to-end Visibility Delivers $6,000,000 in Annual ROI

The Business

This leading, privately held financial services company is based in the US and has more than 50,000 customers worldwide. Hundreds of millions of users across the globe rely on the company's international services and information resources.

Challenges

The company runs a large number of distributed applications that generate millions of events per day. Wading through all this data to troubleshoot problems was a manual process that was both time-consuming and costly. In a business where downtime costs tens of thousands of dollars per minute, the company needed a faster way to sift through huge volumes of information and fix problems.

Enter Splunk

This global organization deployed Splunk to gain better visibility across its large infrastructure and to analyze and troubleshoot problems more quickly.

With Splunk installed, the company can now collect a wide range of machine data from sources including custom and third-party software, servers, C code, Java code, operating systems, databases, and more. Splunk streamlined the company's ability to search, find, analyze and resolve issues quickly. "We paid for Splunk in the first month. Prior to Splunk, the monthly outages averaged one hour and ten minutes. After Splunk, we got the outages down to 15 to 20 minutes per month," said the company's Open Systems team lead.

Splunk also allows the organization to get ahead of the break-fix model and find issues before they become downtime-causing problems: "Now, we're actually proactively solving problems. Using Splunk, we are able to see sequences of events that lead up to the outages."

Breakthroughs

Splunk enabled the financial services company to decrease monthly outage time from 70 minutes to 10-15 minutes and to reduce mean time to repair from 50-60 minutes to about 20 minutes. The company is also now able to conduct periodic service level reviews for continuous service improvement. The reduced downtime alone saves the company $6M annually.

In addition to improving customer satisfaction through reduced downtime, the company realized the following benefits using Splunk:

  • Continuous service improvement - The idea of periodic service level reviews, accompanied by reviews of potential areas of service improvement, is a focus area in ITIL v3. There aren't a lot of tools to support this yet; however, the capabilities provided by Splunk are a good starting point.
  • Improved efficiency - The team has become more efficient. As an example, on the day of the interview they were having problems with server authentication. They were able to narrow down and solve the problem quickly. "If we had to go to every server and check its log, it would have taken us all day."
  • Better coordination between Development and Operations (DevOps) - Developers are using the product as well. They create searches as they develop code. Searches can be used to debug during development, and can also be used to troubleshoot once applications go into production.