Managing events smartly
The next natural step was alert management. Leidos had relied on another solution for more than 15 years; it was not only outdated but bare-bones, and its back-end rules language was complicated. The company wanted a modern product with out-of-the-box correlation and a rules engine that could distinguish critical events from minor ones. The answer was Splunk ITSI.
“There are days when you get a flood of events; Splunk ITSI prioritizes the events, gives you insight into not only that this is broken but what’s been affected right as you look at the alert screen,” Mahler says. Beyond basic requirements such as consolidating events from its heterogeneous IT environment, detecting and suppressing duplicate alerts, clearing resolved alerts and distilling the remainder down to actionable events, the company needed additional functionality: automatically escalating an alert after a set period of time, and suppressing alerts when a device was taken offline on purpose. Leidos achieved all of this with Splunk ITSI.
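To make these requirements concrete, here is a minimal sketch in plain Python of the four behaviors described above: suppressing duplicates, clearing resolved alerts, skipping devices in planned maintenance, and escalating alerts that stay open too long. It is an illustration only and does not use or represent Splunk ITSI's configuration or API. The names `Alert`, `process`, and `ESCALATE_AFTER`, the 30-minute threshold, and the sample device names are all hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

# Hypothetical threshold: escalate anything still open after 30 minutes.
ESCALATE_AFTER = timedelta(minutes=30)


@dataclass
class Alert:
    ci: str               # configuration item the alert concerns
    check: str            # what failed, e.g. "ping" or "disk"
    severity: str         # "minor" or "critical"
    first_seen: datetime  # time of the first failing event
    count: int = 1        # how many duplicate events were folded in


def process(events, maintenance, now):
    """Reduce raw events to open, actionable alerts.

    events: iterable of (ci, check, status, severity, timestamp) tuples,
            in time order, where status is "fail" or "ok".
    maintenance: set of CIs deliberately taken offline.
    now: the time at which escalation is evaluated.
    """
    open_alerts = {}
    for ci, check, status, severity, ts in events:
        if ci in maintenance:
            continue  # planned outage: suppress entirely
        key = (ci, check)
        if status == "ok":
            open_alerts.pop(key, None)  # clear a resolved alert
        elif key in open_alerts:
            open_alerts[key].count += 1  # duplicate: count it, no new alert
        else:
            open_alerts[key] = Alert(ci, check, severity, ts)

    # Escalate minor alerts that have been open for too long.
    for alert in open_alerts.values():
        if alert.severity != "critical" and now - alert.first_seen >= ESCALATE_AFTER:
            alert.severity = "critical"
    return list(open_alerts.values())


t0 = datetime(2024, 1, 1, 9, 0)
sample_events = [
    ("router-1", "ping", "fail", "minor", t0),
    ("router-1", "ping", "fail", "minor", t0 + timedelta(minutes=1)),
    ("db-2", "disk", "fail", "minor", t0 + timedelta(minutes=2)),
    ("db-2", "disk", "ok", "minor", t0 + timedelta(minutes=5)),
    ("switch-9", "ping", "fail", "critical", t0 + timedelta(minutes=6)),
]
tickets = process(sample_events, maintenance={"switch-9"},
                  now=t0 + timedelta(minutes=45))
```

In this sample, five raw events reduce to a single open alert: the two router events merge into one, the disk alert is cleared by its later "ok" event, and the switch is ignored because it is in maintenance. The remaining router alert is escalated because it has been open for longer than the assumed threshold.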
Today, Splunk ITSI at Leidos takes input from approximately 20 management systems, ranging from Microsoft System Center Configuration Manager (SCCM) to SolarWinds network management tools, and from more than 4,500 configuration items (CIs) across 120 IT services and 240 locations worldwide. This helps the company boil 3,500 to 5,000 daily alerts down to roughly 50 tickets for network and datacenter operations to act on. Passing configuration management database (CMDB) information into Splunk ITSI also allows alert displays to be tailored to different staff.
The bottom line: easier access to more relevant data, with staff time devoted to the issues that matter most. “My most important contribution at the end of the day is that we make a difference, that we provide a service that people find accurate and insightful,” Mahler concludes. “The fact that Splunk has all of the information means that people can get their answers quicker, and more accurately and efficiently.”