Event Analytics Helps Econocom Deliver Better Services to Customers
With 10,000 employees in 19 countries and revenue of 2.5 billion euros, Econocom designs, finances and oversees digital transformation solutions for companies. Meeting the strict SLAs associated with those services is critical to Econocom’s business. Since deploying Splunk Enterprise and Splunk IT Service Intelligence (ITSI), the company has seen benefits including:
- Ability to meet all SLAs
- Better service delivered to customers
- Improved event management
- Streamlined IT operations processes
SPLUNK SOLUTION AREAS
Challenges
- Events generated by tens of thousands of monitored infrastructure components, siloed in almost a dozen different solutions
- Strict SLAs with customers to respond to failures
- Inability to measure SLAs effectively with events spread across multiple silos
- Lack of a unified view of events for planning staffing rotas
- Difficulty in acknowledging events during alert storms
- Event pollution
- Large number of false positives
Business Impact
- Improved service delivered to customers through quicker response to events
- Unified view of all events from almost a dozen legacy monitoring and event management solutions
- Event analytics give the management team new levels of visibility for staff rota planning
- Fewer false positives, resulting in a 60% reduction in event volume
- Reduced noise and event pollution
- Improved communication and collaboration between different teams
- 10x reduction in the number of events generated by system performance issues
- Streamlined operations for event processing and incident management
- Accelerated incident investigation
- Real-time visibility of SLA performance for customers
- Improved capacity planning
Data Sources
- Events and alerts generated by multiple infrastructure monitoring solutions
- System performance data
- Application performance data
As a services company supporting critical infrastructure for its customers, Econocom must ensure it can react instantly to any changes in customers’ environments. Econocom also needs to deliver against tight SLAs. Failed SLAs can lead to penalties and loss of customer satisfaction.
Prior to Splunk, the Econocom operations teams worked with almost a dozen different monitoring and event consoles. “With so many things to look at, it became impossible to manage the events and to prioritize correctly since they were all generated by different systems,” says Romuald Fronteau, technical solution consultant, Econocom. Econocom operates under strict SLAs for responding to events, but the volume of events spread across so many silos meant that some events were not handled within the SLA window.
Econocom operators also lacked the ability to apply analytics across multiple data sources to accelerate incident investigation, visualize data with business-service context and perform capacity planning.
Splunk ITSI was implemented at Econocom in full production in just weeks, and is now used by a number of different teams within the IT organization.
“Our vision is to evolve from an IT service provider to deliver business-enabling and strategic services. With Splunk ITSI, we are moving closer to that vision. We are able to collect data across multiple systems, solutions, technology tiers as well as end-user performance to gain a holistic single-pane-of-glass view. This allows our operational teams to be more productive, my IT management teams to make better decisions and we have improved the quality of the services we deliver to our customers. Thanks to the integrated machine learning in Splunk ITSI, we now have a reduced number of events to process and the streamlined event analytics framework allows us to process events eight minutes more quickly, on average. This has led to a 15% improvement in SLA performance.”
Technical Director, Infrastructure Management Services
Performance against SLAs improved by 15%
Econocom has been able to centralize events from all its different siloed solutions into a single solution with Splunk ITSI. Event analytics in Splunk ITSI helps Econocom to better prioritize and react more quickly to customers’ infrastructure events, ultimately providing a better service. Splunk ITSI also allows Econocom to exclude events considered false positives from the event management process, reducing total event volume by 60% and helping operators focus on the events that really matter.
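As a rough illustration of the idea described above (centralizing events, excluding known false positives and prioritizing what remains), the following Python sketch may help. It is not how Splunk ITSI works internally; the `Event` fields, the severity scale, the tool names and the `lab-` host rule are all hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Event:
    source: str    # originating monitoring tool (hypothetical names)
    host: str
    check: str
    severity: int  # assumed scale: 1 = critical ... 5 = informational

def triage(events, false_positive_rules):
    """Drop known false positives, collapse duplicates, sort most severe first."""
    seen = set()
    kept = []
    for event in events:
        # Exclude anything matched by a false-positive rule.
        if any(rule(event) for rule in false_positive_rules):
            continue
        # The same host/check reported by several tools counts once (first seen wins).
        key = (event.host, event.check)
        if key in seen:
            continue
        seen.add(key)
        kept.append(event)
    return sorted(kept, key=lambda e: e.severity)

# Hypothetical events arriving from three separate monitoring tools.
events = [
    Event("tool_a", "db01", "disk", 2),
    Event("tool_b", "db01", "disk", 2),      # duplicate of the event above
    Event("tool_a", "web01", "cpu", 4),
    Event("tool_c", "lab-test", "ping", 1),  # test host, treated as a false positive
    Event("tool_b", "web02", "http", 1),
]
rules = [lambda e: e.host.startswith("lab-")]  # hypothetical exclusion rule

queue = triage(events, rules)
queue_hosts = [e.host for e in queue]
print(queue_hosts)
```

Keeping the first-seen copy of a duplicate is a simplification; a production pipeline would more likely merge duplicates and retain the highest severity.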
The time it takes for an operator to acknowledge and process an event is critical to Econocom and is the basis of its customer SLAs. Having a reduced number of events to process, a single interface and the streamlined event analytics framework in Splunk ITSI allows Econocom operators to process events eight minutes more quickly, on average. This has led to a 15% improvement in the company’s SLA performance.
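To make the link between acknowledgement time and SLA performance concrete, here is a minimal Python sketch. The 15-minute window and the sample events are assumptions chosen for illustration; the case study does not state Econocom's actual SLA terms, and the resulting percentages are not Econocom's figures.

```python
from datetime import datetime, timedelta

def sla_compliance(events, window):
    """Fraction of events acknowledged within `window` of being raised.

    events: list of (raised_at, acknowledged_at) pairs; acknowledged_at is
    None when the event was never acknowledged.
    """
    if not events:
        return 1.0  # no events means nothing was missed (a convention, not a rule)
    met = sum(
        1 for raised, acked in events
        if acked is not None and acked - raised <= window
    )
    return met / len(events)

t0 = datetime(2024, 1, 1, 9, 0)
window = timedelta(minutes=15)  # hypothetical SLA window

# Hypothetical events: acknowledged after 5, 20 and 14 minutes, plus one missed.
events = [
    (t0, t0 + timedelta(minutes=5)),
    (t0, t0 + timedelta(minutes=20)),
    (t0, None),
    (t0, t0 + timedelta(minutes=14)),
]
rate_before = sla_compliance(events, window)

# The same events if each acknowledgement came eight minutes sooner.
faster = [
    (raised, acked - timedelta(minutes=8) if acked is not None else None)
    for raised, acked in events
]
rate_after = sla_compliance(faster, window)
print(rate_before, rate_after)
```

In this toy data set the event acknowledged after 20 minutes moves inside the window once eight minutes are saved, which is the kind of effect the paragraph above describes.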
Event analytics drives staffing efficiency
Event analytics in Splunk ITSI has given IT managers new levels of insight. Staffing the operations bridge effectively is critical. Applying advanced analytics to historical event data allows Econocom management to plan shift rotas for operations bridge staff more accurately by predicting the likely volume of events for each customer. This drives cost savings while also minimizing the risk that understaffing could result in a missed SLA.
Business service insights speed root cause analysis
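One simple way to picture this kind of rota planning is to average historical event counts per weekday-and-hour slot and divide by an operator's capacity. The sketch below assumes exactly that; it is not the analytics Econocom or Splunk ITSI actually use, and the capacity of 12 events per operator-hour and the sample timestamps are hypothetical.

```python
from collections import defaultdict
from datetime import datetime
from math import ceil

def expected_events_per_slot(timestamps):
    """Average number of events per (weekday, hour) slot over the weeks observed."""
    counts = defaultdict(int)
    weeks = set()
    for ts in timestamps:
        counts[(ts.weekday(), ts.hour)] += 1
        weeks.add(tuple(ts.isocalendar()[:2]))  # (ISO year, ISO week)
    n_weeks = max(len(weeks), 1)
    return {slot: count / n_weeks for slot, count in counts.items()}

def operators_needed(expected_events, events_per_operator_hour):
    """Round up so the slot is never understaffed; always keep one operator."""
    return max(1, ceil(expected_events / events_per_operator_hour))

# Hypothetical history: 30 events on one Monday at 09:00, 50 on the next Monday.
history = (
    [datetime(2024, 1, 1, 9, m) for m in range(30)]
    + [datetime(2024, 1, 8, 9, m) for m in range(50)]
)
forecast = expected_events_per_slot(history)
monday_9am = forecast[(0, 9)]               # weekday 0 is Monday
staff = operators_needed(monday_9am, 12)    # assumed capacity: 12 events/hour
print(monday_9am, staff)
```

A plain average ignores trend and seasonality beyond the weekly cycle, so it should be read as a baseline rather than a forecasting method.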
Econocom has adopted a business-service-centric approach to monitoring. Service insights in Splunk ITSI provide Econocom with end-to-end visibility of not only the infrastructure but also the applications customers use, including visualizing application performance data. Splunk ITSI is used in the Network Operations Center (NOC) bridge as well as by operational teams including Level 2 and Level 3 analysts to accelerate root cause analysis of incidents.
Providing visibility into service quality is important to Econocom, and the Splunk platform has allowed the company to present SLA performance to its customers in real time.
Machine learning reduces number of system performance events 10x
Econocom has put in place more sophisticated analytics and alerting for the IT infrastructure it monitors for its customers. This has led to a 10x reduction in the number of events generated by system performance issues. Econocom’s approach includes using the adaptive thresholds in Splunk ITSI, which use integrated machine learning to learn normal behavior. Traditionally, a spike in CPU usage would breach a static threshold whether or not it really indicated a problem. Now, the machine learning in Splunk ITSI can recognize which spikes are normal in certain circumstances, so thresholds are not breached unnecessarily and fewer events are created.
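The general idea behind an adaptive threshold can be sketched in a few lines of Python. This is a simplified stand-in (a per-hour baseline of mean plus k standard deviations), not the algorithm Splunk ITSI implements; the function names, the value of `k`, the static fallback of 90% and the sample data are all assumptions.

```python
from collections import defaultdict
from statistics import mean, pstdev

def learn_thresholds(samples, k=3.0):
    """Learn an upper CPU threshold for each hour of day.

    samples: iterable of (hour_of_day, cpu_percent) observations.
    Returns {hour: mean + k * population standard deviation}.
    """
    by_hour = defaultdict(list)
    for hour, value in samples:
        by_hour[hour].append(value)
    return {h: mean(v) + k * pstdev(v) for h, v in by_hour.items()}

def is_anomalous(thresholds, hour, value, default=90.0):
    """Compare against the learned threshold; fall back to a static one."""
    return value > thresholds.get(hour, default)

# Hypothetical history: a nightly batch job drives CPU to 85-95% at 02:00,
# while mid-afternoon load normally sits at 20-30%.
history = [
    (2, 85.0), (2, 90.0), (2, 95.0),
    (14, 20.0), (14, 25.0), (14, 30.0),
]
thresholds = learn_thresholds(history, k=3.0)

night_spike = is_anomalous(thresholds, 2, 92.0)    # expected during the batch job
day_spike = is_anomalous(thresholds, 14, 92.0)     # unusual for the afternoon
print(night_spike, day_spike)
```

A static 90% threshold would raise an event for both readings; the learned baseline raises one only for the afternoon spike, which is how this approach cuts down on events that do not indicate a real problem.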