Case Study

Seattle Cancer Care Alliance Gains Visibility Into NetApp Storage Systems

Executive Summary

Seattle Cancer Care Alliance (SCCA) is a cancer treatment center that unites doctors from Fred Hutchinson Cancer Research Center, UW Medicine and Seattle Children’s to lead the world in the prevention and treatment of cancer. SCCA’s IT infrastructure team needed to gain comprehensive operational visibility into the company’s datacenter and enterprise infrastructure. Since deploying the Splunk platform, SCCA has seen benefits including:

  • A single pane-of-glass view into systems throughout the enterprise
  • Reduced troubleshooting times
  • Critical insights into storage system health

Challenges
    • Lack of operational visibility into the company’s datacenter and enterprise infrastructure
    • No unified view across all systems
    • Critical need to understand and pinpoint the source of performance degradations

Business Impact
    • Operational visibility into enterprise infrastructure
    • Comprehensive insight into performance metrics
    • Significantly improved troubleshooting and mean time to repair (MTTR)
    • Reduced storage monitoring costs

Data Sources
    • NetApp performance, logs and configuration data
    • Security events
    • Windows & Linux events/Network Syslog
    • Postfix mail logs
    • Juniper SRX structured logs 

Why Splunk

With no unified view across all systems, SCCA’s IT infrastructure team struggled to understand the historical behavior of its NetApp filers and to pinpoint the exact source of performance degradations. In addition, troubleshooting storage issues and gaining insight into performance metrics such as latency involved creating and running custom scripts across several different monitoring tools. Splunk Enterprise has proved to be a powerful, easy-to-use solution that even novice IT admins at SCCA can understand and use in daily monitoring tasks.

Proactive reporting to identify potential infections

SCCA first used Splunk Enterprise to analyze machine data from firewalls, Windows servers and mail servers. The team then began using the software for proactive reporting on data from NetApp’s vscan API, which it uses for anti-virus scanning of CIFS shares, to identify potential infections.

Instant visibility into storage systems with the Splunk App for NetApp Data ONTAP

Deploying the Splunk App for NetApp Data ONTAP has proven extremely useful for SCCA. The app has enabled the IT infrastructure team to monitor all NetApp filers from one central location and get instant visibility into the health of SCCA’s storage systems. Without having to deploy multiple monitoring solutions, SCCA can now analyze storage performance trends either in real time or over a desired period. The team is able to spot issues such as abnormal latency of a particular volume and compare it with other entities of interest.

Starting from a high-level overview of all the NetApp systems across the enterprise, the SCCA IT infrastructure team can now drill down into a particular incident and easily isolate the source of the degradation at any point in time. In many cases, the value lies in the ability to produce real, targeted data showing that storage is not the cause of a performance problem. This visibility and clarity have helped SCCA reduce troubleshooting times significantly, enabling senior IT experts to focus on more complex and productive tasks.

“Using the Splunk App for NetApp Data ONTAP, we gain instant visibility into what is happening in our storage systems. Splunk software gives us the ability to analyze storage data in the context of all our machine data, including our operational and security data. Splunk is the only solution we found that allows us to quickly see, analyze and correlate our data without having to be data or Splunk experts.”

IT Infrastructure Lead, Seattle Cancer Care Alliance

Analyzing operational and storage data for security

Splunk software’s open, extensible platform has allowed SCCA to analyze its storage data and other important machine-generated data in the context of security risks. As a result of this comprehensive operational visibility, SCCA’s operations staff consistently achieves a lower mean time to repair (MTTR).