Case Study: Netcordia
The introduction of the NetMRI Event Analyzer was pain-free because we used the Splunk IT Search platform. We avoided all the teething pains that would have come with building a new IT data management platform in-house. - Jay Ennis, Executive Vice President, Product Development
Capitalizing on an Insight, Fast
Netcordia is a rising star in the world of network change and configuration management. Hundreds of customers including the US Army, Citizen's Bank, Texas A&M University, and the NSA use their successful NetMRI solution.
Like most network management products, NetMRI initially relied on SNMP and configuration data to provide monitoring and diagnostic capabilities. But Netcordia's well-known founder Terry Slattery had the insight that event logs are underutilized in network management and troubleshooting.
So Netcordia, renowned innovators with a history of ingenuity, saw a new way to manage the network by merging performance, configuration and event data into a single, complete view of the network. Netcordia used Splunk to do it, called it the NetMRI Event Analysis, and introduced a new product in 120 days, with overwhelming customer approval. Here's their story.
Defining the Opportunity
In early 2007, Netcordia identified a critical mass of customers that shared Terry Slattery's insight, wanting the ability to make effective use of event log data. Netcordia saw this as a new revenue opportunity based on delivering incremental and complementary value propositions to a large segment of its customers.
At the same time, they saw that tight integration between a new event analysis capability and the existing NetMRI monitoring capabilities would be a new and powerful differentiator for the existing, already successful product. Knowing time-to-market would be critical, Netcordia moved on the new product, the Netcordia NetMRI Event Analysis, defining requirements and designing the implementation.
Easier Said than Done
The engineering team at Netcordia soon realized there's a reason event logs are underutilized for network monitoring and diagnosis – building a platform that lets you monitor and search them in the volumes generated across most large networks is immensely difficult. And there are few models for interfaces that make this data easy enough to navigate and understand.
Netcordia's Executive Vice President of Development, Jay Ennis, had his team research possible designs for a new appliance with the ability to capture and store large volumes of event traffic, alert based on patterns in the event logs and provide basic queries through a limited web interface. The team concluded it would take 18 months of labor and nine calendar months to bring to market, and that was without any bells and whistles. This simply wasn't fast or good enough to meet Netcordia's high product standards and seize the window of opportunity. Jay went back to the drawing board. The bulk of the development effort was in building a platform for storing data in the file system that they could run queries against to drive alerting and ad hoc investigations. He wondered if such a platform already existed. And he dared hope that if it did, it would give him a leg up on the user experience, too.
Splunk to the Rescue
That's when Jay discovered Splunk. Splunk is an IT Search platform that indexes IT data and lets users search, alert and report on it. Splunk is a proven solution with over 550 enterprise and government customers and a growing user community drawn from the over 125,000 downloads since its initial introduction in 2005. The Splunk software platform is also developer and partner friendly, with a small footprint and open APIs for rapid development of new applications that leverage its capabilities.
Netcordia's engineers got going right away - they tried a live demo, downloaded Splunk on their own, and read user and developer documentation on Splunk.com. It was clear Splunk had all the capabilities they needed, and more. The Splunk Web interface opened up a whole new way of thinking about the experience of looking at log data. Now they might even surpass the original goals for the new product.
The introduction of the NetMRI Event Analyzer was practically pain-free because we decided to use the Splunk IT Search platform. We avoided all the teething pains that would have come with building an IT data management platform in-house.
Success: NetMRI Event Analysis Introduced September 2007
With Splunk under the hood and Netcordia's expert knowledge of network issues out front, NetMRI Event Analysis was introduced just four months from inception, going from kickoff in May 2007 to release in September 2007. The project required less than eight months total development effort, most of which went toward packaging and testing for the appliance-based offering. Overall, Splunk accelerated time-to-market by four months, shaving off ten months of software development effort, and enabling the delivery of a stronger product than would have been possible otherwise.
NetMRI Event Analysis is a separate appliance that receives and analyzes network device event streams (syslog and SNMP traps) on behalf of the base NetMRI appliance to guarantee high performance of the overall solution. Netcordia took advantage of Splunk's modular API to extend Splunk's indexing with a special C language processor that incorporates Netcordia's knowledge of network device logging and enriches the events with additional fields. The Splunk software provides dense indexing and persistence for the events.
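To make the idea of enriching events with additional fields concrete, here is a minimal sketch in Python. It is not Netcordia's C language processor and does not use any Splunk API; the function name enrich_event, the field names, and the Cisco-style message pattern are all assumptions chosen for illustration.

```python
import re

# Assumed pattern for a Cisco-style message tag such as
# "%LINK-3-UPDOWN: Interface Gi0/1, changed state to down".
# The field names (facility, severity, mnemonic, text) are illustrative only.
CISCO_TAG = re.compile(
    r"%(?P<facility>[A-Z0-9_]+)-(?P<severity>[0-7])-(?P<mnemonic>[A-Z0-9_]+):\s*(?P<text>.*)"
)


def enrich_event(raw: str) -> dict:
    """Return the raw event plus any extra fields that could be extracted."""
    event = {"raw": raw}
    match = CISCO_TAG.search(raw)
    if match:
        event["facility"] = match.group("facility")
        event["severity"] = int(match.group("severity"))
        event["mnemonic"] = match.group("mnemonic")
        event["text"] = match.group("text")
    return event
```

Events that do not match the pattern pass through with only the raw text, so enrichment never discards data; an indexer could then search on the extra fields where they exist.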
Then NetMRI Event Analysis uses Splunk's search API to execute searches against the indexed data on a periodic basis to look for a wide array of network symptoms. This is where Netcordia's extensive knowledge of network management comes in - their experts know what to look for in network syslog in order to find the one-in-a-million log events that are the first indicators of serious problems.
Users can customize the list and timing of symptoms that NetMRI looks for within events captured by NetMRI Event Analysis.
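The periodic symptom search with a user-adjustable list and timing can be pictured roughly as follows. This is a hedged sketch only: it scans an in-memory list of event strings rather than calling Splunk's search API, and the Symptom class, its fields, and the scan function are hypothetical names, not part of NetMRI or Splunk.

```python
import re
from dataclasses import dataclass


@dataclass
class Symptom:
    """One entry in a hypothetical, user-editable symptom list."""
    name: str
    pattern: str           # regular expression matched against the event text
    interval_seconds: int  # how often a scheduler would run this check


def scan(events, symptoms):
    """Return {symptom name: [matching events]} for symptoms with at least one hit."""
    hits = {}
    for symptom in symptoms:
        regex = re.compile(symptom.pattern)
        matched = [event for event in events if regex.search(event)]
        if matched:
            hits[symptom.name] = matched
    return hits
```

In this sketch interval_seconds is only metadata; a surrounding scheduler (not shown) would decide when to call scan for each symptom, which is where user-customized timing would plug in.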
The NetMRI software also runs periodic searches in order to summarize the syslog events within Splunk along various dimensions. The summary is then stored on the main NetMRI appliance and browsable within the native NetMRI console.
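Summarizing events along various dimensions amounts to counting events per value of each chosen field. The sketch below shows that idea under stated assumptions: events are plain dictionaries, the dimension names (such as host or severity) are examples, and the summarize function is hypothetical rather than anything from the NetMRI or Splunk software.

```python
from collections import Counter


def summarize(events, dimensions):
    """Count events per value for each requested dimension (field name)."""
    summary = {dim: Counter() for dim in dimensions}
    for event in events:
        for dim in dimensions:
            if dim in event:  # skip events that lack this field
                summary[dim][event[dim]] += 1
    return summary
```

A compact summary like this is small enough to copy to another system for browsing, while the full events stay in the index for drill-down.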
Users who need to do ad hoc analysis of events in NetMRI Event Analysis can drill down to a re-skinned Splunk Web interface from links in the native NetMRI console. Within the Splunk Web interface they have complete flexibility to do ad hoc searches and navigate search results. Splunk's fast search and interactive interface put the NetMRI Event Analysis module ahead of the competition.
About Netcordia
Netcordia is a leading provider of network automation software to the world's most complex and mission-critical networks. Its award-winning NetMRI network change and configuration management (NCCM) solution continuously audits multi-vendor infrastructures, identifies anomalies early and speeds resolution. Netcordia helps more than 200 leading healthcare, financial services, academic, service and government organizations stretch IT budgets, improve overall performance, meet corporate policy and comply with stringent federal regulations.
Netcordia was founded in 2000 by Terry Slattery, a longstanding expert on IP networking and Cisco and co-author of "Advanced IP Routing in Cisco Networks." He is a sought-after industry speaker and advisor.
Netcordia has been recognized as one of the top five companies that are changing network management by Computing Magazine, as one of the top companies to watch in network management by Network Computing, and as one of the Fierce15 top emerging companies in IP Telephony. Clients include Fortune 500 companies, leading universities, the United States Army, several leading civilian agencies, state agencies and leading organizations in finance, health care, transportation, media, consulting and high tech.