Network management and performance monitoring have never been harder to tackle. Today’s organizations are dealing with increasingly complex networks — they have offices that span the globe, employees working from home and an increasingly vast number of devices to manage and monitor.
In any network, problems can come from almost anywhere. Even seemingly small issues can lead to downtime that slows productivity and undermines your ability to meet customers’ needs.
Recent reports estimate that one minute of downtime can cost an enterprise $9,000. Network outages hurt revenue, kill productivity and tarnish the reputation of both your IT team and the larger organization.
That’s where network operations centers (NOCs) come in.
In this article, we’re diving into NOCs, the role they play in performance and security, best practices and how they differ from security operations centers.
What Is a Network Operations Center (NOC)?
A network operations center (NOC) is a centralized location where IT teams can continuously monitor the performance and health of a network. The NOC serves as the first line of defense against network disruptions and failures. It is designed specifically to prevent downtime so that customers and internal end users don’t even realize when inevitable incidents or outages do occur.
Through the NOC (pronounced “knock”), organizations gain full visibility into their network, so they can detect anomalies and take steps to prevent problems or quickly resolve issues as they emerge. The NOC oversees infrastructure and equipment (from wiring to servers), wireless systems, databases, firewalls, various related network devices (including IoT devices and smartphones), telecommunications, dashboards and reporting. The NOC plays a huge role in ensuring a positive customer experience, with management services that monitor:
- Customer support calls
- Help desk ticketing systems
- Integration with customers’ network tools
NOCs can be built internally and located on-premises, often within the data center, or they can be outsourced to an external company that specializes in network and infrastructure monitoring and management. Regardless of the design, NOC staff are responsible for spotting issues and making quick decisions on how to resolve them.
How NOCs maximize availability
Simply put, the goal of any NOC is to maintain optimal network performance and availability and to ensure continuous uptime. The NOC manages a host of critical activities, including:
- Monitoring the network for problems that require special attention, including those originating from outside sources
- Server, network and device management, including software installation, updates, troubleshooting and distribution across all devices
- Incident response, including managing power failures and communication line issues
- Security, including monitoring, threat analysis and tool deployment, in conjunction with security operations
- Backup and storage; disaster recovery
- Email, voice and video data management
- Patch management
- Firewall and intrusion prevention system management and antivirus support
- Policy enforcement
- Improvement of services through the collection of feedback and user recommendations
- Service level agreement follow-through
- Vendor, freelancer and contractor management
To accomplish all this, a NOC will often take a hierarchical approach to incident management. Technicians are categorized — Level 1, 2 or 3, typically — based on their skill and experience in resolving specific issues. Once a NOC technician discovers a problem, they will create a ticket that categorizes the issue based on alert type and severity, along with other criteria. If the NOC technician assigned to a specific problem level fails to resolve it quickly enough, it moves up to the next level and continues to escalate until the ongoing issue is fully resolved.
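As a rough illustration of the escalation flow just described, here is a minimal Python sketch. The `Ticket` fields, the three-level cap and the time limits in `ESCALATION_LIMITS` are all assumptions made for the example; real values would come from your own NOC procedures and SLAs.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional

# Hypothetical time limits per level; real limits depend on your own SLAs.
ESCALATION_LIMITS = {1: timedelta(minutes=15), 2: timedelta(minutes=60)}
MAX_LEVEL = 3  # assumes the common Level 1/2/3 structure


@dataclass
class Ticket:
    alert_type: str
    severity: str
    opened_at: datetime
    level: int = 1
    level_started_at: Optional[datetime] = None
    resolved: bool = False

    def __post_init__(self):
        # A new ticket starts its Level 1 clock when it is opened.
        if self.level_started_at is None:
            self.level_started_at = self.opened_at


def escalate_if_overdue(ticket: Ticket, now: datetime) -> int:
    """Move an unresolved ticket up one level once its current level runs out of time."""
    if ticket.resolved or ticket.level >= MAX_LEVEL:
        return ticket.level
    if now - ticket.level_started_at >= ESCALATION_LIMITS[ticket.level]:
        ticket.level += 1
        ticket.level_started_at = now  # restart the clock for the new level
    return ticket.level


opened = datetime(2024, 1, 1, 9, 0)
ticket = Ticket(alert_type="link_down", severity="high", opened_at=opened)
escalate_if_overdue(ticket, opened + timedelta(minutes=5))   # still within the Level 1 limit
escalate_if_overdue(ticket, opened + timedelta(minutes=20))  # past the Level 1 limit
print(ticket.level)
```

In practice a scheduler or the ticketing system itself would run a check like this periodically; the sketch only shows the decision rule.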
(Learn about incident severity levels and how teams target their incident response.)
This hierarchical approach is just one common way NOCs operate. Whatever model you use, a few recommendations will help set your NOC up for success.
NOC best practices
Best practices for a network operations center prioritize training, rely on clearly defined roles and establish clear protocols and means of communication.
Make training and knowledge development a top priority
Your NOC team must have high-level expertise in monitoring, managing and resolving issues specific to network performance and your IT infrastructure. Provide robust and frequent training on procedures and protocols for any event, and keep up with the changing tech landscape and changes to your IT environment. Prioritize network performance issues, but don’t overlook procedures for collaborating with your SOC on security issues.
A key procedural issue is escalation — make sure your staff knows how and when to make the quick decision to escalate a growing problem to a more experienced teammate.
Define clear roles
Flatter organizational hierarchies are more popular these days. In the fast-paced, must-act-now world of network monitoring, it makes sense to empower each team member rather than rigidly insisting on rank- or role-based handoffs. Yet while technicians should be equipped with the knowledge and authority to act quickly to prevent network failures, you still need escalation tiers and shift supervisors to oversee the NOC.
While NOC technicians should be left largely to do their jobs and offer insight — and certainly they should not be micromanaged — you need a leader who assigns work to technicians based on their skills, prioritizes tasks, prepares reports, ensures incidents are being resolved properly and notifies the broader organization of events as needed.
Additionally, each technician should know specifically what tasks will be expected of them, their level and the line of reporting should they need to escalate an incident or respond to one.
Enable strong communication
Keeping the lines of communication open within the NOC, and with the SOC and other external teams, can prove challenging. It takes more than setting up a few periodic meetings: it takes a concerted effort to train staff on how and when to share information and to hold them accountable for doing so. Creating regular opportunities for collaboration and coordination is key to a solid NOC.
Establish clear guidelines and protocols
Keep things running smoothly by creating clear-cut policies for the following:
- Incident management: Document steps the technicians should follow to handle incidents (e.g. when the technician can make the decision to escalate or when to notify team members).
- Solutions: Outline procedures for dealing with common issues and provide immediate methods for dealing with emergencies.
- Escalation: Determine how the team should escalate issues and to whom.
- Prioritization: Establish which incidents take the highest priority and which technician level should handle the most important ones. Incidents should be ranked based on the extent to which they will affect the business.
Having well-established protocols ensures everyone is on the same page, provides consistency across the organization and increases accountability among NOC staff.
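To make the prioritization policy above more concrete, the following sketch ranks incidents by business impact. The `Incident` fields and the thresholds in `priority` are hypothetical; each organization would define its own criteria.

```python
from dataclasses import dataclass


@dataclass
class Incident:
    name: str
    users_affected: int
    revenue_impacting: bool


def priority(incident: Incident) -> int:
    """Return 1 (highest) to 3 (lowest) using illustrative business-impact rules."""
    if incident.revenue_impacting or incident.users_affected >= 1000:
        return 1
    if incident.users_affected >= 50:
        return 2
    return 3


def triage(incidents):
    """Order incidents so the highest business impact comes first."""
    return sorted(incidents, key=lambda i: (priority(i), -i.users_affected))


queue = triage([
    Incident("printer offline", users_affected=5, revenue_impacting=False),
    Incident("checkout API errors", users_affected=200, revenue_impacting=True),
    Incident("branch VPN down", users_affected=80, revenue_impacting=False),
])
for incident in queue:
    print(priority(incident), incident.name)
```

Writing the rules down in one place, whether in code or in a runbook, is what gives technicians a consistent answer to "which incident first?"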
Of course, having the right people and processes in place lays the foundation, but the actual work can’t be done without the right tools.
How do you choose the right tools for your NOC?
Which tool you invest in is largely dependent on your business needs, but your NOC requires a tool or combination of tools that provide the following:
- A comprehensive view of your infrastructure: Whether it’s physical, virtual or in the cloud.
- Automation: To cut down on repetitive tasks, freeing up Level 1 staff to focus on higher-priority issues while also reducing alert fatigue.
- Ticket management: So you can view information related to open tickets, including priority tasks and the assigned technician, to ensure you resolve internal and external issues quickly.
- Incident reporting: A tool that provides visual analysis and graphical representations of thresholds, alarms, indicators and trends makes it easier to investigate issues and document them for the future.
- A simple interface and deployment: You want to see immediate benefits and not have to battle through a lengthy, complex deployment and a long learning curve.
- Scalability: As your business grows, you want to ensure the NOC can handle it.
The tool you choose should offer you full visibility across your entire network and enable you to drill down deeper, investigate issues and improve your overall incident response as time goes on.
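One simple form the automation item in the list above can take is suppressing repeated alerts so that technicians see each problem once. The sketch below is an assumption-laden illustration, not a feature of any particular product: the alert dictionary keys (`device`, `check`, `time`) and the 10-minute window are invented for the example.

```python
from datetime import datetime, timedelta


def suppress_duplicates(alerts, window=timedelta(minutes=10)):
    """Keep the first alert per (device, check) and drop repeats inside the window."""
    last_kept = {}  # (device, check) -> time of the last alert we kept
    kept = []
    for alert in sorted(alerts, key=lambda a: a["time"]):
        key = (alert["device"], alert["check"])
        previous = last_kept.get(key)
        if previous is None or alert["time"] - previous >= window:
            kept.append(alert)
            last_kept[key] = alert["time"]
    return kept


start = datetime(2024, 1, 1, 9, 0)
alerts = [
    {"device": "sw01", "check": "cpu", "time": start},
    {"device": "sw02", "check": "cpu", "time": start + timedelta(minutes=1)},
    {"device": "sw01", "check": "cpu", "time": start + timedelta(minutes=3)},
    {"device": "sw01", "check": "cpu", "time": start + timedelta(minutes=12)},
]
print(len(suppress_duplicates(alerts)))
```

Most monitoring tools offer some equivalent of this built in, so it is worth checking what your chosen tool provides before scripting your own.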
What is the difference between a NOC and a SOC?
While the NOC is focused on network performance and availability, a security operations center (SOC) consists of tools and personnel who monitor, detect and analyze an organization’s security health 24/7/365.
Technicians in the NOC are searching for issues that could impede network speed and availability, while technicians in the SOC are tasked with rooting out cybersecurity threats and responding to attacks. The SOC is focused on protecting customer data and intellectual property as well. NOCs tend to deal with network events that are common and occur naturally, whereas SOCs are almost always responding to outside threats targeting the enterprise network.
Both the NOC and SOC serve critical functions for the organization — to identify, investigate and resolve issues — and both work hard to resolve problems quickly before they impact the business. Additionally, both tend to operate similarly using a hierarchical approach to resolving incidents. However, they focus on very different issues. As a result, the skills, knowledge and approaches of personnel in both groups are also different. A NOC technician must understand the ins and outs of network and application monitoring and management, while a SOC analyst will focus exclusively on security.
That said, SOCs and NOCs should collaborate to work through major incidents and resolve crises, so the two teams shouldn’t be siloed.
Surprisingly, nearly a third of companies report little to no contact between the NOC and SOC, and another twenty percent say the teams only work together during emergencies, according to SANS research. However, experts push for better NOC/SOC integration. Integrating the two — even if they largely remain separate in the day-to-day — starts with establishing operating procedures, automating certain actions, and adopting tools that make it possible to collect and share network monitoring data across both the NOC and SOC.
Can a NOC provide SOC functionality?
When it’s not feasible to establish a separate NOC and SOC, a NOC can monitor and resolve security issues, though it’s not ideal.
NOCs can and do detect security threats as they pertain to network performance, and trained staff can effectively respond to them. That latter point is the key, though. Technicians must be looking for security threats — and they must have the skills to be able to respond to them. Technicians who are highly skilled in both network performance and security are hard to come by.
In addition to the right mix of skill sets, the security-oriented NOC would need the right tools for security problem resolution. For example, a security information and event management (SIEM) system — a single security management system that offers full visibility into activity taking place on your network — is a core tool. SIEM systems collect, parse and categorize machine data from a wide range of sources on a network and analyze the data so you can act in real time. In short, SIEM automates much of the workload of a typical SOC team. This surfaces incidents while also reducing false positives, making it easier for a properly staffed NOC to keep an eye on security.
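The collect, parse and categorize steps described above can be pictured with a toy example. This is only a sketch of the idea, not how any real SIEM product works: the log format, the category names and the keyword rules are all hypothetical.

```python
import re
from collections import Counter

# Hypothetical log format: "<timestamp> <host> <message>"
LINE_PATTERN = re.compile(r"^(?P<timestamp>\S+)\s+(?P<host>\S+)\s+(?P<message>.+)$")

# Hypothetical keyword rules; the first matching rule wins.
CATEGORY_RULES = [
    ("authentication", re.compile(r"failed password|login failed", re.IGNORECASE)),
    ("firewall", re.compile(r"denied|blocked", re.IGNORECASE)),
    ("availability", re.compile(r"link down|timeout|unreachable", re.IGNORECASE)),
]


def categorize(message):
    """Return the first category whose pattern matches, or "other"."""
    for name, pattern in CATEGORY_RULES:
        if pattern.search(message):
            return name
    return "other"


def parse_line(line):
    """Parse one raw log line into a dict, or return None if it does not fit the format."""
    match = LINE_PATTERN.match(line.strip())
    if match is None:
        return None
    event = match.groupdict()
    event["category"] = categorize(event["message"])
    return event


def summarize(lines):
    """Count parsed events per category, skipping lines that cannot be parsed."""
    events = (parse_line(line) for line in lines)
    return Counter(e["category"] for e in events if e is not None)


sample = [
    "2024-01-01T09:00:00Z fw01 Connection denied from 203.0.113.7",
    "2024-01-01T09:00:05Z web01 Failed password for admin",
    "2024-01-01T09:00:09Z sw02 Link down on port 12",
    "2024-01-01T09:00:12Z web01 Failed password for root",
]
print(summarize(sample))
```

A real system would add correlation across sources, enrichment and alerting on top of this, which is where most of the value for a combined NOC and SOC workflow comes from.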
Whatever the case, the NOC is one of the most important functional teams in IT, with work that is key in maintaining availability, performance and sometimes even security.