What is the purpose of a NOC?
Simply put, the goal of any NOC is to maintain optimal network performance and availability, and to ensure continuous uptime. The NOC manages a host of critical activities, including:
- Monitoring the network for problems that require special attention, including those originating from outside sources.
- Server, network and device management, including software installation, updates, troubleshooting and distribution across all devices.
- Incident response, including managing power failures and communication line issues.
- Security, including monitoring, threat analysis and tool deployment, in conjunction with security operations.
- Backup and storage; disaster recovery.
- Email, voice and video data management.
- Patch management.
- Firewall and intrusion prevention system management and antivirus support.
- Policy enforcement.
- Improvement of services through collection of feedback and user recommendations.
- Service level agreement follow-through.
- Vendor, freelancer and contractor management.
Network management and performance monitoring have never been harder to tackle. Today’s organizations are dealing with increasingly complex networks — they have offices that span the globe, employees working from home and an increasingly vast number of devices to manage and monitor.
The volume of users, website traffic and malware can all impact network performance, so the potential for problems can come from almost anywhere. Even seemingly small issues can lead to downtime that can wreak havoc on productivity and your ability to meet customers’ needs.
A few years ago, Gartner released a report saying that one minute of downtime can cost an enterprise $5,600. Network outages hurt revenue, kill productivity and tarnish the reputation of both your IT team and the larger organization. With that in mind, NOCs are designed to prevent downtime where possible and, when incidents or outages inevitably do occur, to resolve them before customers and internal end users even notice.
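To put the per-minute figure in perspective, here is a minimal sketch that multiplies outage duration by the cited rate. The function name and the 45-minute outage are hypothetical, and a flat per-minute rate is a simplifying assumption; real costs vary widely by organization.

```python
# Per-minute figure cited above from Gartner; treated here as a flat rate (assumption).
COST_PER_MINUTE_USD = 5_600


def downtime_cost(minutes: float, cost_per_minute: float = COST_PER_MINUTE_USD) -> float:
    """Estimate outage cost as duration multiplied by a flat per-minute rate."""
    if minutes < 0:
        raise ValueError("minutes must be non-negative")
    return minutes * cost_per_minute


if __name__ == "__main__":
    # Hypothetical 45-minute outage: 45 * 5,600 = 252,000
    print(f"${downtime_cost(45):,.0f}")
```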
What are the key roles in the NOC?
Within the NOC, you will find a team of technicians — NOC engineers, analysts or operators — and likely several team leaders or shift supervisors. NOC staff require specific skill sets in monitoring, maintaining and quickly resolving performance issues within the network. That level of knowledge is typically beyond the scope of the unspecialized IT professional. NOC technicians usually have significant work experience, specifically in network monitoring and tools. Many also have advanced certifications in the field.
How is a NOC designed?
The ideal design will give the NOC its own dedicated room. One wall may be covered in video screens, each displaying a real-time look at general network performance, along with active incidents and alarms. The video displays are set up in a grid and connected so that they can operate as one large, high-resolution unit. The size of the physical NOC space and team depends on the size of the organization and data center.
The video wall is where alerts will first appear, specifically showing technicians where an issue is occurring and what device or line is affected. The video screens may also broadcast news and track weather to allow technicians to plan around ongoing issues that may affect broader network operations. The video wall is also connected to individual workstations throughout the room, where technicians are assigned to monitor a specific technology or pain point. From there, technicians can drill down on related issues and follow protocols that have been developed to resolve the incident.
Each workstation includes multiple monitors, making it quicker and easier for technicians to analyze information and respond more efficiently. Each station is also connected to a PA system of sorts, making it possible for technicians to communicate with one another and share information in a timely manner. Technicians can also place alert details on the video wall screen for everyone to review.
In large enterprises, you will often find a separate room that is dedicated to a team that manages serious network incidents.
Typically, a NOC will take a hierarchical approach to incident management. Technicians are categorized as Level 1, 2 or 3 based on their skill and experience in resolving specific issues. Once a NOC technician discovers a problem, they create a ticket that categorizes the issue based on alert type and severity, along with other criteria. If the technician at the assigned level fails to resolve it quickly enough, the ticket moves up to the next level and continues to escalate until the issue is fully resolved.
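The escalation flow described above can be sketched roughly as follows. This is an illustration only: the `Ticket` fields, the `review` function and the per-level time limits are hypothetical, and a real NOC would define its own criteria in its ticketing system.

```python
from dataclasses import dataclass, field

MAX_LEVEL = 3  # Level 1, 2 or 3, as described above

# Hypothetical time limits (in minutes) each level gets before a ticket escalates.
TIME_LIMIT_MIN = {1: 15, 2: 30, 3: 60}


@dataclass
class Ticket:
    summary: str
    alert_type: str
    severity: str
    level: int = 1            # level currently responsible for the ticket
    resolved: bool = False
    history: list = field(default_factory=list)


def review(ticket: Ticket, minutes_at_level: int) -> Ticket:
    """Move an unresolved ticket up one level once it exceeds its time limit."""
    if ticket.resolved:
        return ticket
    overdue = minutes_at_level > TIME_LIMIT_MIN[ticket.level]
    if overdue and ticket.level < MAX_LEVEL:
        ticket.history.append(
            f"escalated from Level {ticket.level} after {minutes_at_level} min"
        )
        ticket.level += 1
    return ticket


if __name__ == "__main__":
    ticket = Ticket("Core switch unreachable", "connectivity", "high")
    review(ticket, minutes_at_level=20)  # exceeds the Level 1 limit of 15 minutes
    print(ticket.level, ticket.history)
```

A ticket already at Level 3 stays there; in practice that is where a dedicated major-incident team might take over.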
What is the difference between a NOC and a SOC?
While the NOC is focused on network performance and availability, a security operations center (SOC) consists of tools and personnel who monitor, detect and analyze an organization’s security health 24/7/365.
Technicians in the NOC are searching for issues that could impede network speed and availability, while technicians in the SOC are tasked with rooting out cybersecurity threats and responding to attacks. The SOC is focused on protecting customer data and intellectual property as well. NOCs tend to deal with network events that are common and occur naturally, whereas SOCs are almost always responding to outside threats targeting the enterprise network.
Both the NOC and SOC serve critical functions for the organization — to identify, investigate and resolve issues — and both work hard to resolve problems quickly before they impact the business. Additionally, both tend to operate similarly using a hierarchical approach to resolving incidents. However, they focus on very different issues. As a result, the skills, knowledge and approaches of personnel in both groups are also different. A NOC technician must understand the ins and outs of network and application monitoring and management, while a SOC analyst will focus exclusively on security.
That said, SOCs and NOCs should collaborate to work through major incidents and resolve crises, so the two teams shouldn’t be siloed. Surprisingly, nearly a third of companies report little to no contact between the NOC and SOC, and another twenty percent say the teams only work together during emergencies, according to SANS research. However, experts push for better NOC/SOC integration. Integrating the two — even if they largely remain separate in the day-to-day — starts with establishing operating procedures, automating certain actions, and adopting tools that make it possible to collect and share network monitoring data across both the NOC and SOC.
Can a NOC provide SOC functionality?
When it’s not feasible to establish a separate NOC and SOC, a NOC can monitor and resolve security issues, though it’s not ideal. NOCs can and do detect security threats as they pertain to network performance, and trained staff can effectively respond to them. That latter point is the key, though. Technicians must be looking for security threats — and they must have the skills to be able to respond to them. Technicians who are highly skilled in both network performance and security are hard to come by.
In addition to the right mix of skill sets, the security-oriented NOC would need the right tools for security problem resolution. For example, a security information and event management (SIEM) system — a single security management system that offers full visibility into activity taking place on your network — is a core tool. SIEM systems collect, parse and categorize machine data from a wide range of sources on a network and analyze the data so you can act in real time. In short, SIEM automates much of the workload of a typical SOC team. This surfaces incidents while also reducing false positives, making it easier for a properly staffed NOC to keep an eye on security.
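The collect, parse and categorize steps can be illustrated with a toy sketch. It is not how any particular SIEM product works: the log format, the keyword rules and all function names are assumptions made for the example.

```python
import re
from collections import Counter

# Hypothetical log format: "<timestamp> <host> <message>"
LINE_RE = re.compile(r"^(?P<ts>\S+) (?P<host>\S+) (?P<msg>.+)$")

# Hypothetical keyword rules that map message text to a category.
RULES = [
    ("auth_failure", re.compile(r"failed (password|login)", re.IGNORECASE)),
    ("link_down", re.compile(r"link down|interface .* down", re.IGNORECASE)),
]


def parse(line):
    """Split a raw log line into fields; return None if it does not match."""
    match = LINE_RE.match(line.strip())
    return match.groupdict() if match else None


def categorize(event):
    """Return the first matching category name, or 'other'."""
    for name, pattern in RULES:
        if pattern.search(event["msg"]):
            return name
    return "other"


def summarize(lines):
    """Count events per (host, category), skipping unparseable lines."""
    counts = Counter()
    for line in lines:
        event = parse(line)
        if event is None:
            continue
        counts[(event["host"], categorize(event))] += 1
    return counts


if __name__ == "__main__":
    sample = [
        "2024-01-01T00:00:00Z fw1 Failed password for admin",
        "2024-01-01T00:00:05Z sw2 Interface Gi0/1 down",
    ]
    for key, count in summarize(sample).items():
        print(key, count)
```

Aggregating by host and category is one simple way to surface patterns, such as repeated authentication failures on one device, rather than reviewing raw lines one by one.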
Make training and knowledge development a top priority: Your NOC team must have high-level expertise, specifically in monitoring, managing and resolving issues specific to network performance and your IT infrastructure. Provide robust and frequent training on procedures and protocols for any event, and keep up with the changing tech landscape and changes to your own IT environment. Prioritize network performance issues, but don’t overlook procedures for collaborating with your SOC on security issues. A key procedural issue is escalation; make sure your staff knows how and when to make the quick decision to escalate a growing problem to a more experienced teammate.
Define clear roles: Flatter organizational hierarchies are more popular these days. In the fast-paced, must-act-now world of network monitoring, it makes sense to empower each team member rather than rigidly insisting on rank- or role-based handoffs. Yet while technicians should be equipped with the knowledge and authority to act quickly to prevent network failures, you still need escalation tiers and shift supervisors to oversee the NOC.
While NOC technicians should be left largely to do their jobs and offer insight, and certainly should not be micromanaged, you need a leader who assigns work to technicians based on their skills, prioritizes tasks, prepares reports, ensures incidents are being resolved properly and notifies the broader organization of events as needed. Additionally, each technician should know specifically what tasks will be expected of them, their level and the line of reporting should they need to escalate an incident or respond to one.
Enable strong communication: Keeping the lines of communication open, both within the NOC and with the SOC and other external teams, can prove challenging. It takes more than setting up a few periodic meetings; it takes a concerted effort to train staff on how and when to share information, and to hold them accountable for doing so. Creating regular opportunities for collaboration and coordination is key to a solid NOC.
Establish clear guidelines and protocols: Keep things running smoothly by creating clear-cut policies for the following:
- Incident management: Document steps the technicians should follow to handle incidents (e.g., when the technician can make the decision, when to escalate the decision, when to notify team members, and so on).
- Solutions: Outline procedures for dealing with common issues and provide immediate methods for dealing with emergencies.
- Escalation: Determine how the team should escalate issues and to whom.
- Prioritization: Establish which incidents take highest priority and which technician level should handle the most important ones. Incidents should be ranked based on the extent to which they will affect the business.
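A prioritization policy like the one in the list above could be encoded along these lines. The scoring thresholds, the `Incident` fields and the priority-to-level mapping are all hypothetical; each organization would substitute its own measures of business impact.

```python
from dataclasses import dataclass


@dataclass
class Incident:
    name: str
    users_affected: int
    revenue_impacting: bool


# Hypothetical mapping from priority (1 = highest) to the technician level that handles it.
HANDLER_LEVEL = {1: 3, 2: 2, 3: 1}


def priority(incident: Incident) -> int:
    """Rank by business impact: revenue first, then breadth of user impact."""
    if incident.revenue_impacting:
        return 1
    if incident.users_affected >= 100:  # hypothetical threshold
        return 2
    return 3


def triage(incidents):
    """Order incidents from most to least urgent; larger user impact breaks ties."""
    return sorted(incidents, key=lambda i: (priority(i), -i.users_affected))


if __name__ == "__main__":
    queue = triage([
        Incident("Printer offline", 5, False),
        Incident("Checkout API down", 2000, True),
        Incident("VPN slow", 300, False),
    ])
    for incident in queue:
        p = priority(incident)
        print(f"P{p} -> Level {HANDLER_LEVEL[p]}: {incident.name}")
```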
Having well-established protocols ensures everyone is on the same page, provides consistency across the organization and increases accountability among NOC staff. Of course, having the right people and processes in place lays the foundation, but the actual work can’t be done without the right tools.
Which tool you invest in is largely dependent on your business needs, but your NOC requires a tool or combination of tools that provide the following:
- A comprehensive view of your infrastructure: Whether it’s physical, virtual or in the cloud.
- Automation: To cut down on repetitive tasks, freeing up Level 1 staff to focus on higher-priority issues while also reducing alert fatigue.
- Ticket management: So you can view information related to open tickets, including priority, task and assigned technician, to ensure you resolve internal and external issues quickly.
- Incident reporting: A tool that provides visual analysis and graphical representation of thresholds, alarms, indicators and trends makes it easier to investigate issues and document them for the future.
- A simple interface and deployment: You want to see immediate benefits and not have to battle through a lengthy, complex deployment and a long learning curve.
- Scalability: As your business grows, you want to ensure the NOC can handle it.
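As one small example of the automation item in the list above, repeated alerts for the same device and alert type can be suppressed within a time window so technicians see each problem once. This is a sketch under stated assumptions: the tuple layout, the function name and the five-minute window are hypothetical, and the input is assumed to be sorted by time.

```python
def deduplicate(alerts, window_seconds=300):
    """Drop repeats of the same (device, alert type) seen within the window.

    `alerts` is an iterable of (timestamp_seconds, device, alert_type) tuples,
    assumed to be sorted by timestamp. The window is measured from the last
    alert that was kept for that device and alert type.
    """
    last_kept = {}
    kept = []
    for ts, device, kind in alerts:
        key = (device, kind)
        if key in last_kept and ts - last_kept[key] < window_seconds:
            continue  # repeat inside the window: suppress it
        last_kept[key] = ts
        kept.append((ts, device, kind))
    return kept


if __name__ == "__main__":
    raw = [
        (0, "router-1", "high_cpu"),
        (60, "router-1", "high_cpu"),   # repeat after 60 s: suppressed
        (120, "router-2", "high_cpu"),  # different device: kept
        (400, "router-1", "high_cpu"),  # 400 s after the last kept alert: kept
    ]
    for alert in deduplicate(raw):
        print(alert)
```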
The bottom line: NOC yourself out
The network operations center is one of the most important functional teams in IT. Every single day, you have internal and external customers relying on your IT services. You have SLAs (service level agreements) to meet, core business productivity to enable, and your customers’ entire digital experience to maintain.
It’s important to have a NOC capable of preventing catastrophic outages and maximizing uptime of all IT services. Many organizations have a NOC but may struggle to keep it fully staffed, properly trained, and well-equipped with the best tools and automation. Organizations that cannot maintain an effective NOC may have more success with third-party service vendors, also known as managed service providers.