The Exploit Prediction Scoring System (EPSS) Explained

Cybersecurity is complex — anticipating cybersecurity events is another challenge altogether. 

We could argue that most events can be described by some probabilistic phenomenon, but attempting to define that phenomenon is where things get tricky. 

IT environment exposure presents real risks, but mathematically (or statistically), we can only aim to describe the likelihood of a cyberattack by accounting for a finite set of factors. As systems and their behavior become more complex, it also becomes far more challenging to describe their behavior with objective certainty. 

While this could apply to any IT operations process, today we’re talking specifically about the process of addressing vulnerabilities. We can only know the vulnerabilities we know, and we can only guess which ones are likely to present genuine threats to our systems in the near future.

In other words, we’re often being asked to solve the puzzle with only a fraction of the pieces. Thankfully, the sheer scope of this problem has brought cybersecurity professionals together in search of those missing pieces. That’s where EPSS comes in, so let’s break it down.

Exploit Prediction Scoring System (EPSS)

Started in 2019, the Exploit Prediction Scoring System (EPSS) is an open, community-driven effort to model and manage vulnerability risk from a probabilistic perspective. EPSS is governed by the Forum of Incident Response and Security Teams (FIRST), an organization responsible for a number of vulnerability scoring protocols.

According to research, businesses and technology vendors fix only 5-20% of vulnerabilities every month. Yet only 2-7% of vulnerabilities are ever exploited. But which ones exactly? Since we cannot be sure which vulnerabilities need to be managed first, and since we cannot fix them all immediately, we need to prioritize.

This is what the EPSS is designed to achieve: a community initiative where each discovered vulnerability gets a probability score 0-1 (0-100%), corresponding to the probability it may be exploited within the next 30 days. EPSS attempts to categorize Common Vulnerabilities and Exposures (CVEs) through aggregation and analysis of prior knowledge.
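
As a minimal sketch of what prioritizing by score can look like, the snippet below ranks vulnerabilities by EPSS score and keeps only as many as a team can handle. The identifiers, the score values and the `top_priorities` helper are hypothetical, made up purely for illustration.

```python
# Hypothetical EPSS scores: estimated probability of exploitation
# within the next 30 days. Identifiers and values are made up.
scores = {
    "CVE-A": 0.02,
    "CVE-B": 0.91,
    "CVE-C": 0.35,
    "CVE-D": 0.007,
}


def top_priorities(epss_scores, capacity):
    """Return the `capacity` identifiers with the highest EPSS scores."""
    ranked = sorted(epss_scores, key=epss_scores.get, reverse=True)
    return ranked[:capacity]


# A team that can only fix two vulnerabilities this cycle:
print(top_priorities(scores, capacity=2))  # ['CVE-B', 'CVE-C']
```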

How EPSS works

EPSS takes data from multiple sources ranging from vendor reports to data published by researchers and white hat hackers. The ground truth, or observed targeting of vulnerabilities, is used to update prior beliefs about the risk and adjust EPSS scores accordingly.

The EPSS model is fairly large, accounting for over 1,100 variables, each capturing a distinct attribute of vulnerability risk. When EPSS-based prioritization is checked against observed exploitation, each vulnerability falls into one of four categories:

  • False Positives (FP): incorrectly prioritized vulnerabilities that had a high EPSS score but were not exploited according to the observed data.
  • True Positives (TP): correctly prioritized vulnerabilities that had a high EPSS score and were found to be exploited in the real world.
  • False Negatives (FN): incorrectly delayed vulnerabilities that were given a low EPSS score but were found to be exploited in the observed data.
  • True Negatives (TN): correctly delayed vulnerabilities that were given a low EPSS score and were not found to be exploited.

From a macro perspective, EPSS also aims to measure:

  • Efficiency: how efficiently remediation effort was spent, measured as the proportion of prioritized (high-score) vulnerabilities that were actually exploited. Mathematically: TP ÷ (TP+FP).
  • Coverage: the proportion of exploited vulnerabilities that were covered (with a high probability score) by the EPSS model. Mathematically: TP ÷ (TP+FN).
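
The four categories and the two ratios above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the EPSS project's own tooling: "high score" is taken to mean a score at or above a chosen threshold, and the identifiers, scores and exploitation data are made up.

```python
def evaluate(scores, exploited, threshold):
    """Count TP/FP/FN/TN for a 'prioritize if score >= threshold' rule."""
    counts = {"TP": 0, "FP": 0, "FN": 0, "TN": 0}
    for cve, score in scores.items():
        prioritized = score >= threshold
        was_exploited = cve in exploited
        if prioritized and was_exploited:
            counts["TP"] += 1
        elif prioritized:
            counts["FP"] += 1
        elif was_exploited:
            counts["FN"] += 1
        else:
            counts["TN"] += 1
    return counts


def efficiency(c):
    """TP / (TP + FP): share of prioritized vulnerabilities that were exploited."""
    flagged = c["TP"] + c["FP"]
    return c["TP"] / flagged if flagged else 0.0


def coverage(c):
    """TP / (TP + FN): share of exploited vulnerabilities that were prioritized."""
    hit = c["TP"] + c["FN"]
    return c["TP"] / hit if hit else 0.0


# Hypothetical scores and observed exploitation.
scores = {"CVE-A": 0.92, "CVE-B": 0.60, "CVE-C": 0.40, "CVE-D": 0.05, "CVE-E": 0.01}
exploited = {"CVE-A", "CVE-C"}

counts = evaluate(scores, exploited, threshold=0.5)
print(counts)              # {'TP': 1, 'FP': 1, 'FN': 1, 'TN': 2}
print(efficiency(counts))  # 0.5
print(coverage(counts))    # 0.5
```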

The idea behind using an EPSS model is to use all available knowledge of vulnerabilities in the cybersecurity community, and then devise your risk tolerance levels and vulnerability management activities based on scores that deliver the best efficiency and highest coverage.
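
One way to explore that trade-off is to sweep a few candidate thresholds and compare efficiency and coverage at each. The sketch below assumes a simple "prioritize if score is at or above the threshold" rule and uses made-up data; the `sweep` helper and the threshold values are hypothetical.

```python
def sweep(scores, exploited, thresholds):
    """Return (threshold, efficiency, coverage) for each candidate threshold.

    Assumes every exploited identifier also appears in `scores`.
    """
    rows = []
    for t in thresholds:
        flagged = {cve for cve, s in scores.items() if s >= t}
        tp = len(flagged & exploited)
        eff = tp / len(flagged) if flagged else 0.0
        cov = tp / len(exploited) if exploited else 0.0
        rows.append((t, eff, cov))
    return rows


# Hypothetical scores and observed exploitation.
scores = {"CVE-A": 0.92, "CVE-B": 0.60, "CVE-C": 0.40, "CVE-D": 0.05, "CVE-E": 0.01}
exploited = {"CVE-A", "CVE-C"}

rows = sweep(scores, exploited, thresholds=[0.1, 0.5, 0.9])
for t, eff, cov in rows:
    print(f"threshold={t:.1f}  efficiency={eff:.2f}  coverage={cov:.2f}")
```

With this toy data, the lowest threshold covers every exploited vulnerability but wastes effort on one that was never exploited, while the highest threshold wastes nothing but misses half of the exploited ones. Where to settle between the two depends on your risk tolerance.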

What EPSS means for cybersecurity 

EPSS attaches measurable metrics to vulnerability profiles, allowing teams to better address system issues. When analyzing a system, there are plenty of circumstances we may not be readily aware of, which can ultimately sway our response approach. Two major factors drive this uncertainty:

Rapid changes in technology and user base

Your user base, and the ways users need to interact with and access your technology systems, are ever-evolving.

You cannot know with certainty, or specify exactly, how these interactions will evolve.

For example, the unexpected and sudden lockdown decisions by governments during the Covid-19 pandemic led customers to shop frantically online for essential items in e-commerce stores. This led to a sudden rise in online traffic, which could look like a DoS attack. At the very least, this traffic overwhelmed the networks of small e-commerce stores in some parts of the world.

Prior to the global pandemic, small e-commerce stores had rarely observed a surge in demand for online shopping or panic buying of essential items, outside of the holiday season.

Unknown vulnerabilities

The second problem is more open-ended: if an access request is incorrectly authorized by the network as a legitimate request, your data assets are only secure as long as they are encrypted. 

In many cases, a vulnerability in the software, inadequate identity and access management or a zero-day exploit will allow cybercriminals to bypass your network security defenses. And since you do not know that the vulnerability exists within your system, you can only infer its behavior based on prior beliefs about traffic patterns, API requests, user activities and interactions.

EPSS as a solution

In essence, to solve the problem of cybersecurity risk management, you need to accurately model the likelihood or probability of a vulnerability or an anomaly, based on prior knowledge of threats facing your organization.

This is exactly what any machine learning-based risk management tool would do. The remaining problem, perhaps beyond the capacity of any ML-based tool on its own, is obtaining accurate prior knowledge of risks and vulnerabilities. That prior knowledge then guides advanced ML tools to focus on the most prevalent security risks instead of simply attempting to fix them all right away.

EPSS’s community-driven approach attempts to do just that: provide a usable repository of historical knowledge in order to defend against future threats.

EPSS vs CVSS: what’s the difference?

If you’re a cybersecurity professional, you’re likely more familiar with the Common Vulnerability Scoring System (CVSS). CVSS has held its spot as the industry standard for nearly two decades.

Where EPSS attempts to measure the probability of a vulnerability being used in an exploit, CVSS attempts to assess the severity of a given vulnerability. This means CVSS is concerned with three areas:

  • Base metrics: qualities intrinsic to a vulnerability, including the attack vector, the privileges an attacker will need to utilize the vulnerability and the amount of data at stake.
  • Temporal metrics: factors that change over time, including exploit development timelines and patch remediation turnarounds.
  • Environmental metrics: properties of the environment, such as security controls.

In short, EPSS allows us to prioritize the most pressing vulnerabilities by providing threat actor information and a probabilistic understanding of threats, while CVSS tells us how dangerous a particular vulnerability might be if exploited.

EPSS is the newer methodology and already seems to be outperforming old CVSS models in its emergent state — a product of its focus on vulnerability prioritization extending beyond just incident severity prediction.

Because EPSS is so new, though, a hybrid approach of CVSS and EPSS is likely the most appropriate methodology. As technology evolves and these methodologies evolve alongside it, organizations will need to continuously measure their efficacy in stamping out cyberthreats. Thankfully, this isn’t a challenge they have to face alone.
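
To make the idea of a hybrid approach concrete, here is one possible sketch that combines a CVSS base score (severity) with an EPSS score (likelihood). Everything in it is an assumption for illustration: the `triage` helper, the four labels and the EPSS cutoff of 0.1 are hypothetical, and the CVSS cutoff of 7.0 is simply where the "High" severity band begins. Neither FIRST nor this article prescribes this rule.

```python
def triage(cvss_base, epss, cvss_cutoff=7.0, epss_cutoff=0.1):
    """Bucket a vulnerability by severity (CVSS) and likelihood (EPSS).

    Cutoffs and labels are illustrative assumptions, not a standard.
    """
    severe = cvss_base >= cvss_cutoff
    likely = epss >= epss_cutoff
    if severe and likely:
        return "fix first"   # dangerous and likely to be exploited
    if likely:
        return "fix soon"    # likely to be exploited, lower impact
    if severe:
        return "monitor"     # high impact, but exploitation unlikely for now
    return "defer"


print(triage(cvss_base=9.8, epss=0.90))   # fix first
print(triage(cvss_base=5.0, epss=0.50))   # fix soon
print(triage(cvss_base=9.0, epss=0.01))   # monitor
print(triage(cvss_base=3.0, epss=0.001))  # defer
```

Because EPSS scores change as new exploitation data arrives, a rule like this would need to be re-run regularly and its cutoffs revisited against your own efficiency and coverage results.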

This posting does not necessarily represent Splunk's position, strategies or opinion.

Posted by Muhammad Raza

Muhammad Raza is a technology writer who specializes in cybersecurity, software development and machine learning and AI.