
Baselines are an essential part of effective cybersecurity. They provide a snapshot of normal activity within your network, which enables you to easily identify abnormal or suspicious behavior. Baseline hunting is a proactive approach to threat detection that involves setting up a baseline of normal activity, monitoring that baseline for deviations, and investigating any suspicious activity.
The PEAK Threat Hunting Framework identifies three types of hunts:
- Hypothesis-Driven Hunts
- Model-Assisted Threat Hunts (M-ATH)
- Baseline Hunts
In this article, let's take an in-depth look at baseline hunts, also known as Exploratory Data Analysis (EDA) hunts.
(This article is part of our PEAK Threat Hunting Framework series. Explore the framework to unlock happy hunting!)
When to Perform a Baseline Hunt
Baselining can help you familiarize yourself with new datasets or environments where you've never hunted before. It serves as an excellent precursor to more focused hypothesis-based or model-assisted threat hunting. Before planning and scoping future hunts, it's important to understand the available data sources, their fields, and values. After all, the "K" in PEAK stands for Knowledge!
You can run a baseline hunt at any time, and some situations naturally lend themselves to this type of hunt. For example, when you onboard a new type of security log, baselining that data source will be very helpful to you while you’re trying to figure out how best to use it for detection and response operations.
Another prime baselining opportunity would be when you start hunting in a new environment, such as when you acquire a new company or onboard a new managed security customer. Figuring out what normal activity looks like is a necessary first step in planning any type of monitoring or writing response playbooks.
How to Perform a Baseline Hunt
As with all PEAK hunts, baseline hunts are divided into three major phases: Prepare, Execute, and Act. Let’s examine each of these phases in detail.
(Figure: The PEAK baseline hunting process)
Phase 1. Prepare
All hunts start with the “Prepare” phase. This is where you do all the things necessary to get ready and to ensure a successful hunt. Let’s see what this looks like for a baseline hunt.
Select Data Source
The first step is to decide which data source you’d like to baseline. If you're starting from square one, you should make an effort to baseline all of your critical data sources. Start with the ones your hunt team relies on most, or maybe with the most security-relevant sources. If you’re not sure where to start, prioritize data sources according to their significance to your organization and its detection goals.
Research Data Source
Once you’ve determined which data source you’re going to focus on, you’ll want to become as familiar with it as possible. If this is a common log source that many organizations deal with, such as a Windows event log or events from a common security product, a good starting point might be to find out what the vendor has to say about what’s in the data. You’ll want to:
- Identify the key fields and how to interpret their values.
- Maybe find some specific situations to look out for based on others’ experiences with that type of data.
While you’re doing your research, don’t forget to include any existing monitoring or detection measures implemented for that data, as well as the individuals or teams responsible for the systems or applications creating the data. The former can help focus future hunts, while the latter will be useful if you have questions about the logs or how to interpret them.
Scope Hunt
When conducting a hunt, it's important to narrow your focus, especially in larger environments where analyzing all the data at once may not be possible. Different systems may exhibit different behaviors, so it's helpful to group them based on similarities (such as "user desktops" or "application servers") and baseline each group individually. This approach is more manageable and more likely to yield useful results, since behavior that's normal for one group may be an outlier for another.
Another important decision is the timeframe for data collection. Baselines are created by analyzing normal activity over a period of time, so it's essential to use enough historical data to establish what's normal. However, it's also crucial to balance having enough data with keeping the window size reasonable to avoid being overwhelmed with too much data to analyze. For most sources, between 30 and 90 days of data will probably be fine.
Plan
Using what you learned from your research, outline the tools, techniques, and resources you'll need to baseline your data source(s).
- How exactly will you gather the data you need?
- Which analytic techniques will you use to assess what you’re searching for?
- If you have a hunt team (as opposed to an individual hunter), who’s doing what part(s) of the hunt?
Making a good plan helps to ensure the execution phase goes smoothly, so it’s worth spending a little time here.
Phase 2. Execute
With your data sources determined and a plan in place, we move into the “Execute” phase.
Gather Data
Following your hunt plan, it's time to collect the data and bring it all back into one place for analysis. In some cases, this may have already happened (for example, if you’re already ingesting the network logs you need into a Splunk index). In other cases, you might have to identify the specific server(s) and locations on disk from which to collect the data.
As part of the data-gathering process, you may also need to filter your dataset according to the system groups and/or timeline you established while scoping the hunt. Large networks may be generating terabytes of data every day. Sifting through this mountain of information manually, or even with automated systems, can be daunting and time-consuming. By filtering your dataset, your analysis will be more efficient and manageable.
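For example, here's a minimal sketch of this kind of filtering in Python with pandas, assuming a hypothetical CSV export with "host" and "timestamp" fields; the file name, host names, and time window are all illustrative.

```python
import pandas as pd

# Hypothetical CSV export of network logs; substitute your real source.
logs = pd.read_csv("network_logs.csv", parse_dates=["timestamp"])

# Keep only the system group and time window chosen while scoping the hunt.
desktops = {"wks-001", "wks-002", "wks-003"}  # example "user desktops" group
window_start = pd.Timestamp("2023-03-01")
window_end = pd.Timestamp("2023-05-30")       # roughly 90 days of history

scoped = logs[
    logs["host"].isin(desktops)
    & logs["timestamp"].between(window_start, window_end)
]
print(f"{len(scoped)} of {len(logs)} events are in scope")
```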
Create Data Dictionary
A data dictionary is a structured repository of information about the data elements used within a data source. It provides a comprehensive description of the fields in the data source, their characteristics, relationships, and usage. Your data dictionary should contain:
- Field names: The names or identifiers of the fields.
- Description: A brief definition of each field and what it is used for.
- Data types: What type of data each field contains (see below).
- Field values: How to interpret the values in each field. In other words, what do they mean?
When it comes to specifying the types of data for each field, here are some of the most common:
- Numerical: This type comprises both continuous and discrete data. Continuous data refers to measurements or observations that can take any value within a range, such as latency measured in milliseconds, which you might use to detect DoS attacks against a network. On the other hand, discrete data represents separate and distinct values, like the count of failed login attempts on a system, which can indicate a brute-force attack.
- Categorical: This type falls into either nominal or ordinal data. Nominal data refers to the names of things, such as different types of identified threats in an incident report (‘Malware’, ‘Phishing’, or ‘Exploit Attempt’). Ordinal data, on the other hand, has some sort of implied order, like risk ratings for the incidents (‘Critical’, ‘High’, ‘Medium’, or ‘Low’).
- Textual: This data type includes free-form text or strings. An example could be the log messages generated by an Intrusion Detection System (IDS), which may contain valuable information for threat hunting. Most syslog messages also fall into this category.
- Date/Time: Date/time data in cybersecurity could be epoch timestamps or ISO format (e.g., "2023-06-14T12:34:56Z"), both indicating the exact time when a specific event occurred. Be sure you also understand what time zone these fields represent! (See the sketch after this list.)
- Boolean: This simple binary data type in cybersecurity could denote whether a specific security policy is active (‘true’ or ‘false’), whether a particular port on a firewall is available (‘open’ or ‘closed’), or whether certain security functionality is enabled (‘on’ or ‘off’).
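To illustrate the Date/Time point above, here's a minimal sketch in Python showing that an epoch timestamp and an ISO 8601 string can denote the same UTC instant; the values are illustrative.

```python
from datetime import datetime, timezone

# Illustrative values: the same instant, expressed two different ways.
epoch_ts = 1686746096                # epoch seconds
iso_ts = "2023-06-14T12:34:56Z"      # ISO 8601; the "Z" suffix means UTC

from_epoch = datetime.fromtimestamp(epoch_ts, tz=timezone.utc)
from_iso = datetime.fromisoformat(iso_ts.replace("Z", "+00:00"))
print(from_epoch == from_iso)  # True: both represent the same UTC instant
```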
Because data sources can often contain many different fields, you don’t necessarily have to document each and every field in order to have a workable data dictionary. Often, just choosing the fields that seem to be most relevant for security is sufficient. For example, in a file transfer log, fields like account names, file names, transfer commands, and statuses might be more useful than file sizes or average transfer rates.
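As a concrete illustration, here's a minimal sketch of a data dictionary captured as a Python structure; the fields and value descriptions are hypothetical, loosely modeled on a file transfer log.

```python
# Hypothetical data dictionary for an imaginary file transfer log.
data_dictionary = [
    {
        "field": "account",
        "description": "Account that initiated the transfer",
        "data_type": "categorical (nominal)",
        "values": "Local or domain account names, e.g. 'svc_backup'",
    },
    {
        "field": "command",
        "description": "Transfer command issued by the client",
        "data_type": "categorical (nominal)",
        "values": "'STOR' (upload), 'RETR' (download), 'DELE' (delete)",
    },
    {
        "field": "status",
        "description": "Server response code for the command",
        "data_type": "numerical (discrete)",
        "values": "2xx indicates success; 4xx/5xx indicate failure",
    },
    {
        "field": "timestamp",
        "description": "When the transfer completed",
        "data_type": "date/time",
        "values": "ISO 8601 in UTC",
    },
]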
(Take a deeper dive into how to use data dictionaries.)
Review Distributions
In this step, you’ll use descriptive statistics to summarize the values typically found in each of the key fields in your data dictionary. For example, you might compute:
- The average and/or median of numeric values
- The most common values in each categorical field
- The number of unique values found in that field (AKA the cardinality)
Notice that you’re beginning to define normal behavior. These statistical descriptions are the baseline for normal activity.
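Here's a minimal sketch of these summaries using pandas, assuming a scoped dataset like the one gathered earlier; the file and field names are hypothetical.

```python
import pandas as pd

scoped = pd.read_csv("scoped_logs.csv")  # hypothetical scoped dataset

# Numeric field: average and median (e.g., bytes transferred).
print(scoped["bytes"].mean(), scoped["bytes"].median())

# Categorical field: the most common values.
print(scoped["command"].value_counts().head(10))

# Cardinality: the number of unique values in each field.
print(scoped.nunique())
```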
Investigate Outliers
Now that you have some idea about what “normal” looks like in your data, you can begin to use your baseline to identify anomalies or outliers that might indicate suspicious activity. There are many techniques for this, but here are a few of the most common:
- Stack counting: Also known as stacking or least frequency of occurrence analysis (LFO), this method involves counting the number of occurrences of each unique value and sorting them in ascending order. The values with the lowest counts are considered outliers. In some cases, this can be reversed, with the values with the highest counts being considered the outliers, but this is relatively rare.
- Z-scores: When dealing with numeric values, a statistical test like the z-score can be used. This test flags values that lie more than a certain number of standard deviations above or below the mean. Typically, this threshold is two or three standard deviations (see the sketch after this list).
- Machine learning: For those who want to get fancier, machine learning techniques like isolation forests or density functions can also be used (though these are out of scope for this article).
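Here's a minimal sketch of the first two techniques with pandas, again assuming hypothetical "process_name" and "bytes" fields in the scoped dataset.

```python
import pandas as pd

scoped = pd.read_csv("scoped_logs.csv")  # hypothetical scoped dataset

# Stack counting: tally each unique value and sort ascending; the rarest
# values at the top of the list are the outlier candidates.
stack = scoped["process_name"].value_counts().sort_values(ascending=True)
print(stack.head(20))

# Z-scores: flag values more than 3 standard deviations from the mean.
z = (scoped["bytes"] - scoped["bytes"].mean()) / scoped["bytes"].std()
print(scoped[z.abs() > 3])
```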
(For a deeper look at outlier detection for threat hunters, see our talk on the topic.)
After identifying outliers, you’ll want to investigate each to determine whether they represent security issues or are just benign oddities. It's advisable to seek out correlations or connections between various events or anomalies to uncover any underlying trends or potential security risks.
Gap Analysis
As with most projects involving data, especially new data you’ve never looked at before, things rarely go entirely smoothly. Gap analysis is where you identify challenges you ran into while hunting and, when possible, take action to either resolve or work around them.
Usually, these challenges will be with the data, though in some cases, you might also call out search or analysis tools that didn’t quite work out. For example, you may find that your initial data collection somehow missed data from certain systems. If you can do without those systems, you may elect to just carry on as normal, but if those systems are key to your hunt, you may need to revisit the “Gather Data” phase in order to collect the additional data.
This step also includes validating and documenting whether all valuable fields and values are parsed and extracted correctly.
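A quick way to spot parsing gaps is to measure how often each field is empty; here's a minimal sketch with pandas, using the same hypothetical scoped dataset and host group as before.

```python
import pandas as pd

scoped = pd.read_csv("scoped_logs.csv")  # hypothetical scoped dataset

# Fields with unexpectedly high null rates often indicate extraction problems.
null_rate = scoped.isna().mean().sort_values(ascending=False)
print(null_rate[null_rate > 0])

# Confirm every in-scope system actually reported data.
expected_hosts = {"wks-001", "wks-002", "wks-003"}  # illustrative group
missing = expected_hosts - set(scoped["host"].unique())
print(f"Hosts with no events: {missing or 'none'}")
```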
Identify Relationships
So far, we’ve looked at the data on a field-by-field basis, but it’s important to understand that any non-trivial dataset is also likely to exhibit relationships between the values in different fields. These relationships can hold critical insights, often providing much more context about the event than you can get just by examining individual data points. A classic example is the count of user logins and how they relate to the time of day, with an increase expected during the start of the typical work shift.
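Here's a minimal sketch of exploring that example relationship in pandas, assuming a hypothetical authentication log with a "timestamp" column.

```python
import pandas as pd

auth = pd.read_csv("auth_logs.csv", parse_dates=["timestamp"])  # hypothetical

# Login volume by hour of day; expect a spike at the start of the work shift.
logins_by_hour = auth.groupby(auth["timestamp"].dt.hour).size()
print(logins_by_hour)
```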
Phase 3. Act
Building on our new foundation of knowledge, we can improve our defensive efforts as well as make future hunting efforts easier and more effective. Time to take some action!
Preserve Hunt
Don't let your hard work disappear. Save your hunt data, including the tools and methods you used, so you can look back at it later or share it with other hunters. Many hunt teams use wiki pages to keep track of each hunt:
- Adding links to the data.
- Outlining how they analyzed it.
- Summarizing the important results.
Often, hunters look back at previous hunts when they face similar situations in the future. Do yourself a favor and make sure to document your hunting process. Your future self will thank you.
Document Baseline
Your baseline consists of the data dictionary, statistical descriptions, and field relationships. Even if you took good notes during the "Execute" phase, it's important to turn those notes into a document that others can understand.
Almost any large dataset will have suspicious-looking but benign anomalies. Don't forget to include a list of these known-benign outliers! Documenting those you already identified and investigated during the “Investigate Outliers” phase will save time during future hunts and incident investigations.
(Make the most of each investigation with these postmortem best practices.)
Create Detections
Since you now have some idea about what “normal” looks like in your data, and you probably also have a little experience investigating some of the outliers, you may be able to distill all of this into some automated detections. Examine each of your key fields or common relationships you identified between fields to see if there are certain values or thresholds that would indicate malicious behavior. If so, consider creating rules to automatically generate alerts for these situations.
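To make that concrete, here's a minimal sketch of a threshold-style detection derived from a baseline, assuming a hypothetical authentication log with "account", "timestamp", and "status" fields; the mean-plus-three-standard-deviations threshold is illustrative, not a recommendation.

```python
import pandas as pd

auth = pd.read_csv("auth_logs.csv", parse_dates=["timestamp"])  # hypothetical
failed = auth[auth["status"] == "failure"]

# Count failed logins per account per hour.
hourly = (
    failed.groupby(["account", pd.Grouper(key="timestamp", freq="1h")])
    .size()
    .rename("failures")
    .reset_index()
)

# Alert when an account's hourly failures exceed the baseline threshold.
threshold = hourly["failures"].mean() + 3 * hourly["failures"].std()
print(hourly[hourly["failures"] > threshold])
```

In practice, you'd tune a threshold like this against the known-benign outliers you documented earlier, rather than using a fixed statistical cutoff.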
This may not always be feasible, so don’t worry if you aren’t able to identify good alerting candidates. Baselines are all about identifying abnormal activity, but just because something is abnormal doesn’t mean it’s malicious. Simply alerting on any abnormal behavior is likely to cause a flood of low-quality alerts. The trick with alerting is to identify outliers that are most likely to signal malicious behavior.
Also, even if anomalies aren’t good candidates for automated alerting, they may still be useful as reports or dashboard items that an analyst can manually review on a regular basis or even as starting points for future hunts.
Communicate Findings
As with all types of hunts, baselines are most impactful when you share them with relevant stakeholders to improve overall security posture. In addition to sharing with the owners of the system you baselined, you’ll want to be sure that your SOC analysts, incident responders, and detection engineers are aware that the baseline exists and that they have easy access to it. If your security team keeps a documentation wiki or other knowledge repository, that would be a great place to collect all your baselines. You might also consider linking to the baselines from the playbooks that your SOC analysts use to triage alerts.
Conclusion
Because so much of incident detection, response, and threat hunting relies on identifying deviations from normal behavior, good baselines are crucial for any environment or dataset. Baseline hunts let you discover not only what “normal” looks like but also what the expected benign anomalies are – information that hunters, SOC analysts, incident responders, and detection engineers need in order to do their jobs effectively.
Baseline hunts are also valuable precursors to hypothesis-based or model-assisted threat hunting. So take the time to establish good baselines in your environment and improve your ability to detect and respond to potential security threats.
As always, security at Splunk is a family business. Credit to authors and collaborators: David Bianco, Ryan Fetterman