New: Machine Learning in Splunk Enterprise Security Content Update

Splunk recently released the 4.2 version of the Machine Learning Toolkit (MLTK), featuring a new algorithm—the probability density function. This algorithm is used to determine where values of a data set are expected to fall, based on historical values. It can help you identify anomalous values for a particular data set. The implementation of this algorithm in the MLTK means that we can now leverage machine learning (ML) techniques for identifying outliers in security-related data.

There are many cases in which identifying these anomalies is useful in a security context. Splunk Enterprise Security Content Update (ESCU) contains several searches that look for spikes in various data that may be indicative of malicious activity in your environment. These searches currently use Splunk's computational capabilities to calculate the standard deviation for a set of data points and then look for values that exceed some multiple of that number. While this technique is useful and often sufficient, leveraging the new DensityFunction algorithm in the MLTK provides several advantages (take a look at this blog post on the Splunk Machine Learning Toolkit 4.2 for a deeper dive).
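To make the standard-deviation technique concrete, here is a minimal Python sketch of the idea. It is not the ESCU search itself. The function name, the multiplier of 3, and the sample counts are all assumptions chosen for illustration.

```python
from statistics import mean, pstdev


def stdev_outliers(history, new_values, multiplier=3.0):
    """Return the new values that exceed mean + multiplier * stdev of history.

    `multiplier` is a hypothetical tuning knob; 3.0 is only an example.
    """
    avg = mean(history)
    spread = pstdev(history)  # population standard deviation of the baseline
    threshold = avg + multiplier * spread
    return [v for v in new_values if v > threshold]


# Made-up hourly SMB connection counts for a single host (illustrative only)
history = [10, 12, 11, 9, 13, 10, 12, 11]
print(stdev_outliers(history, [12, 14, 40]))
```

Only the value 40 is reported here, because it is the only one above the computed threshold. A fixed multiple of the standard deviation works well when the data is roughly bell-shaped, and less well when it is not.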

In release 1.0.38 of ESCU, we introduced MLTK versions of three of these searches. They are designed to look for spikes in SMB connections, unusually long command lines on your endpoints, and unusually long DNS queries (which could be attributed to machine-generated domain names or to misuse of the DNS protocol for nefarious purposes). Because we need two searches for each MLTK detection, one to build the model and the other to leverage it for detection, this resulted in a total of six new searches.

How the New ML Content Works

The new ML-related content in ESCU takes the form of six searches—three support searches that are used to create the ML models and three detection searches that use the models built by the support searches to look at new data and identify the outliers, relative to historical norms. The new searches are:

The three support searches use the MLTK “fit” command to build a model based on existing data. Each must be run before its corresponding detection search, because the detection searches will fail if the models are not available. Once the models have been built, the detection searches use the “apply” command to compare incoming data against the model and generate a notable event when they identify an outlier.
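The two-stage pattern described above (build a model from history, then score new data against it) can be sketched in Python as follows. This is an analogy, not the MLTK implementation. It assumes the data is approximately normally distributed and treats the outer 1% of the fitted curve as anomalous. The function names, the 0.01 threshold, and the sample values are hypothetical.

```python
from statistics import NormalDist


def fit_model(history, threshold=0.01):
    """Build a simple 'model': the value range covering all but `threshold`
    of the area under a normal curve fitted to the historical data."""
    dist = NormalDist.from_samples(history)
    return {
        "lower": dist.inv_cdf(threshold / 2),      # cut-off for the low tail
        "upper": dist.inv_cdf(1 - threshold / 2),  # cut-off for the high tail
    }


def apply_model(model, values):
    """Pair each new value with True if it falls outside the modeled range."""
    return [(v, v < model["lower"] or v > model["upper"]) for v in values]


# Made-up command-line lengths used as the baseline (illustrative only)
model = fit_model([40, 42, 38, 45, 41, 39, 43, 44, 40, 42])

# Score new observations against the stored model
for value, is_outlier in apply_model(model, [41, 300, 5]):
    print(value, is_outlier)
```

As with the real searches, the scoring step cannot run until the model exists, which is why the model-building step has to come first.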

Getting It All Up and Running

If you’ve never used ESCU before, go ahead and pull it down from Splunkbase and give it a try. This free subscription service provides you with Analytic Stories—themed security guides loaded with searches designed to help you secure your environment and investigate suspicious activity. It’s a simple install that will give you an interface to explore the content we provide. It’s designed to work with Splunk Enterprise Security, but you can explore the provided searches without it as well.

To use the new searches that leverage the DensityFunction algorithm in MLTK, you’ll need to make sure you have version 4.2 or greater of the MLTK installed on your search heads, in addition to version 1.4 or greater of Python for Scientific Computing (a required dependency). In addition, to use the MLTK commands "fit" and "apply" in ES, you’ll need to visit the “App Imports” configuration and follow the steps outlined here.

That’s it! You can now use the DensityFunction algorithm to hunt for the more subtle "tells" of malicious activity that are otherwise difficult to see.

Finding and Using the ML Content

The new ML-related searches in ESCU are peppered throughout various Analytic Stories and appear next to their original non-MLTK versions. If you’re running ES, the detection searches will show up in Content Management. You can quickly find them by navigating there and typing MLTK in the filter. You can further filter on the ES Content Update app, if needed, as well. From here, you can modify and execute the baseline searches. You can also enable the associated detection searches, which are, by default, scheduled to run every hour.

The searches that build the models are set to run over your last 30 days' worth of data. You can edit a search to cover a longer period, which generally puts more data into the construction of your model and gives better results. However, this is not always the case, especially when the baseline of the data is itself shifting over time; then data from a shorter time frame may be more reflective of today’s “normal.” Similarly, you will probably want to rebuild the model periodically to make sure it accurately reflects your current environment. Because these models are built on your data and everyone’s data is a little different, there is no one right answer for how much data to use or how often to rebuild the model. However, 30 days of data is likely a good starting point, and you can adjust based on your results. If you plan to really dive in and leverage the new algorithm in MLTK, read more on how to use it in Splunk Docs.
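The effect of the training window can be illustrated with a small Python sketch. The data is invented and drifts steadily over time, so a longer window is dominated by values that no longer reflect the present. The function name, the dates, and the numbers are assumptions for illustration only.

```python
from datetime import datetime, timedelta
from statistics import mean


def training_window(events, now, days=30):
    """Keep only the values whose timestamp falls within the last `days` days."""
    cutoff = now - timedelta(days=days)
    return [value for ts, value in events if cutoff <= ts <= now]


# Hypothetical metric sampled every 10 days; older samples have higher values,
# so the metric has been falling steadily toward the present.
now = datetime(2019, 3, 31)
events = [(now - timedelta(days=d), 100 + d) for d in range(0, 90, 10)]

recent = training_window(events, now, days=30)
longer = training_window(events, now, days=90)

# The two windows produce different baselines for the same metric
print(len(recent), mean(recent))
print(len(longer), mean(longer))
```

In this made-up case the 90-day baseline sits well above the 30-day one, so a model built on the longer window would treat values from the older regime as normal. Which window is right depends on how quickly your own data shifts.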

Written by: Rico Valdez, Principal Security Researcher
