A Deeper Dive into Machine Learning at Splunk

A typical bit of feedback I have had during my time at Splunk is that the Splunk Machine Learning Toolkit (MLTK) looks nice and all, but how are we supposed to get started using it? Choosing the right technique, let alone the right algorithm, can be a daunting task for those who are unfamiliar with machine learning (ML).

We’ve been thinking long and hard about how we can offer more prescriptive introductions to using ML at Splunk, and I’m pleased to present our set of MLTK deep dives. These build on some of the content presented at recent webinars, such as the one on how to prevent data downtime with machine learning.

The purpose of the deep dives is to provide end-to-end guides for implementing specific use cases against your own data in Splunk. If you are looking for inspiration on where to get started with ML, you need not look any further!

MLTK Deep Dives

To begin with, we have launched five deep dives that address use cases that come up frequently with our customers. It won’t come as a surprise to those who are familiar with ML at Splunk that all of these top five use cases are focused on different applications of anomaly detection. If you made your way to .conf22 this year and attended our ML hands-on workshop, this list of use cases may seem pretty familiar too.

 

For each of these examples we describe:

  • The benefit or motivation behind the use case
  • The data sources they relate to
  • The algorithms or techniques that are relevant
  • How to train a simple model to enable the use case
  • How to apply this model to detect situations of interest
  • Fine-tuning that you might need to perform to get the use case operational in your environment

For example, in the deep dive on using ML to detect outliers in error message rates, we walk through the data sources and algorithms that can be used for this. We then describe how to implement the model training and model application searches against an example data source with our recommended algorithm (no prizes for guessing that it’s DensityFunction).

You can see some images below of the types of output that this deep dive can help you create to spot anomalous error rates in your environment.
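
To make the train-then-apply pattern a little more concrete, here is a minimal Python sketch of the idea. It is not the MLTK implementation and it is not taken from the deep dive: it simply assumes that error counts for each hour of day are roughly normally distributed, and the function names, the sample data and the 0.01 threshold are all hypothetical choices made for illustration.

```python
from collections import defaultdict
from statistics import NormalDist, mean, stdev


def train(history):
    """Fit one normal distribution per hour of day from (hour, error_count) pairs."""
    by_hour = defaultdict(list)
    for hour, count in history:
        by_hour[hour].append(count)
    # Keep only hours with enough spread to estimate a usable baseline.
    return {
        hour: NormalDist(mean(counts), stdev(counts))
        for hour, counts in by_hour.items()
        if len(counts) >= 2 and stdev(counts) > 0
    }


def is_outlier(model, hour, count, threshold=0.01):
    """Flag a count whose two-sided tail probability falls below the threshold."""
    dist = model.get(hour)
    if dist is None:
        return False  # no baseline for this hour, so nothing to compare against
    cdf = dist.cdf(count)
    tail = 2 * min(cdf, 1 - cdf)
    return tail < threshold


# Hypothetical training data: (hour_of_day, error_count) observations.
history = [(9, 10), (9, 12), (9, 11), (9, 9), (9, 13),
           (3, 1), (3, 2), (3, 0), (3, 1), (3, 2)]
model = train(history)

print(is_outlier(model, 9, 11))  # a typical 9am volume
print(is_outlier(model, 9, 40))  # far above the 9am baseline
print(is_outlier(model, 3, 11))  # normal for 9am, unusual for 3am
```

Splitting the baseline by hour of day is the design choice worth noticing here: the same error count can be perfectly ordinary during business hours and highly unusual overnight.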


Out of the Box ML with Enterprise Security Content Update

In addition to these deep dives, we have also been working closely with our friends who create analytics for Enterprise Security Content Update (ESCU) to help ship even more prescriptive ML content. Some of their recent releases include analytics that provide out-of-the-box ML models for detecting certain types of potentially malicious behaviour, such as the use of risky SPL commands. You can read more about these detections here and here.

Getting Started

Luckily for you good folks, we have just fixed a few bugs and released v5.3.3 of MLTK. The best way to get started with all of this content is to grab the latest release of MLTK (plus ESCU if you are a Splunk Enterprise Security customer) and run through some of our new deep dives on your own data.

If you are keen to find out more about ML at Splunk, please check out some of our beta app releases from .conf22, which are described in more detail here. These also offer super simple introductions as to how you can apply ML techniques on your own data.

Finally, the eagle-eyed among you may have noticed that the Deep Learning Toolkit has gone through a rebrand; it is now the Splunk App for Data Science and Deep Learning. For more information on this little gem, I’d recommend checking out any of Philipp Drieger’s blogs!

Happy Splunking!

Greg is a Machine Learning Architect at Splunk, where he helps customers deliver advanced analytics and uncover new insights from their data. Prior to working at Splunk he spent a number of years with Deloitte and, before that, BAE Systems Detica, working as a data scientist. Before getting a proper job he spent way too long at university collecting degrees in maths, including a PhD on “Mathematical Analysis of PWM Processes”. When he is not at work he is usually herding his three young lads around while thinking that work is significantly more relaxing than being at home…