The latest release of the Splunk Machine Learning Toolkit (MLTK) enables users to upload their pre-trained models into MLTK through a simple UI. Once a model is in Splunk, users can apply it to their Splunk data with no modification to their existing workflows. This capability extends the usability of MLTK and ML-SPL beyond models trained within MLTK, unlocking a major use case: running externally trained models on data inside Splunk. MLTK 5.4.0 is generally available for both Splunk Cloud Platform and Splunk Enterprise customers.
MLTK is an easy way for Splunk customers to get started with machine learning. The app provides Showcases and Assistants that guide the user through a series of steps to train, assess, and operationalize ML models. It pairs backend ML components with a frontend app experience, abstracting away the complexity of data science notebooks and hand-written code. MLTK empowers users to bring machine learning into their Splunk workflows using the ML-SPL commands fit, for training ML models, and apply, for running inference. MLTK serves important machine learning use cases such as anomaly detection, forecasting, and clustering, and is one of the most downloaded apps on Splunkbase, with over 185K downloads. Additionally, the fit and apply commands bundled with MLTK are used millions of times every month by our customers.
While customer demand for ML has grown rapidly, many Splunk customers have not been able to incorporate ML into their Splunk user journeys. Most MLTK customers want to bring new algorithms or pre-trained models into MLTK; according to our telemetry data, 80% of the algorithms run in Splunk are customized. However, users find it very challenging to create ML models elsewhere and ship them into Splunk for use with their Splunk data. To use an external model in Splunk, users previously had to convert the model to an MLTK-supported codec format and import custom Python scripts with root permissions, which can be a time-consuming and tedious task. This has been a major pain point, and customers regularly ask for a better way to solve it.
MLTK 5.4 addresses these challenges with the option to upload externally trained ONNX models for inferencing in MLTK. Users can train their models in their preferred third-party environments, save them in ONNX format, upload them to MLTK, and run inference on them with their Splunk data. This way, users can offload process-heavy model training outside the Splunk platform while still benefiting from operationalization within the Splunk platform on their Splunk data.
The uploaded model goes through a series of validation steps, including checks that the user has the required model-upload capability and that the file is in the correct format. After validation and verification, Splunk’s REST API is used to store the model in an MLTK-accessible location within Splunk. Users can then apply the model to their Splunk data in the same way they would a model created with MLTK, a workflow they are already familiar with. This lets users focus on the important task of creating and training their ML models while offloading the complexity of bringing those models into Splunk to MLTK.
Users Can Now Leverage ONNX in Splunk
Prior to this release, users were limited to the ML algorithms and libraries packaged with MLTK. With the 5.4.0 release, however, MLTK supports inferencing ONNX models. ONNX (Open Neural Network Exchange) is a common open format for machine learning models that lets you create models using a variety of machine learning frameworks, tools, runtimes, and compilers. Users can therefore train their models with a wider range of libraries, such as TensorFlow, PyTorch, Keras, and MATLAB, among others.
Uploading and Inferencing Models Workflow
The UI for uploading an external model is simple and intuitive. It asks the user for a few parameters that help verify the model file and are also used during model inference.
Running model inference follows the same workflow that MLTK and ML-SPL users already know well, with one minor difference: the model name after the apply command needs the onnx: prefix. This tells MLTK and the apply command that the model being used for inference is an ONNX model.
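For example, an inference search might take the following shape, where my_model is the name given at upload time and the lookup and field names are illustrative:

```
| inputlookup sensor_readings.csv
| apply onnx:my_model
```

Apart from the onnx: prefix, this is the same apply syntax used for models trained with fit inside MLTK.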
User Permissions and Configurations
The MLTK team is mindful of security concerns and has taken steps to ensure that only users with the appropriate permission can upload model files to Splunk. By default, the ability to upload models is disabled for all users; a Splunk admin must grant a special permission to each user who needs to upload model files.
More Powerful Anomaly Detection Capabilities
In addition to the pre-trained ONNX model capabilities, MLTK 5.4.0 extends the anomaly detection capabilities available to users with a new algorithm for multivariate outlier detection. Users can now provide a multivariate dataset as input to the new MultivariateOutlierDetection algorithm, which performs a series of steps internally to return the outliers in that dataset.
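As a sketch, a search of the following shape would flag multivariate outliers across several numeric fields; the lookup and field names are illustrative, and the algorithm's exact parameters and output fields are described in the MLTK documentation:

```
| inputlookup server_metrics.csv
| fit MultivariateOutlierDetection cpu_load memory_used response_time
```

This follows the standard ML-SPL fit pattern, so it slots into existing MLTK workflows without changes.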
MLTK 5.4.0 is available today on Splunkbase for use with both Splunk Cloud Platform and Splunk Enterprise. For more information on how to use these features, refer to the MLTK documentation. To get started with this new version, visit Splunkbase.
- MLTK Documentation
- MLTK YouTube Videos Playlist
- Webinar: Prevent Data Downtime with Machine Learning: How machine learning can be used to improve the ‘getting data in’ experience with Splunk
- Webinar: ML in Security: Risky SPL Detection with MLTK
- Webinar: Operationalized Data Science for Production Optimization: Saving Costs at BMW Group