The latest release of the Splunk Machine Learning Toolkit (MLTK) enables users to upload their pre-trained models into MLTK through a simple UI. Once a model is in Splunk, users can apply it to their Splunk data with no modification to their existing workflows. This capability extends the usability of MLTK and ML-SPL beyond models trained within MLTK, unlocking a major use case: running externally trained models on data inside Splunk. MLTK 5.4.0 is generally available for both Splunk Cloud Platform and Splunk Enterprise customers.
MLTK is an easy way for Splunk customers to get started with machine learning. The app provides Showcases and Assistants that guide the user through a series of steps to train, assess, and operationalize ML models. It pairs backend ML components with a frontend app experience, abstracting away the complexity of data science notebooks and hand-written code. MLTK empowers users to bring machine learning into their Splunk workflows using the ML-SPL commands fit, for training ML models, and apply, for running inference. MLTK serves important machine learning use cases such as anomaly detection, forecasting, and clustering, and is one of the most downloaded apps on Splunkbase, with over 185K downloads. Additionally, the fit and apply commands bundled with MLTK are used millions of times every month by our customers.
While customer demand for ML has grown rapidly, many Splunk customers have not been able to incorporate ML into their Splunk user journeys. Most MLTK customers want to bring new algorithms or pre-trained models into MLTK; according to our telemetry data, 80% of the algorithms run in Splunk are customized. However, users find it very challenging to create ML models elsewhere and ship them into Splunk for use with their Splunk data. To use an external model in Splunk, users previously had to convert the model to an MLTK-supported codec format and import custom Python scripts with root permissions, which can be a time-consuming and tedious task. This has been a major pain point, and customers regularly ask for a better way to solve it.
MLTK 5.4 addresses these challenges with the option to upload externally trained ONNX models for inferencing in MLTK. Users can train their models in their preferred third-party environments, save them in ONNX format, upload them to MLTK, and run inference on them with their Splunk data. This way, users can offload process-heavy model training outside the Splunk platform while still benefiting from operationalization within the Splunk platform on their Splunk data.
The uploaded model goes through a series of validation steps, including checks that the user has the required model-upload capability and that the file is in the correct format. After validation and verification, Splunk’s REST API is used to store the model in an MLTK-accessible location within Splunk. Users can then apply the model to their Splunk data in the same way they would a model created with MLTK, a workflow they are already familiar with. This lets users focus on the important task of creating and training their ML models while offloading the complexity of bringing those models into Splunk to MLTK.
Users Can Now Leverage ONNX in Splunk
Prior to this release, users were limited to the ML algorithms and libraries packaged with MLTK. With the 5.4.0 release, however, MLTK supports inferencing ONNX models. ONNX (Open Neural Network Exchange) is a common open format for machine learning models that lets you create models using a variety of machine learning frameworks, tools, runtimes, and compilers. Users can therefore train their models with a wider range of libraries, such as TensorFlow, PyTorch, Keras, and MATLAB, among others.
Uploading and Inferencing Models Workflow
The UI for uploading an external model is simple and intuitive. It asks the user for a few parameters that help verify the model file and are also used during model inference.
Running model inference follows the same workflow that MLTK and ML-SPL users already know well, with one minor difference: the model name after the apply command needs the onnx: prefix. This tells MLTK and the apply command that the model being used for inference is an ONNX model.
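For example, an inference search might take the following shape, where my_model is the name given at upload time and the lookup and field names are illustrative:

```
| inputlookup sensor_readings.csv
| apply onnx:my_model
```

Apart from the onnx: prefix, this is the same apply syntax used for models trained with fit inside MLTK.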
User Permissions and Configurations
The MLTK team is mindful of security concerns and has taken steps to ensure that only users with the appropriate permission can upload model files to Splunk. By default, the ability to upload models is disabled for all users; a Splunk admin must grant a special permission to each user who needs to upload model files.
More Powerful Anomaly Detection Capabilities
In addition to the pre-trained ONNX model capabilities, MLTK 5.4.0 extends the anomaly detection capabilities available to users with a new algorithm for multivariate outlier detection. Users can now provide a multivariate dataset as input to the new MultivariateOutlierDetection algorithm, which performs a series of steps internally to return the outliers in that dataset.
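As a sketch, a search of the following shape would flag multivariate outliers across several numeric fields; the lookup and field names are illustrative, and the algorithm's exact parameters and output fields are described in the MLTK documentation:

```
| inputlookup server_metrics.csv
| fit MultivariateOutlierDetection cpu_load memory_used response_time
```

This follows the standard ML-SPL fit pattern, so it slots into existing MLTK workflows without changes.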
MLTK 5.4.0 is available today on Splunkbase for use with both Splunk Cloud Platform and Splunk Enterprise. For more information on how to use these features, refer to the MLTK documentation. To get started with this new version, visit Splunkbase.
- MLTK Documentation
- MLTK YouTube Videos Playlist
- Webinar: Prevent Data Downtime with Machine Learning: How machine learning can be used to improve the ‘getting data in’ experience with Splunk
- Webinar: ML in Security: Risky SPL Detection with MLTK
- Webinar: Operationalized Data Science for Production Optimization: Saving Costs at BMW Group