Splunk enjoys a unique position when it comes to AI and ML, because any machine learning system is fueled by data. Our customers and partners have already caught on and are reporting impressive results across the spectrum, from their first ML experiments through to fully operationalized models in production use cases. Filter the Splunk .conf session catalogue for MLTK and AI/ML and you will find more than 100 published sessions, and that’s only over the last few years!
The Reason for the Toolkit's creation
Splunk customers and partners are pushing the boundaries of their data-driven use cases every day with more advanced analytics and machine learning. When talking to data scientists, AI/ML engineers or developers, some of the more frequent questions include: “Does Splunk have Neural Networks?”, “Can I leverage GPU computing with Splunk?”, “Can we use TensorFlow or PyTorch with Splunk?”, “We want to do an NLP use case. Can we import spaCy into Splunk?” And my all-time favorite: “Our Data Scientists use Jupyter Notebooks. Are they compatible with Splunk?” Prior to .conf19, the answers to these questions were not as straightforward as they are now.
Announcing the Deep Learning Toolkit for Splunk
My colleague Anthony and I had the pleasure of announcing the Deep Learning Toolkit for Splunk at this year’s Splunk worldwide user conference. We also had the chance to present use cases showing how deep learning approaches can be applied to typical Splunk data sources. Right after the event, we had the opportunity to attend O’Reilly’s TensorFlow World 2019 with an Ignite Talk and a poster session.
What is the Toolkit about?
The Deep Learning Toolkit for Splunk allows you to integrate advanced custom machine learning systems with the Splunk platform. It extends Splunk’s Machine Learning Toolkit with prebuilt Docker containers for TensorFlow 2.0, PyTorch and a collection of NLP libraries.
With predefined workflows for rapid development in JupyterLab notebooks, the app enables you to build, test (e.g. using TensorBoard) and operationalise your models with Splunk. You can leverage GPUs for compute-intensive training tasks and flexibly deploy models on CPU- or GPU-enabled containers. The app ships with various examples that showcase different machine learning tasks such as classification, regression, forecasting, clustering and NLP. This allows you to tackle advanced machine learning use cases in Splunk’s main areas of IT Operations, Security, IoT, Business Analytics and beyond.
Example Use Cases
During our presentation, we talked about three example use cases where deep learning approaches offer interesting new perspectives for anomaly detection, prediction and clustering. Typically, time series data, for example KPI measurements, contain temporal patterns that can be learned by recurrent neural networks such as LSTMs, GRUs or other custom RNN topologies. This is interesting not only for forecasting scenarios but especially for anomaly detection: a data point that deviates strongly from its prediction is a candidate anomaly. One example was given by Volkswagen in relation to a predictive maintenance use case.
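To make the forecast-and-compare idea concrete, here is a minimal sketch in plain Python. It is not the toolkit's code: a simple moving-average forecaster stands in for the LSTM so the example runs without any deep learning framework, and the function names, the synthetic KPI series and the 3-sigma threshold are all assumptions chosen for illustration. The thresholding step would look the same if the predictions came from a trained recurrent network.

```python
import math
import statistics


def moving_average_forecast(series, window):
    """One-step-ahead forecast: the mean of the previous `window` values.

    Stand-in for an RNN forecast. The result is aligned with series[window:].
    """
    return [statistics.fmean(series[t - window:t]) for t in range(window, len(series))]


def flag_anomalies(actual, predicted, n_train, n_sigma=3.0):
    """Return positions whose residual (actual - predicted) lies more than
    n_sigma standard deviations from the mean residual, where mean and
    standard deviation are estimated on the first n_train residuals only.
    """
    residuals = [a - p for a, p in zip(actual, predicted)]
    mu = statistics.fmean(residuals[:n_train])
    sigma = statistics.stdev(residuals[:n_train])
    return [i for i, r in enumerate(residuals) if abs(r - mu) > n_sigma * sigma]


# Hypothetical KPI: a clean seasonal signal with a period of 12 steps ...
window = 12
kpi = [10.0 + math.sin(2 * math.pi * t / 12) for t in range(60)]
kpi[40] += 8.0  # ... plus one injected spike at time step 40

predicted = moving_average_forecast(kpi, window)
hits = flag_anomalies(kpi[window:], predicted, n_train=24)

# Residual position i corresponds to time step i + window in the original series.
anomalous_timesteps = [i + window for i in hits]
print(anomalous_timesteps)  # [40]
```

Estimating the threshold on a known-clean stretch of data keeps the spike itself from inflating the standard deviation it is judged against.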
Another very interesting deep learning method is the autoencoder. These networks automatically learn features from the input data and produce a compressed representation, called the hidden state or encoding, from which the input is then reconstructed. If you measure the reconstruction error for newly seen data points, unusually large errors give you an interesting perspective for detecting anomalies, e.g. in fraud use cases.
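The reconstruction-error idea can be sketched without a deep learning framework. The example below is an illustration under stated assumptions, not the toolkit's implementation: it uses a linear autoencoder with a single hidden unit on 2-D points, whose optimal weights coincide with the first principal component and can therefore be computed in closed form instead of being trained. The function names, the synthetic data and the mean-plus-3-sigma threshold are hypothetical.

```python
import math
import statistics


def fit_linear_autoencoder(points):
    """Fit a one-unit linear autoencoder on 2-D points.

    The optimal linear solution is the first principal component, so the
    weights are computed in closed form from the 2x2 covariance terms.
    """
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    theta = 0.5 * math.atan2(2 * sxy, sxx - syy)  # angle of the major axis
    return (mx, my), (math.cos(theta), math.sin(theta))


def encode(point, mean, direction):
    """Compress a 2-D point into a single number (the encoding)."""
    return (point[0] - mean[0]) * direction[0] + (point[1] - mean[1]) * direction[1]


def decode(code, mean, direction):
    """Reconstruct a 2-D point from its encoding."""
    return (mean[0] + code * direction[0], mean[1] + code * direction[1])


def reconstruction_error(point, mean, direction):
    rx, ry = decode(encode(point, mean, direction), mean, direction)
    return math.hypot(point[0] - rx, point[1] - ry)


# Hypothetical "normal" data: points close to the line y = 2x.
train = [(i / 10, 2 * i / 10 + 0.05 * math.sin(i)) for i in range(50)]
mean, direction = fit_linear_autoencoder(train)

# Threshold derived from the errors seen on normal data (an assumed rule of thumb).
train_errors = [reconstruction_error(p, mean, direction) for p in train]
threshold = statistics.fmean(train_errors) + 3 * statistics.stdev(train_errors)


def is_anomaly(point):
    return reconstruction_error(point, mean, direction) > threshold


print(is_anomaly((2.0, 4.0)))  # follows the learned pattern: False
print(is_anomaly((2.0, 1.0)))  # far from the learned pattern: True
```

A neural autoencoder replaces the closed-form fit with training and can capture non-linear structure, but the scoring step stays the same: encode, decode, measure the error, and compare it against a threshold learned from normal data.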
Get started in your environment
So you are curious and motivated to try this app out? Feel free to watch the 10-minute video introduction to learn the basic steps, how to set up the app and how to get started quickly.
Hopefully, this app helps you accelerate your next-generation AI or ML initiative by leveraging Splunk’s Data-to-Everything Platform and your favourite frameworks or open source libraries. You will find many Jupyter notebook examples and a predefined workflow that should help you get started easily!
Connect your own open-source containers
Even if the desired set of ML libraries is not there yet, you can easily extend the app with your custom MLTK Container. Rebuild the existing MLTK Container images or build your own custom images with the open-source repository on GitHub.
Most recently, we released Deep Learning Toolkit 3.0, which is compatible with Splunk 8.0 and Machine Learning Toolkit 5.0, both based on Python 3.