Recently we released the Splunk App for Data Science and Deep Learning (DSDL) v5.2.0. This update introduced new features for integrating large language models (LLMs) and retrieval-augmented generation (RAG). With DSDL v5.2.0, users can easily perform LLM prompts, vector searches, RAG, and function calling directly from the app's dashboards. The features come with predefined scripts within the DSDL container, allowing organizations to quickly start using LLMs with their own knowledge data and indexed data in Splunk.
However, if analysts want to customize workflows to better suit their needs—such as asking an LLM to select a vector collection and then perform RAG based on that choice—they'll need to modify the Python scripts in the container or create complex SPL commands with multiple Fit commands. This approach can be complex and lacks a simple, visual interface for end users to design and test custom workflows, which can limit the practical application of the LLM-RAG features.
Fortunately, Splunk SOAR stands out as a key player in the field of automation and orchestration, providing an intuitive user interface for designing and testing workflows. In this blog, we will demonstrate how to leverage SOAR as a platform for designing and executing agentic AI workflows using the functionalities offered by DSDL.
The architecture of our proposal is illustrated in the figure below, where agentic workflows are created as SOAR playbooks using basic building blocks called utilities, such as LLM Prompt, Vector Search, and Function Calling. These utilities are powered by custom functions within SOAR, which make FastAPI calls to the DSDL container to perform the respective operations.
In addition to the GenAI utilities, existing SOAR tools can also be integrated into the playbooks, enabling actions based on the LLM's observations and decisions. This integration unlocks vast potential for developing sophisticated agentic AI workflows.
In the rest of this blog, we will delve into the details of creating custom functions and provide a playbook example.
Custom functions on SOAR allow users to define Python functions that run as utilities, the basic building blocks of a playbook. In this crossover, we've implemented four custom functions for agentic AI workflows: LLM Prompt, Vector Search, Function Calling, and LLM Decision Making.
The first three functionalities are supported in DSDL 5.2.0, while for the LLM Decision Making function we added a new script to the DSDL container. (This script will be built in and available in the next DSDL release.) To use these functionalities in SOAR, we implement each custom function as a FastAPI call to the DSDL container that invokes the corresponding script. The input parameters of each custom function include the parameters required by the DSDL script, as well as the endpoint and API token of the DSDL container's FastAPI service.
Below is example code for the LLM Prompt custom function.
```python
def llm_prompt(query=None, model_name=None, api_endpoint=None, api_token=None, llm_service=None, system_prompt=None, **kwargs):
    ############################ Custom Code Goes Below This Line #################################
    import json
    import phantom.rules as phantom
    import requests
    import csv
    from io import StringIO

    url = f"{api_endpoint}/fit"
    headers = {
        "Authorization": f"Bearer {api_token}",
        "Content-Type": "application/json"
    }
    if system_prompt:
        system_prompt = system_prompt.replace('"', '').replace('\n', '')
    else:
        system_prompt = "You are an expert Q&A system that is trusted around the world. Always answer the query using the provided context information and reasoning as detailed as possible"
    data = {
        "data": f"text\n\"{system_prompt}\"",
        "meta": {
            "options": {
                "model_name": "llm_rag_ollama_text_processing",
                "params": {
                    "algo": "llm_rag_ollama_text_processing",
                    "llm_service": llm_service,
                    "model_name": model_name,
                    "prompt": query
                }
            }
        }
    }
    # Send POST request
    outputs = requests.post(url, headers=headers, json=data, verify=False).json()
    df_data = outputs['results']
    df_data = StringIO(df_data)
    csv_reader = csv.DictReader(df_data)
    for row in csv_reader:
        outputs["llm_response"] = row['Result']
    assert json.dumps(outputs)
    return outputs
```
The request is sent to the FastAPI endpoint of the DSDL container. The payload includes the name of the script under the key "algo" along with other input parameters. Once the result is returned, it is parsed and assigned to the output variable "llm_response" of the custom function.
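The parsing step can be isolated into a small helper. The sketch below assumes the same response shape as in the custom function shown earlier: a CSV string under the `results` key with a `Result` column, of which the last row is kept.

```python
import csv
from io import StringIO

def extract_llm_response(outputs):
    """Pull the 'Result' column out of the CSV payload that the DSDL
    /fit endpoint returns under the 'results' key, keeping the last row."""
    response = None
    for row in csv.DictReader(StringIO(outputs["results"])):
        response = row["Result"]
    return response
```

Separating the parsing logic this way makes the custom function easier to unit-test outside of SOAR before pasting it into a playbook.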
The other custom functions are implemented in a similar fashion. All of the custom functions are available in this GitHub repo. The required input parameters, as well as the algorithm names, can be found in the DSDL 5.2.0 documentation.
NOTE: The parameter "llm_service" used in this example is newly introduced and planned for the next DSDL release. For the current parameter requirements, please refer to the Fit command in the DSDL 5.2.0 documentation.
Based on the custom functions, we have created an example playbook of an agentic AI workflow with multiple decision points, illustrated in the figure below.
This playbook processes natural language queries and routes them to different agents based on the use case of the request.
It covers three scenarios: (1) Splunk queries that require real-time data from the environment, (2) Splunk queries that require static product knowledge, and (3) queries about the Buttercup store.
The first LLM Decision Making block of the playbook determines whether the query is related to Splunk (Case 1 or 2) or Buttercup store (Case 3). For Splunk-related queries, the workflow routes them to the left side, where a second LLM Decision Making block identifies whether the query requires real-time data (Case 1) or static knowledge data (Case 2).
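The two decision points can be pictured as a small routing function. Below is a minimal sketch of that logic; `decide` is a hypothetical stand-in for an LLM Decision Making call that returns one of the offered options, and the branch names are illustrative.

```python
def route_query(query, decide):
    """Route a query through the playbook's two decision points.

    `decide(question, options)` stands in for an LLM Decision Making
    call and is expected to return one of the offered options.
    """
    topic = decide(query, ["splunk", "buttercup"])
    if topic == "buttercup":
        return "buttercup_agent"        # Case 3
    context = decide(query, ["realtime_data", "knowledge_data"])
    if context == "realtime_data":
        return "function_calling"       # Case 1
    return "splunk_knowledge_rag"       # Case 2
```

In the actual playbook, each `decide` call maps to an LLM Decision Making block, and each returned branch name corresponds to the downstream blocks described in the three cases below.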
Case 1:
If real-time data is needed, the Function Calling block is executed, and the LLM uses Splunk search tools to gather the necessary context. The LLM then generates a final answer based on the outputs from these tools.
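Conceptually, the Function Calling block maps each tool name requested by the LLM to an executable, runs it, and feeds the output back for the final answer. Here is a minimal sketch; the `TOOLS` registry and its stubbed `list_indexes` entry are hypothetical placeholders for the real Splunk search tools.

```python
# Hypothetical tool registry; list_indexes stands in for the real
# Splunk search tool that the Function Calling block exposes to the LLM.
TOOLS = {
    "list_indexes": lambda: ["main", "_internal", "_audit"],
}

def run_tool_call(tool_call):
    """Execute one tool call requested by the LLM so its output can be
    fed back as context for generating the final answer."""
    name = tool_call["name"]
    if name not in TOOLS:
        raise ValueError(f"unknown tool: {name}")
    return TOOLS[name](**tool_call.get("arguments", {}))
```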
Case 2:
When knowledge about Splunk is required, the LLM selects the appropriate vector collection from the following options: splunk_platform_knowledge, splunk_enterprise_security_knowledge and splunk_itsi_knowledge. Each collection contains product-specific knowledge, and a vector search is conducted based on the LLM's chosen collection. The query and vector search results are then sent to the LLM to create a summarized answer.
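One way to phrase that choice as an LLM Decision Making input is a short selection prompt listing the candidate collections. This sketch shows the idea; the exact wording is illustrative, not the DSDL script's.

```python
SPLUNK_COLLECTIONS = [
    "splunk_platform_knowledge",
    "splunk_enterprise_security_knowledge",
    "splunk_itsi_knowledge",
]

def build_collection_prompt(query, collections):
    """Build a decision prompt asking the LLM to name exactly one
    vector collection for the given query."""
    return (
        "Choose the single most relevant vector collection for the query below.\n"
        f"Collections: {', '.join(collections)}\n"
        f"Query: {query}\n"
        "Answer with the collection name only."
    )
```

The collection name returned by the LLM is then passed to the Vector Search custom function, and the search results go back to the LLM for the summarized answer.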
Case 3:
If the first decision block routes the query to the Buttercup agent, the LLM selects the most relevant vector collection from buttercup_dev_knowledge and buttercup_support_tickets. A vector search is performed, and the LLM answers the query based on the search results.
In this playbook example, the output of each step is recorded in notes associated with the input event on SOAR. Next, let’s explore three examples of how this workflow operates.
Example 1:
In the first example, the input query is: "What indexes are there in my Splunk?"
The results of the workflow execution are illustrated in the figure below. The left side displays the final answer from the LLM, while the right side outlines the steps of the workflow execution.
Based on LLM decisions, the query was routed to the Function Calling block. The list_indexes() tool was executed, and the LLM generated the final answer based on the output from this tool.
Example 2:
In the second example, the input query is: "What CLI commands in Splunk platform show service ports?"
Based on LLM decisions, the query was routed to the Splunk agent and then the knowledge_data context type. The splunk_platform_knowledge collection was then selected and searched against. The LLM generated the final answer based on the output from the vector search.
Example 3:
In the third example, the input query is: "Has there been payment issues and how were they resolved?"
Based on LLM decisions, the query was routed to the Buttercup agent and then the buttercup_support_tickets collection was selected and searched against. The LLM generated the final answer based on the output from the vector search.
The above examples demonstrate how this agentic AI workflow, created on SOAR, handles different scenarios based on natural language input and orchestrates various tools to gather context for accurate LLM generation. With SOAR, creating the playbook was simple and intuitive, and the playbook could be tested at each step and exported for sharing.
In this blog, we explored how SOAR can serve as a platform for designing and executing agentic AI workflows using the functionalities provided by DSDL. Additionally, DSDL can enhance SOAR playbooks by improving decision-making and data enrichment processes. For instance, in security use cases, LLM agents can gather the latest threat intelligence, assess the urgency of incidents, and use SOAR tools to take action. The integration of DSDL and SOAR unlocks a wide range of possibilities for using AI capabilities within your SOAR workflows.
Acknowledgement:
I would like to extend my gratitude to my collaborators, Mitchell Chan and Philipp Drieger, for their contributions to project development and use case discoveries. Special thanks to Hidekazu Fujimori for generously sharing his expertise in SOAR.