This blog post focuses on employing a local LLM (Llama3:8B) via Ollama and refining it with RAG (Retrieval-Augmented Generation), a technique that augments language model outputs by retrieving data from external sources before generating a response. The retrieved information is then used alongside Splunk MLTK machine learning functions and the latest AI prompt features to improve the precision of Splunk detection results.

The following are the components of this research:
- Model Llama 3 8B: A large language model developed by Meta with 8 billion parameters, designed for efficiency and high-quality natural language understanding and generation tasks.
- Ollama: A platform that enables users to easily run, manage, and interact with open-source large language models locally on their own computers, offering a streamlined experience for deploying and using models like Llama and others without requiring cloud resources.
- Splunk Enterprise 9.3.5: A platform for searching, monitoring, and analyzing machine-generated data (logs and events) in real time.
- Splunk MLTK: An app for Splunk Enterprise that provides tools and guided workflows for building, testing, and operationalizing machine learning models on your Splunk data. Version 5.6.0 includes improvements, security updates, and bug fixes to enhance user experience and compatibility with newer Splunk versions. It enables users to apply algorithms, create custom models, and integrate machine learning into their Splunk searches and dashboards without requiring deep programming knowledge. This new version allows connection and integration with LLMs including OpenAI and Ollama.
- Splunk ESCU: An app and content package for Splunk Enterprise Security that provides a regularly updated collection of security detections, analytic stories, and response playbooks mapped to frameworks like MITRE ATT&CK. ESCU enables security teams to quickly deploy up-to-date detection logic, threat hunting searches, and automated response actions, helping organizations stay current with emerging threats and improve their security posture.
- Splunk ESCU Attack Data: A collection of datasets used during development and testing of detections present in Splunk ESCU.
ESCU Llama3 RAG System
Overview
This project implements a Retrieval-Augmented Generation (RAG) system designed for ESCU (Enterprise Security Content Update) data using local LLaMA3:8B inference. The system provides cybersecurity analysis by combining real attack data with AI-powered responses.
Core Functionality
- Loads ESCU Data: Parses MITRE ATT&CK techniques from attack_data-master repository
- Extracts Attack Scenarios: Identifies real attack scenarios (Cobalt Strike, Sliver, TrickBot, ransomware, etc.)
- Processes Security Content: Integrates Splunk security detection rules
- Provides AI Analysis: Uses local LLaMA3 to analyze cybersecurity queries with ESCU context
- Maintains Privacy: Runs completely offline with local AI inference
Data Sources
- Attack Data: Real attack datasets with MITRE ATT&CK mapping
- Security Content: Splunk detection rules and security content
- Log Samples: Actual attack logs (Sysmon, PowerShell, Security logs)

RAG Components
Data Indexer (ESCUDataIndexer)
- Input: Raw ESCU directories and files
- Processing:
  - Maps MITRE technique IDs (T1055) to names (Process Injection)
  - Extracts attack scenarios from subdirectories
  - Classifies log types and sources
  - Loads security detection rules
- Output: Structured knowledge base
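The indexing step can be sketched as follows. This is a minimal illustration, not the actual ESCUDataIndexer: the directory names, the technique-name map, and the record shape are assumptions based on the attack_data-master layout described above.

```python
import re
from pathlib import Path

# Hypothetical subset of the MITRE ID -> name map the indexer builds.
TECHNIQUE_NAMES = {
    "T1055": "Process Injection",
    "T1059": "Command and Scripting Interpreter",
    "T1486": "Data Encrypted for Impact",
}

TECH_DIR = re.compile(r"^(T\d{4}(?:\.\d{3})?)")  # T1055 or T1055.002

def index_technique_dirs(paths):
    """Map attack_data-style directory names (e.g. 'T1055') to
    structured records carrying the human-readable technique name."""
    index = {}
    for p in paths:
        m = TECH_DIR.match(Path(p).name)
        if not m:
            continue  # not a technique directory, skip it
        tid = m.group(1)
        base = tid.split(".")[0]  # sub-techniques inherit the parent name
        index[tid] = {
            "id": tid,
            "name": TECHNIQUE_NAMES.get(base, "Unknown Technique"),
            "path": str(p),
        }
    return index

kb = index_technique_dirs(["datasets/T1055", "datasets/T1486", "README.md"])
print(kb["T1055"]["name"])  # Process Injection
```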
Context Generator
- Input: User query + indexed ESCU data
- Processing:
  - Semantic matching of queries to techniques
  - Retrieval of relevant attack scenarios
  - Selection of applicable detection rules
- Output: Contextual prompt for LLaMA3
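A simplified sketch of the retrieval step: the real system does semantic matching, whereas this illustration uses plain keyword overlap, and the sample records are invented for the example.

```python
def build_context(query, knowledge_base, max_items=3):
    """Rank indexed techniques by naive keyword overlap with the query
    and emit a context block to prepend to the LLaMA3 prompt."""
    q_tokens = set(query.lower().split())
    scored = []
    for tech in knowledge_base:
        t_tokens = set(tech["name"].lower().split())
        score = len(q_tokens & t_tokens)  # crude stand-in for semantics
        if score:
            scored.append((score, tech))
    scored.sort(key=lambda s: s[0], reverse=True)

    lines = ["Relevant ESCU context:"]
    for _, tech in scored[:max_items]:
        lines.append(f"- {tech['id']} {tech['name']}: {tech['detection']}")
    lines.append(f"\nAnalyst question: {query}")
    return "\n".join(lines)

# Invented sample records for demonstration only.
techniques = [
    {"id": "T1055", "name": "Process Injection",
     "detection": "Sysmon EventCode=8 CreateRemoteThread"},
    {"id": "T1110", "name": "Brute Force",
     "detection": "Repeated AWS ConsoleLogin failures"},
]
prompt = build_context("How do I detect process injection", techniques)
```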
LLaMA3 Inference Engine
- Input: Context-enhanced prompt
- Processing: Local AI inference using LLaMA3:8B
- Output: Cybersecurity analysis and recommendations
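The inference call itself can look like the sketch below. Ollama exposes a local HTTP endpoint (`/api/generate`, port 11434 by default); the helper names are illustrative, not the project's actual code.

```python
import json
from urllib import request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local endpoint

def build_payload(context_prompt, model="llama3:8b"):
    """Assemble the non-streaming request body for Ollama's /api/generate."""
    return {"model": model, "prompt": context_prompt, "stream": False}

def ask_llama3(context_prompt):
    """Send the context-enhanced prompt to the local LLaMA3 instance.
    Requires a running `ollama serve`; nothing leaves the machine."""
    body = json.dumps(build_payload(context_prompt)).encode("utf-8")
    req = request.Request(OLLAMA_URL, data=body,
                          headers={"Content-Type": "application/json"})
    with request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]
```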
Based on the above elements, the following improvements were sought:
ESCU-Specific Data Understanding
- Before: Generic JSON/YAML parsing with poor recognition
- After: A purpose-built ESCU structure parser that understands:
  - MITRE ATT&CK technique directories
  - Attack scenario subdirectories
  - Log file classification
  - Security content integration
Robust Error Handling
- Multiple Format Support: JSON, JSONL, YAML, malformed files
- Large File Management: Automatic skipping of files >50MB
- Graceful Degradation: Continues processing despite individual file failures
- Encoding Resilience: Handles various text encodings
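These behaviors can be combined in a loader along these lines; this sketch omits YAML to stay dependency-free, and the 50MB threshold mirrors the one stated above.

```python
import json
from pathlib import Path

MAX_BYTES = 50 * 1024 * 1024  # files above 50MB are skipped outright

def load_records(path):
    """Best-effort loader: skips oversized files, tries several encodings,
    and accepts both JSON and JSONL; any failure yields an empty list
    instead of aborting the whole indexing run."""
    p = Path(path)
    try:
        if p.stat().st_size > MAX_BYTES:
            return []
        raw = None
        for enc in ("utf-8", "utf-16", "latin-1"):
            try:
                raw = p.read_text(encoding=enc)
                break
            except (UnicodeDecodeError, UnicodeError):
                continue
        if raw is None:
            return []
        try:
            data = json.loads(raw)          # plain JSON first
            return data if isinstance(data, list) else [data]
        except json.JSONDecodeError:
            records = []                    # fall back to JSONL, line by line
            for line in raw.splitlines():
                line = line.strip()
                if not line:
                    continue
                try:
                    records.append(json.loads(line))
                except json.JSONDecodeError:
                    pass                    # tolerate malformed lines
            return records
    except OSError:
        return []                           # missing/unreadable file
```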
Intelligent Context Generation
- Semantic Matching: Matches user queries to relevant techniques
- Multi-Source Context: Combines techniques, detections, and log samples
- Relevance Ranking: Prioritizes most applicable ESCU content
- Context Optimization: Balances comprehensiveness with token limits
Local AI Integration
- Multi-Backend Support: Ollama, Text Generation WebUI (Open WebUI)
- Auto-Detection: Automatically finds available LLaMA3 instances
- Fallback System: Provides structured responses when AI unavailable
- Privacy-First: No data leaves your local environment
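Auto-detection with graceful fallback might look like this sketch. The endpoint URLs are illustrative defaults, and the probe function is injected so the logic works (and can be tested) without any server running.

```python
# Candidate local backends, probed in priority order.
# These URLs are illustrative defaults, not guaranteed endpoints.
BACKENDS = [
    ("ollama", "http://localhost:11434/api/tags"),
    ("text-generation-webui", "http://localhost:5000/api/v1/model"),
]

def pick_backend(probe, backends=BACKENDS):
    """Return the first backend whose endpoint answers the probe;
    otherwise fall back to 'structured' (template answers, no LLM)."""
    for name, url in backends:
        try:
            if probe(url):
                return name
        except OSError:
            continue  # connection refused etc. -> try the next backend
    return "structured"

# With nothing listening, the system degrades gracefully:
mode = pick_backend(lambda url: False)  # -> "structured"
```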
Performance Optimizations
- Response Caching: Instant responses for repeated queries
- Lazy Loading: Only processes files when needed
- Memory Management: Efficient handling of large datasets
- Incremental Processing: Processes data in manageable segments
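A minimal version of the response cache: queries are normalized so trivially rephrased repeats still hit. The class name and hashing choice are illustrative, not the project's actual implementation.

```python
import hashlib

class ResponseCache:
    """Tiny query cache: repeated questions return instantly without
    re-running retrieval or LLaMA3 inference."""

    def __init__(self):
        self._store = {}

    @staticmethod
    def _key(query):
        # Normalize whitespace and case so rephrased repeats still hit.
        canon = " ".join(query.lower().split())
        return hashlib.sha256(canon.encode("utf-8")).hexdigest()

    def get(self, query):
        return self._store.get(self._key(query))

    def put(self, query, answer):
        self._store[self._key(query)] = answer

cache = ResponseCache()
cache.put("Detect process injection", "analysis text")
hit = cache.get("  detect PROCESS injection ")  # normalized -> cache hit
```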
Llama3:8B ESCU RAG Metrics
Data Loading Performance
- Technique Directories: ~150 MITRE techniques processed
- Attack Scenarios: ~300+ real attack scenarios indexed
- Log Samples: ~500+ attack log files catalogued
- Processing Time: 30-60 seconds for full dataset
- Memory Usage: <2GB for complete ESCU dataset
Query Response Times
- Cached Queries: <1 second (instant)
- New Queries: 5-15 seconds (with LLaMA3)
- Fallback Mode: <2 seconds (structured response)
- Context Generation: <1 second (retrieval)
Accuracy Improvements
- Technique Recognition: 95%+ accuracy for MITRE techniques
- Scenario Matching: 90%+ relevant scenario identification
- Detection Mapping: 85%+ appropriate detection suggestion
Once the process was finished, a new model file was created and saved under Ollama so that it could be run from MLTK or from Open WebUI if desired.
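One possible shape for that packaging step, assuming the refinement is captured as an Ollama Modelfile with a custom SYSTEM prompt; the actual prompt and parameters used here are not shown in the post, so the text below is illustrative only.

```python
from pathlib import Path

# Hypothetical Modelfile content; the SYSTEM prompt is an assumption.
MODELFILE = """FROM llama3:8b
SYSTEM You are an ESCU-aware security analyst. Ground every answer in \
MITRE ATT&CK techniques and Splunk detection logic.
PARAMETER temperature 0.2
"""

def write_modelfile(path="Modelfile"):
    """Write the Modelfile that Ollama builds the refined model from."""
    Path(path).write_text(MODELFILE, encoding="utf-8")
    return path
```

After writing the file, the model is registered with Ollama (`ollama create escu-llama3 -f Modelfile`), which makes it selectable from MLTK or Open WebUI like any other local model.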
Based on the above items, I proceeded to test several detections, some from ESCU and some using MLTK algorithms, against the Splunk Boss of the SOC (BOTS) datasets. The following are some examples of the queries performed and of the AI feature in MLTK.
Unusual SSH Login - BOTSv3


AWS Brute Force - BOTSv3


Splunk MLTK, Clustering, and Grouping Similar Attack Patterns - BOTSv3
In this example I used Cisco ASA data along with Splunk MLTK and the trained Llama3:8B (the ESCU-refined LLM). In the following detection I used KMeans, an unsupervised machine learning algorithm that groups data points into a specified number of clusters based on their similarity. In this case we are looking for similar attack patterns.

In the following example I used DBSCAN, an unsupervised machine learning algorithm that clusters data points based on density, identifying groups of closely packed points and marking outliers as noise. In this specific example I used Windows logs, targeting command-line lengths.


Final Notes
The use of local LLMs can be enhanced via RAG and applied to detection development and analysis. In my experience, the bigger the model, the more accurate and powerful the results. Initially, I trained a quantized Llama 4, and the results were much more effective, especially the inference drawn from the detections.
Unfortunately, due to hardware limitations, the connection with MLTK would time out, which forced me to downgrade to Llama3:8B. If hardware is not a constraint, this approach can be applied in the enterprise with local LLMs and will definitely enhance the performance and analysis of these detections. I invite you to think of the number of use cases and applications that can now be developed by putting Splunk, MLTK, LLMs, and ESCU together.