Without strong visibility and governance, local LLMs risk replicating the fragmented, unsupervised sprawl once seen in shadow IT, complicating security postures and making it difficult for organizations to maintain oversight and compliance as these powerful AI tools become embedded in daily workflows. To address this challenge, the Splunk Threat Research Team has released the Splunk Technology Add-on (TA) for Ollama, which provides monitoring and observability capabilities designed specifically for local LLM deployments. This purpose-built add-on gives security and IT operations teams visibility into Ollama usage patterns, resource consumption, and deployment locations across their enterprise environments.
The rapid rise of LLMs has driven organizations and individuals to run AI models on their own hardware rather than relying on cloud infrastructure. This surge has been fueled by the promise of improved privacy, lower operational costs, and, above all, direct control over sensitive data. However, this decentralization introduces new challenges: local LLM deployments are often invisible to centralized IT management and security tools, fostering the growth of “shadow AI,” the untracked, privately run AI systems that can expose organizations to data leaks, compliance violations, and operational blind spots.
For enterprises, local LLMs deliver a compelling mix of privacy, cost efficiency, and customization that challenges cloud-based AI services. Sensitive data never leaves the organization’s infrastructure, helping companies maintain compliance with regulations. Control over every aspect of deployment means IT and security teams can fine-tune models, tailor workflows, and enforce organization-specific safeguards, while developers benefit from rapid prototyping and the freedom to experiment with open-source models.
Ollama’s rise from its launch in 2023 to a leading position in the local AI ecosystem has been remarkable. By 2025, the platform offers a library of 1,700 models, enterprise-grade performance with advanced quantization, and support for secure, private deployments, making it the go-to choice for local LLM deployment. Ollama runs on Windows, Linux, and macOS, and it also integrates with major development toolkits and environments such as Docker.

The lack of visibility into local AI deployments, especially platforms like Ollama, presents security challenges for organizations. By design, most Ollama instances are deployed on private infrastructure, and there is no central registry or authoritative inventory of locally hosted LLMs; as a result, IT teams lack complete visibility into where models are running, what data they can access, and how securely they are managed.
Security researchers examining the DeepSeek AI model uncovered concealed backdoors, highlighting potential risks when deploying local LLMs. Users who run DeepSeek locally through Ollama must handle their own security oversight, though downloaded models typically don't send information to external sources.
There is also the risk of exposing Ollama servers to the internet on public or private cloud infrastructure: Cisco Talos found 1,100 Ollama servers exposed to the internet, and in that research identified a range of threat risks associated with exposed local LLMs.
The advent of LLM-enhanced payloads like LameHug and PromptLock marks a turning point in offensive security, as attackers leverage LLM capabilities to dynamically craft, mutate, and obfuscate malware or exploitation scripts in real time. The proliferation of locally run models such as Llama or GPT-OSS gives attackers another tool to add to their arsenal.

The unchecked proliferation of shadow AI introduces risks that must be addressed, creating a critical need for monitoring and continuous auditing.
The Ollama Technology Add-on directly addresses the operational visibility and security challenges inherent in local LLM deployments. By enabling automated ingestion of Ollama logs, this add-on allows organizations to centrally monitor local LLM usage.

The Ollama TA is CIM v5 compatible and supports both local file monitoring and collection via Splunk’s HTTP Event Collector (HEC). Prompt logs captured from early versions of Ollama are also parsed; recent versions of Ollama no longer log individual prompts and responses in the server logs, an intentional shift made for privacy and security reasons.
While detailed access and system logs are still available, organizations seeking prompt-level auditing must implement custom logging at the application layer, as native prompt logging is no longer supported in standard Ollama.
Based on what we can collect from Ollama, we can still address local LLM threat risks with Splunk, because the server logs that remain are very detailed.
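Because the TA maps Ollama events to CIM fields such as src, dest, http_method, status, and url, you can sanity-check the mapping once data is flowing. The following is only a minimal sketch and assumes the ollama:server events are mapped into the Web data model; if your deployment maps them elsewhere, adjust the data model, or fall back to a plain index search like the one shown later in this post.

| tstats count from datamodel=Web where sourcetype="ollama:server" by Web.src Web.dest Web.http_method Web.status Web.url
| rename Web.* as *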

The implementation flow is straightforward: install the TA, set up the monitoring directories (as in the following example) on Splunk or a Universal Forwarder (UF), or send the logs via Splunk HEC, and then test the CIM-mapped fields.

As seen in the following screenshots, once you upload the TA file via Apps management using Install App From File, you will see the following.

Once the TA is installed, you can either upload the logs via Add Data, monitor a directory (locally or via a UF), or simply set up HEC. In the following examples we use directory monitoring of the locations where these log files are placed.

The following is an example of inputs.conf, located at $SPLUNK_HOME\etc\apps\Ta-ollama-releasev1\default\inputs.conf:
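Since deployments differ, the stanza below is only a minimal sketch: the log path assumes a macOS host where Ollama writes to ~/.ollama/logs/server.log (on Linux under systemd, Ollama logs to journald, so you would first route those entries to a file or forward the journal), and the index name ollama is a placeholder to replace with your own.

# Monitor the local Ollama server log (path and index are examples; adjust to your environment)
[monitor:///Users/<username>/.ollama/logs/server.log]
disabled = 0
sourcetype = ollama:server
index = ollama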

Browse to the directory location, as seen below.

Once you select the directory, continue on to select or create an index. Remember that for Ollama logs obtained via directory monitoring (locally or via a UF) the sourcetype is ollama:server; for HEC, the sourcetype is ollama:api.
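If you go the HEC route instead, you can create the token in Splunk Web under Settings > Data inputs > HTTP Event Collector. A hedged sketch of the equivalent inputs.conf stanza (the token value is generated by Splunk, and the ollama index is again a placeholder) looks like this:

# HEC token for Ollama API logs (requires HEC to be enabled globally)
[http://ollama_hec]
disabled = 0
token = <token-generated-by-splunk>
sourcetype = ollama:api
index = ollama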

Once you set the source type and index, simply review and continue. Then we can test the ingestion with the following query:
index=main sourcetype=ollama:server | stats count by src, dest, http_method, status, url, protocol

Once you see results like those above, you are ready to go. Keep in mind that in this very early version of the Ollama TA we are still improving the mapping to CIM; this is a work in progress, and due to challenges with logs from GIN (the Golang web framework used in Ollama), there are many positional ambiguity issues with field values, so we will need to use regex in some instances as we continue to improve this TA.
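For context, the access entries Ollama writes through GIN follow the framework’s default logger layout. The line below is illustrative rather than captured from a real deployment, and the short rex that follows is one hedged way to pull key indicators out of the pipe-delimited columns; the detections below use similar extractions.

[GIN] 2025/01/15 - 10:23:45 | 200 |     1.234567s |       127.0.0.1 | POST     "/api/generate"

index=ollamatav013 "[GIN]"
| rex field=_raw "\|\s+(?<status>\d+)\s+\|\s+(?<duration>\S+)\s+\|\s+(?<src>\S+)\s+\|\s+(?<http_method>\w+)\s+\"(?<url>[^\"]+)\""
| stats count by src, http_method, url, status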
The following are examples of ESCU detections targeting suspicious Ollama activity.
Detects abnormal network activity and connectivity issues in Ollama, including non-localhost API access attempts and warning-level network errors such as DNS lookup failures, TCP connection issues, or host resolution problems.

SPL
index=ollamatav013 level=WARN (msg="*failed*" OR msg="*dial tcp*" OR msg="*lookup*" OR msg="*no such host*" OR msg="*connection*" OR msg="*network*" OR msg="*timeout*" OR msg="*unreachable*" OR msg="*refused*") | eval src=coalesce(src, src_ip, "N/A") | stats count as incidents, values(src) as src, values(msg) as warning_messages, latest(_time) as last_incident by host | eval last_incident=strftime(last_incident, "%Y-%m-%d %H:%M:%S") | eval severity="medium" | eval attack_type="Abnormal Network Connectivity" | stats count by last_incident, host, incidents, src, warning_messages, severity, attack_type
Detects API reconnaissance activity against Ollama servers by identifying sources probing multiple API endpoints within a short time frame, particularly when using HEAD requests (very atypical) or accessing diverse endpoint paths through different request types (GET, POST). In the following example we are looking at a clear surge of API requests; in this specific case an LLM attack tool (Promptfoo) was used to generate the data.

As we start delving into the usage patterns of these local LLM frameworks, it is important to understand that these detections will have to be adjusted periodically as we ingest, analyze, and understand that usage; see the threshold-tuning sketch after the SPL below.
SPL
index=ollamatav013 "[GIN]" | bin _time span=5m | stats count as total_requests, values(dest) as dest, values(http_method) as methods, values(status) as status_codes by _time, src, host | where total_requests > 120 | eval severity="medium" | eval attack_type="API Activity Surge" | stats count by _time, host, src, total_requests, dest, methods, status_codes, severity, attack_type
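As mentioned above, thresholds such as total_requests > 120 were tuned against Promptfoo-generated traffic and will likely need adjustment in your environment. One hedged way to keep them adjustable without editing every search is to read them from a lookup at search time; the lookup (ollama_detection_thresholds) and its fields (detection_name, threshold) in this sketch are hypothetical and would need to be created first.

index=ollamatav013 "[GIN]"
| bin _time span=5m
| stats count as total_requests, values(dest) as dest, values(http_method) as methods, values(status) as status_codes by _time, src, host
| eval detection_name="ollama_api_activity_surge"
| lookup ollama_detection_thresholds detection_name OUTPUT threshold
| where total_requests > tonumber(threshold)
| eval severity="medium", attack_type="API Activity Surge"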
Detects potential prompt injection or jailbreak attempts against Ollama API endpoints by identifying requests with abnormally long response times. Attackers often craft complex, layered prompts designed to bypass AI safety controls, which typically result in extended processing times as the model attempts to parse and respond to these malicious inputs. This detection monitors the /api/generate and /v1/chat/completions endpoints for requests exceeding 30 seconds.
SPL
index=ollamatav013 "GIN" ("*/api/generate*" OR "*/v1/chat/completions*")
| rex field=_raw "\|\s+(?<status_code>\d+)\s+\|\s+(?<response_time>[\d\.]+[a-z]+)\s+\|\s+(?<src_ip>[\:\da-f\.]+)\s+\|\s+(?<http_method>\w+)\s+\"(?<uri_path>[^\"]+)\""
| rex field=response_time "^(?:(?<minutes>\d+)m)?(?<seconds>[\d\.]+)s$"
| eval response_time_seconds=if(isnotnull(minutes), tonumber(minutes)*60+tonumber(seconds), tonumber(seconds))
| eval src=src_ip
| where response_time_seconds > 30
| bin _time span=10m
| stats count as long_request_count,
avg(response_time_seconds) as avg_response_time,
max(response_time_seconds) as max_response_time,
values(uri_path) as uri_path,
values(status_code) as status_codes
by _time, src, host
| where long_request_count > 170
| eval avg_response_time=round(avg_response_time, 2)
| eval max_response_time=round(max_response_time, 2)
| eval severity=case(
long_request_count > 50 OR max_response_time > 55, "critical",
long_request_count > 20 OR max_response_time > 40, "high",
1=1, "medium"
)
| eval attack_type="Potential Prompt Injection / Jailbreak"
| table _time, host, src, uri_path, long_request_count, avg_response_time, max_response_time, status_codes, severity, attack_type
In the above search we look at prompts that took over 30 seconds to generate a response, since long processing times can indicate extended, possibly malicious prompts. In this case, because of the way the data was generated (using Promptfoo), we could accurately compare response times against normal prompts; however, this will vary by model type and infrastructure.
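Because “normal” response times differ by model and hardware, it helps to baseline them per endpoint before relying on a fixed 30-second cutoff. The sketch below reuses the same extractions as the detection above (so sub-second latencies logged in milliseconds or microseconds fall out of the seconds-based parse) and summarizes typical versus peak latency per endpoint.

index=ollamatav013 "[GIN]" ("*/api/generate*" OR "*/v1/chat/completions*")
| rex field=_raw "\|\s+(?<status_code>\d+)\s+\|\s+(?<response_time>[\d\.]+[a-z]+)\s+\|\s+(?<src_ip>[\:\da-f\.]+)\s+\|\s+(?<http_method>\w+)\s+\"(?<uri_path>[^\"]+)\""
| rex field=response_time "^(?:(?<minutes>\d+)m)?(?<seconds>[\d\.]+)s$"
| eval response_time_seconds=if(isnotnull(minutes), tonumber(minutes)*60+tonumber(seconds), tonumber(seconds))
| where isnotnull(response_time_seconds)
| stats count, avg(response_time_seconds) as avg_seconds, perc95(response_time_seconds) as p95_seconds, max(response_time_seconds) as max_seconds by uri_path
| eval avg_seconds=round(avg_seconds,2), p95_seconds=round(p95_seconds,2), max_seconds=round(max_seconds,2)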
There is another obstacle when looking at prompt content in general: the industry has chosen to classify prompts and responses much like email bodies. That means access to them is protected and allowed only to privileged individuals or through the use of specific applications.
This search also contains regex to address the complexity and positional ambiguity of GIN-generated logs. As of the launch of this TA there are still challenges, especially when crafting detections and achieving CIM compliance, that can only be addressed with regex, which in this case is necessary to transform raw GIN logs into key indicators (IP address, response times, endpoints).
Now that we can ingest and analyze the information from Ollama logs, we can also use it alongside other defense technologies and frameworks that describe attack techniques, tools, and procedures, such as MITRE ATT&CK and ATLAS; these detections are mapped to TTPs with the aim of helping analysts detect possible attacks.
Based on indicators such as abnormal crashes, resource exhaustion, or prompt response length anomalies, analysts can also examine model hashes to determine whether a model is legitimate or potentially forged or backdoored.
EDR telemetry can also be correlated to surface abnormalities in process behavior, privilege escalation, code injection, and lateral movement from compromised LLM instances. Network detection tools such as Snort can be used to detect API exploitation attempts against Ollama, especially if an instance is exposed to the internet.
Finally, now that we can obtain Ollama logs, Splunk SOAR can also be used to isolate compromised instances, revoke API keys, capture forensic data, and notify security teams.
Ollama has become very popular for local AI deployment, enabling developers and organizations to run sophisticated language models on their own infrastructure. However, as more teams adopt local LLMs, the need for comprehensive monitoring becomes critical. Although Ollama's local-first design provides some confidentiality and security within company networks, it simultaneously creates challenges around shadow AI: untracked local LLM deployments that are hard to oversee and protect.
The Splunk Technology Add-on for Ollama addresses these challenges, providing operational telemetry, log analytics, and prompt metadata monitoring to help organizations rein in shadow AI.
For all our tools and security content, please visit research.splunk.com.
The world’s leading organizations rely on Splunk, a Cisco company, to continuously strengthen digital resilience with our unified security and observability platform, powered by industry-leading AI.
Our customers trust Splunk’s award-winning security and observability solutions to secure and improve the reliability of their complex digital environments, at any scale.