Without strong visibility and governance, local LLMs risk replicating the fragmented, unsupervised sprawl once seen in shadow IT, complicating security postures and making it difficult for organizations to maintain oversight and compliance as these powerful AI tools become embedded in daily workflows. To address this challenge, the Splunk Threat Research Team has released the Splunk Technology Add-on (TA) for Ollama, which provides monitoring and observability capabilities designed specifically for local LLM deployments. This purpose-built add-on gives security and IT operations teams visibility into Ollama usage patterns, resource consumption, and deployment locations across their enterprise environments.
The rapid rise of LLMs has driven organizations and individuals to run AI models on their own hardware rather than relying on cloud infrastructure. This surge has been fueled by the promise of improved privacy, lower operational costs, and, above all, direct control over sensitive data. However, this decentralization introduces new challenges: local LLM deployments are often invisible to centralized IT management and security tools, fostering the growth of “shadow AI,” the untracked, privately run AI systems that can expose organizations to data leaks, compliance violations, and operational blind spots.
For enterprises, local LLMs deliver a compelling mix of privacy, cost efficiency, and customization that challenges cloud-based AI services. Sensitive data never leaves the organization’s infrastructure, helping companies maintain compliance with regulations. Control over every aspect of deployment means IT and security teams can fine-tune models, tailor workflows, and enforce organization-specific safeguards, while developers benefit from rapid prototyping and the freedom to experiment with open-source models.
Ollama’s rise from its launch in 2023 to a leading position in the local AI ecosystem has been remarkable. By 2025, the platform offers a library of 1,700 models, enterprise-grade performance with advanced quantization, and support for secure, private deployments, making it the go-to choice for local LLM deployment. Ollama runs on Windows, Linux, and macOS, and it also integrates with major development toolkits and environments such as Docker.

The lack of visibility into local AI deployments, especially platforms like Ollama, presents security challenges for organizations. By design, most Ollama instances are deployed on private infrastructure, and there is no central registry or authoritative inventory of locally hosted LLMs; as a result, IT teams lack complete visibility into where models are running, what data they can access, and how securely they are managed.
Security researchers examining the DeepSeek AI model uncovered concealed backdoors, highlighting potential risks when deploying local LLMs. Users who run DeepSeek locally through Ollama must handle their own security oversight, though downloaded models typically don't send information to external sources.
There is also the risk of exposing Ollama servers to the internet on public or private cloud infrastructure: Cisco Talos found 1,100 Ollama servers exposed to the internet, and in that research identified a range of threat risks associated with exposed local LLMs.
The advent of LLM-enhanced payloads like LameHug and PromptLock marks a turning point in offensive security, as attackers leverage LLM capabilities to dynamically craft, mutate, and obfuscate malware or exploitation scripts in real time. The proliferation of locally run models such as Llama or GPT-OSS gives attackers another tool to add to their arsenal.

The unchecked proliferation of shadow AI introduces risks that must be addressed, creating a critical need for monitoring and continuous auditing.
The Ollama Technology Add-on directly addresses the operational visibility and security challenges inherent in local LLM deployments. By enabling automated ingestion of Ollama logs, this add-on allows organizations to centrally monitor local LLM usage.

The Ollama TA is CIM v5 compatible and supports both local file monitoring and collection via Splunk’s HTTP Event Collector (HEC). Prompt logs captured from early versions of Ollama are also parsed; recent versions of Ollama no longer log individual prompts and responses in the server logs, an intentional shift made for privacy and security reasons.
While detailed access and system logs are still available, organizations seeking prompt-level auditing must implement custom logging at the application layer, as native prompt logging is no longer supported in standard Ollama.
Based on what we can collect from Ollama, we can still address local LLM threat risks with Splunk, because the server logs that remain are very detailed.
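Because the TA maps Ollama events to CIM fields such as src, dest, http_method, status, and url, you can sanity-check the mapping once data is flowing. The following is only a minimal sketch and assumes the ollama:server events are mapped into the Web data model; if your deployment maps them elsewhere, adjust the data model, or fall back to a plain index search like the one shown later in this post.

| tstats count from datamodel=Web where sourcetype="ollama:server" by Web.src Web.dest Web.http_method Web.status Web.url
| rename Web.* as *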

The implementation flow is straightforward: install the TA, set up the monitoring directories (as in the following example) on Splunk or a Universal Forwarder (UF), or send the logs via Splunk HEC, and then test the CIM-mapped fields.

As seen in the following screenshots, once you upload the TA file via Apps management using Install App From File, you will see the following.

Once the TA is installed, you can either upload the logs via Add Data, monitor a directory (locally or via a UF), or simply set up HEC. In the following examples we use directory monitoring of the locations where these log files are placed.

The following is an example of inputs.conf, located at $SPLUNK_HOME\etc\apps\Ta-ollama-releasev1\default\inputs.conf:
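Since deployments differ, the stanza below is only a minimal sketch: the log path assumes a macOS host where Ollama writes to ~/.ollama/logs/server.log (on Linux under systemd, Ollama logs to journald, so you would first route those entries to a file or forward the journal), and the index name ollama is a placeholder to replace with your own.

# Monitor the local Ollama server log (path and index are examples; adjust to your environment)
[monitor:///Users/<username>/.ollama/logs/server.log]
disabled = 0
sourcetype = ollama:server
index = ollama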

Browse to the directory location, as seen below.

Once you select the directory, continue on to select or create an index. Remember that for Ollama logs obtained via directory monitoring (locally or via a UF) the sourcetype is ollama:server; for HEC, the sourcetype is ollama:api.
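If you go the HEC route instead, you can create the token in Splunk Web under Settings > Data inputs > HTTP Event Collector. A hedged sketch of the equivalent inputs.conf stanza (the token value is generated by Splunk, and the ollama index is again a placeholder) looks like this:

# HEC token for Ollama API logs (requires HEC to be enabled globally)
[http://ollama_hec]
disabled = 0
token = <token-generated-by-splunk>
sourcetype = ollama:api
index = ollama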

Once you set the source type and index, simply review and continue. Then we can test the ingestion with the following query:
index=main sourcetype=ollama:server | stats count by src, dest, http_method, status, url, protocol

Once you see results like those above, you are ready to go. Keep in mind that in this very early version of the Ollama TA we are still improving the mapping to CIM; this is a work in progress, and due to challenges with logs from GIN (the Golang web framework used in Ollama), there are many positional ambiguity issues with field values, so we will need to use regex in some instances as we continue to improve this TA.
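For context, the access entries Ollama writes through GIN follow the framework’s default logger layout. The line below is illustrative rather than captured from a real deployment, and the short rex that follows is one hedged way to pull key indicators out of the pipe-delimited columns; the detections below use similar extractions.

[GIN] 2025/01/15 - 10:23:45 | 200 |     1.234567s |       127.0.0.1 | POST     "/api/generate"

index=ollamatav013 "[GIN]"
| rex field=_raw "\|\s+(?<status>\d+)\s+\|\s+(?<duration>\S+)\s+\|\s+(?<src>\S+)\s+\|\s+(?<http_method>\w+)\s+\"(?<url>[^\"]+)\""
| stats count by src, http_method, url, status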
The following are examples of ESCU detections targeting suspicious Ollama activity.
Detects abnormal network activity and connectivity issues in Ollama, including non-localhost API access attempts and warning-level network errors such as DNS lookup failures, TCP connection issues, or host resolution problems.

SPL
index=ollamatav013 level=WARN (msg="*failed*" OR msg="*dial tcp*" OR msg="*lookup*" OR msg="*no such host*" OR msg="*connection*" OR msg="*network*" OR msg="*timeout*" OR msg="*unreachable*" OR msg="*refused*") | eval src=coalesce(src, src_ip, "N/A") | stats count as incidents, values(src) as src, values(msg) as warning_messages, latest(_time) as last_incident by host | eval last_incident=strftime(last_incident, "%Y-%m-%d %H:%M:%S") | eval severity="medium" | eval attack_type="Abnormal Network Connectivity" | stats count by last_incident, host, incidents, src, warning_messages, severity, attack_type
Detects API reconnaissance activity against Ollama servers by identifying sources probing multiple API endpoints within a short time frame, particularly when using HEAD requests (very atypical) or accessing diverse endpoint paths through different request types (GET, POST). In the following example we are looking at a clear surge of API requests; in this specific case an LLM attack tool (Promptfoo) was used to generate the data.

As we start delving into the usage patterns of these local LLM frameworks, it is important to understand that these detections will have to be adjusted periodically as we ingest, analyze, and understand that usage; see the threshold-tuning sketch after the SPL below.
SPL
index=ollamatav013 "[GIN]" | bin _time span=5m | stats count as total_requests, values(dest) as dest, values(http_method) as methods, values(status) as status_codes by _time, src, host | where total_requests > 120 | eval severity="medium" | eval attack_type="API Activity Surge" | stats count by _time, host, src, total_requests, dest, methods, status_codes, severity, attack_type
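As mentioned above, thresholds such as total_requests > 120 were tuned against Promptfoo-generated traffic and will likely need adjustment in your environment. One hedged way to keep them adjustable without editing every search is to read them from a lookup at search time; the lookup (ollama_detection_thresholds) and its fields (detection_name, threshold) in this sketch are hypothetical and would need to be created first.

index=ollamatav013 "[GIN]"
| bin _time span=5m
| stats count as total_requests, values(dest) as dest, values(http_method) as methods, values(status) as status_codes by _time, src, host
| eval detection_name="ollama_api_activity_surge"
| lookup ollama_detection_thresholds detection_name OUTPUT threshold
| where total_requests > tonumber(threshold)
| eval severity="medium", attack_type="API Activity Surge"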
Detects potential prompt injection or jailbreak attempts against Ollama API endpoints by identifying requests with abnormally long response times. Attackers often craft complex, layered prompts designed to bypass AI safety controls, which typically result in extended processing times as the model attempts to parse and respond to these malicious inputs. This detection monitors the /api/generate and /v1/chat/completions endpoints for requests exceeding 30 seconds.
SPL
index=ollamatav013 "GIN" ("*/api/generate*" OR "*/v1/chat/completions*")
| rex field=_raw "\|\s+(?<status_code>\d+)\s+\|\s+(?<response_time>[\d\.]+[a-z]+)\s+\|\s+(?<src_ip>[\:\da-f\.]+)\s+\|\s+(?<http_method>\w+)\s+\"(?<uri_path>[^\"]+)\""
| rex field=response_time "^(?:(?<minutes>\d+)m)?(?<seconds>[\d\.]+)s$"
| eval response_time_seconds=if(isnotnull(minutes), tonumber(minutes)*60+tonumber(seconds), tonumber(seconds))
| eval src=src_ip
| where response_time_seconds > 30
| bin _time span=10m
| stats count as long_request_count,
avg(response_time_seconds) as avg_response_time,
max(response_time_seconds) as max_response_time,
values(uri_path) as uri_path,
values(status_code) as status_codes
by _time, src, host
| where long_request_count > 170
| eval avg_response_time=round(avg_response_time, 2)
| eval max_response_time=round(max_response_time, 2)
| eval severity=case(
long_request_count > 50 OR max_response_time > 55, "critical",
long_request_count > 20 OR max_response_time > 40, "high",
1=1, "medium"
)
| eval attack_type="Potential Prompt Injection / Jailbreak"
| table _time, host, src, uri_path, long_request_count, avg_response_time, max_response_time, status_codes, severity, attack_type
In the above search we look at prompts that took over 30 seconds to generate a response, since long processing times can indicate extended, possibly malicious prompts. In this case, because of the way the data was generated (using Promptfoo), we could accurately compare response times against normal prompts; however, this will vary by model type and infrastructure.
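Because “normal” response times differ by model and hardware, it helps to baseline them per endpoint before relying on a fixed 30-second cutoff. The sketch below reuses the same extractions as the detection above (so sub-second latencies logged in milliseconds or microseconds fall out of the seconds-based parse) and summarizes typical versus peak latency per endpoint.

index=ollamatav013 "[GIN]" ("*/api/generate*" OR "*/v1/chat/completions*")
| rex field=_raw "\|\s+(?<status_code>\d+)\s+\|\s+(?<response_time>[\d\.]+[a-z]+)\s+\|\s+(?<src_ip>[\:\da-f\.]+)\s+\|\s+(?<http_method>\w+)\s+\"(?<uri_path>[^\"]+)\""
| rex field=response_time "^(?:(?<minutes>\d+)m)?(?<seconds>[\d\.]+)s$"
| eval response_time_seconds=if(isnotnull(minutes), tonumber(minutes)*60+tonumber(seconds), tonumber(seconds))
| where isnotnull(response_time_seconds)
| stats count, avg(response_time_seconds) as avg_seconds, perc95(response_time_seconds) as p95_seconds, max(response_time_seconds) as max_seconds by uri_path
| eval avg_seconds=round(avg_seconds,2), p95_seconds=round(p95_seconds,2), max_seconds=round(max_seconds,2)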
There is another obstacle when looking at prompt content in general: the industry has chosen to classify prompts and responses much like email bodies. That means access to them is protected and allowed only to privileged individuals or through the use of specific applications.
This search also contains regex to address the complexity and positional ambiguity of GIN-generated logs. As of the launch of this TA there are still challenges, especially when crafting detections and achieving CIM compliance, that can only be addressed with regex, which in this case is necessary to transform raw GIN logs into key indicators (IP address, response times, endpoints).
Now that we can ingest and analyze the information from Ollama logs, we can also use it alongside other defense technologies and frameworks that describe attack techniques, tools, and procedures, such as MITRE ATT&CK and ATLAS; these detections are mapped to TTPs with the aim of helping analysts detect possible attacks.
Based on indicators such as abnormal crashes, resource exhaustion, or prompt response length anomalies, analysts can also examine model hashes to determine whether a model is legitimate or potentially forged or backdoored.
EDR telemetry can also be correlated to surface abnormalities in process behavior, privilege escalation, code injection, and lateral movement from compromised LLM instances. Network detection tools such as Snort can be used to detect API exploitation attempts against Ollama, especially if an instance is exposed to the internet.
Finally, now that we can obtain Ollama logs, Splunk SOAR can also be used to isolate compromised instances, revoke API keys, capture forensic data, and notify security teams.
Ollama has become very popular for local AI deployment, enabling developers and organizations to run sophisticated language models on their own infrastructure. However, as more teams adopt local LLMs, the need for comprehensive monitoring becomes critical. Although Ollama's local-first design provides some confidentiality and security within company networks, it simultaneously creates challenges around shadow AI: untracked local LLM deployments that are hard to oversee and protect.
The Splunk Technology Add-on for Ollama addresses these challenges, providing operational telemetry, log analytics, and prompt metadata monitoring to help organizations rein in shadow AI.
For all our tools and security content, please visit research.splunk.com.
The world’s leading organizations rely on Splunk, a Cisco company, to continuously strengthen digital resilience with our unified security and observability platform, powered by industry-leading AI.
Our customers trust Splunk’s award-winning security and observability solutions to secure and improve the reliability of their complex digital environments, at any scale.