In August 2025, ESET Research came across a proof-of-concept (POC) ransomware called PromptLock. Unlike typical ransomware families, this sample was built as part of an academic study to explore how large language models (LLMs) could be orchestrated to carry out ransomware-style attacks.
This POC showcases how locally hosted LLMs (the sample calls a gpt-oss:20b model via the Ollama API) can be abused to dynamically generate and run malicious Lua scripts that perform file enumeration, selective exfiltration, and cross-platform payload actions.
PromptLock may just be a proof-of-concept, but it highlights a big shift: attackers can use local LLMs to make ransomware more adaptive and harder to predict, while lowering the technical barrier to creating it. For defenders, this means treating model runtimes as critical assets, tightening API and file access controls, watching for suspicious script activity, and focusing on behavior-based detection before these techniques move from research into real-world attacks.
In this blog, the Splunk Threat Research Team (STRT) breaks down the key takeaways from this type of malware, focusing on the test setup, tactics, techniques, and procedures (TTPs) involved and the detections that can help strengthen monitoring against techniques designed to slip past traditional security protections.
Ollama is a framework that allows users to download different LLM models and run them on their own computers. It has become the most popular framework for running small open-source models on off-the-shelf laptops and desktops. Thanks to frameworks like Ollama, end users can simply install the tool and choose from the many models available in the Ollama model registry. These models vary in size and focus, and for some tasks they are very useful, at times even outperforming the very large paywalled models available online.
The growing availability of these models presents both an opportunity and a risk. We are witnessing the rise of “Shadow AI,” where users independently download various AI frameworks and models without organizational oversight. As a result, companies often lack visibility into what employees are asking these models, what responses are being generated, and whether the binaries are backdoored or connecting to malicious servers to exfiltrate data. This uncontrolled adoption, driven by the desire to experiment and stay current with AI technology, introduces significant security vulnerabilities and increases the potential for exploitation.
As we will see in this blog analysis, local LLMs can be leveraged to produce malicious code that furthers the execution of malicious payloads. Most importantly, it is only a matter of time before purpose-trained local models, small enough to execute in a timely manner, are deployed as part of upcoming payloads, effectively augmenting the capabilities of any initial access vector they are linked to.
Our setup consisted of a virtual machine running Microsoft Windows 11 with 16GB of RAM and 8 CPU cores. This is by no means a standard laptop or desktop; the research required enough compute power to execute this payload. The setup also included the Ollama framework (v0.11.9). The latest versions of Ollama ship with their own GUI, unlike previous versions that required a separate GUI front end such as OpenWebUI.
Figure 01: Test Lab Setup
Previous PromptLock reverse engineering work also revealed that the model used in this malware via Ollama was gpt-oss.

As seen in the above screenshot, gpt-oss from the Ollama model repository is described as one of OpenAI’s open-weight models. In the specific case of PromptLock, the 20b version was used. Notice that 120b is also available. Let’s break down the differences to understand how LLMs can be leveraged as part of a payload.
The main difference between GPT-OSS 20B and 120B is their scale and performance: 120B has around 117–120 billion parameters and is substantially more powerful at advanced reasoning, math, health, and coding tasks than the 20B, which has about 20–21 billion parameters.
| Model | #Parameters | Active parameters / Token | Recommended Hardware | Context Window | Primary Use |
|---|---|---|---|---|---|
| GPT-OSS 120B | ~117B | 5.1B | 80GB GPU (Data Center) | 128K | Advanced/Enterprise |
| GPT-OSS 20B | ~21B | 3.6B | 16GB GPU (Consumer) | 128K | On Device |
The above comparison sheds some light on the use of LLMs in payloads. First, the model must be able to run on adequate hardware. Second, if these LLMs (whether open source, or refined via RAG or fine-tuning to be malicious) are to be downloaded, size is clearly a consideration for both transfer and local storage. GPT-OSS 20B takes up approximately 13GB of space on a VM hard disk, which makes a payload that must first download a model of that size simply infeasible. If parameter count does make a difference in model capabilities, GPT-OSS 20B should be capable of chain-of-thought reasoning, which may also explain why it was chosen to generate the payload. However, there are much smaller models, such as llama3.2, that do not exceed 2GB; even though they are considered to have only few-shot learning capabilities, they may still be able to create effective payload artifacts.
The size of the hardcoded model hinted that this payload was research rather than a production payload. To leverage a model of that size, it is more practical to target model registries themselves or to identify local models already in place on the victim host and use them to execute the payload, an approach that is increasingly feasible as many users run local models with Ollama.
In this setup we also deployed Microsoft Sysmon to capture logs as we executed the payload, collected the Ollama logs and Active Directory events, and recorded the entire interaction with Wireshark.

Figure 02: Test Lab and Sysmon Setup
Once we completed executing and recording this payload, we extracted the logs and uploaded them to Splunk so we could develop detections for payloads similar to PromptLock.
The STRT is working around the clock to provide our customers with an Ollama TA that provides log parsing and CIM compatibility, as shown in the next screenshot. You can also download it from Splunkbase.
Figure 03: Ollama Audit Logs
Once the data is loaded via the Splunk Ollama TA, we can craft some initial detections. Here are a couple of examples:

Figure 04: Ollama Time Splunk Search
The SPL search above identifies three distinct threat patterns requiring attention. During our PromptLock testing, the detection framework successfully flagged potentially suspicious activities including multi-model reconnaissance operations (57 events with a 12% error rate), multiple startup anomalies indicating potential crash-loop exploitation (55 events), and resource exhaustion indicators with degraded system performance (90 events). Together, these patterns suggest persistent adversarial activity targeting the local LLM deployment infrastructure.
The security implications of these findings indicate that, with a payload like PromptLock, adversaries can execute a sophisticated attack campaign targeting the organization's AI infrastructure through multiple vectors. The observed multi-model pulling behavior represents reconnaissance tactics, enumerating available models and capabilities prior to launching targeted attacks. Combined with the crash-loop exploitation attempts and resource exhaustion indicators, this suggests the payload was designed with knowledge of LLM infrastructure vulnerabilities, attempting to achieve denial-of-service conditions or to identify exploitable weaknesses in model initialization processes.
High-volume model pulling could represent legitimate DevOps activities such as automated CI/CD pipelines, container orchestration updates, or routine model version synchronization across distributed environments. Similarly, multiple startup events and slow initialization times may simply indicate resource contention during peak usage periods, insufficient hardware provisioning for concurrent inference requests, or legitimate performance degradation due to large model sizes exceeding available system memory.
Before escalating these findings as confirmed security incidents, the security operations team should correlate these events with user activity logs to establish whether the patterns align with authorized business operations. Additionally, establishing baseline performance metrics during known-good operational periods and implementing tiered alerting thresholds will help distinguish genuine attack indicators from system behavior caused by organic workload increases or infrastructure limitations. The Splunk STRT has released TA-ollama, which allows security analysts to ingest Ollama logs and create the analytics mentioned above.
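The baselining step described above can be sketched outside of SPL as well. The following is a minimal Python illustration; the baseline counts, the three-sigma multiplier, and the event numbers are assumptions for demonstration only:

```python
import statistics

def baseline_threshold(daily_counts, sigmas=3.0):
    """Alerting threshold = mean + N population standard deviations
    computed over a known-good baseline of daily event counts."""
    mean = statistics.mean(daily_counts)
    stdev = statistics.pstdev(daily_counts)
    return mean + sigmas * stdev

def is_anomalous(count, daily_counts, sigmas=3.0):
    """Flag a day's event count that exceeds the baseline threshold."""
    return count > baseline_threshold(daily_counts, sigmas)

# Hypothetical known-good baseline: model-pull events per day
baseline = [4, 6, 5, 3, 7, 5, 4]

# 57 pull events in a single window (as observed during our PromptLock
# testing) far exceeds the computed threshold and should raise an alert
print(is_anomalous(57, baseline))
```

Thresholds like this are a starting point; tiered alerting (warn at two sigmas, alert at three) further reduces noise from organic workload growth.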
The PoC is written in Go and reaches out to three primary URLs: two serve as data exfiltration endpoints and the third interfaces with the Ollama API. Through that Ollama connection the malware sends prompts which produce or validate Lua code; those dynamically generated scripts are then executed on the host as part of the infection and payload-delivery process.
| URL Links |
|---|
| hxxps://<hardcoded IP>:8843/backup/files |
| hxxps://<hardcoded IP>:8843/backup/scripts |
| hxxps://<hardcoded IP>:11434/ollama/v1/chat/completions |
Figure 05: Promptlock targeted endpoint URLs

Figure 06: PromptLock URL Endpoint Setup
This PoC sends several prompts to the Ollama chat API to execute different tasks such as reconnaissance, data exfiltration, data destruction (encryption and deletion), and code validation for the requests that generate the Lua payload.
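For illustration, the shape of such a chat API request can be sketched in Python. This is a minimal sketch, not PromptLock's actual Go code; the endpoint host, prompt text, and helper names are our own, though the OpenAI-compatible payload format matches what Ollama's chat-completions endpoint accepts:

```python
import json
import urllib.request

# Local lab endpoint (Ollama's default port; the sample hardcodes its own URL)
OLLAMA_CHAT_URL = "http://127.0.0.1:11434/v1/chat/completions"

def build_chat_payload(prompt, model="gpt-oss:20b"):
    """Build an OpenAI-compatible chat-completions payload, the same
    API surface that Ollama exposes and that the PoC targets."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,
    }

def send_prompt(prompt):
    """POST a prompt to the local Ollama endpoint (requires a running server)."""
    req = urllib.request.Request(
        OLLAMA_CHAT_URL,
        data=json.dumps(build_chat_payload(prompt)).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())

payload = build_chat_payload(
    "Generate Lua code that enumerates files in the user's home directory."
)
print(json.dumps(payload, indent=2))
```

Because the request body is ordinary JSON over HTTP, prompts like this are fully visible in network captures and Ollama server logs, which is exactly what makes the telemetry discussed earlier useful for detection.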
To test this network communication, the STRT team set up a local Ollama HTTP API and redirected the sample to that endpoint. This allowed us to observe several prompt commands sent by the PromptLock PoC ransomware, as shown in Figure 07. The screenshot illustrates a prompt requesting Lua code designed to collect common system information from the targeted host, and the response from the Ollama API using the gpt-oss:20b model.
Figure 07: Prompt Request and Ollama Response
During our analysis and testing, we observed timeouts and delays in Ollama’s responses, likely due to the limited resources of our test environment. To work around this, we extracted all the prompt messages that the PromptLock PoC ransomware would attempt to send to the Ollama API and submitted them manually to capture the responses and the requested Lua code payloads.
Figure 08 shows a snippet of the list of prompt messages that this PoC ransomware may use to orchestrate its behavior. The screenshot shows several tasks it wants to execute, such as system reconnaissance, data exfiltration, and file encryption and deletion.
Figure 08: Prompt Messages
Capturing the prompt message from the PoC code and sending it manually to Ollama to generate secure-delete Lua code allows the STRT to quickly produce potential payloads for detection testing. Figure 09 shows the Ollama response and the time it takes to process the prompt, which may vary depending on the capacity of our test environment.
Figure 09: Prompt Messages for Generating Secure Delete Lua Script
Additionally, this PoC generates four random DWORD keys that are used for file encryption. Figure 10 shows the IDA Hex-Rays view of the code responsible for generating these random DWORD values.
Figure 10: Random Keys
The four randomly generated DWORD values are used as an encryption key. The PoC then sends a prompt to Ollama to generate Lua code that encrypts the files listed in “target_file_list.log” using the SPECK block cipher in ECB mode. Figure 11 shows a snippet of the prompt message sent to Ollama to generate the encryptor payload.
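For context on what the requested encryptor does, here is a minimal Python sketch of SPECK64/128 in ECB mode, the cipher variant whose 128-bit key matches four 32-bit DWORDs. This is our own illustration under that assumption, not the LLM-generated Lua, and the example key words are arbitrary stand-ins for the random DWORDs:

```python
MASK = 0xFFFFFFFF   # 32-bit words (SPECK64/128: 64-bit block, 128-bit key)
ROUNDS = 27         # round count specified for the 64/128 variant

def _ror(x, r): return ((x >> r) | (x << (32 - r))) & MASK
def _rol(x, r): return ((x << r) | (x >> (32 - r))) & MASK

def expand_key(key_words):
    """Derive the 27 round keys from four 32-bit key words (the DWORDs)."""
    l = list(key_words[1:])
    k = [key_words[0]]
    for i in range(ROUNDS - 1):
        l.append(((k[i] + _ror(l[i], 8)) & MASK) ^ i)
        k.append(_rol(k[i], 3) ^ l[i + 3])
    return k

def encrypt_block(x, y, round_keys):
    for rk in round_keys:
        x = ((_ror(x, 8) + y) & MASK) ^ rk
        y = _rol(y, 3) ^ x
    return x, y

def decrypt_block(x, y, round_keys):
    for rk in reversed(round_keys):
        y = _ror(x ^ y, 3)
        x = _rol(((x ^ rk) - y) & MASK, 8)
    return x, y

# ECB mode: every 8-byte block is encrypted independently with the same key,
# so identical plaintext blocks yield identical ciphertext blocks
dwords = (0x03020100, 0x0B0A0908, 0x13121110, 0x1B1A1918)  # stand-in "random" DWORDs
rks = expand_key(dwords)
ct = encrypt_block(0x3B726574, 0x7475432D, rks)
assert decrypt_block(*ct, rks) == (0x3B726574, 0x7475432D)
print(tuple(hex(w) for w in ct))
```

The choice of ECB is notable for defenders: its deterministic block-by-block behavior is one of the simplest ciphers an LLM can emit correctly, but it also leaves recognizable patterns in encrypted files.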
Figure 11: Promptlock prompts to generate Encryptor.
By resubmitting the prompt to Ollama manually, the STRT team was able to capture a Lua script that could encrypt files, which we used for testing. Figure 12 shows a portion of Ollama’s response to the prompt (shown earlier in Figure 11), which generates Lua code to perform the tasks described in the prompt.
Figure 12: Promptlock File Encryptor.
Figure 13 shows the before-and-after results of the STRT’s testing with the generated Lua code for file encryption. After minor tweaks, the generated Lua script successfully encrypts the target files as intended.
Figure 13: Encryption Testing
The following analytic identifies a high frequency of file deletions by monitoring Sysmon EventCodes 23 and 26 for specific file extensions.
`sysmon` EventCode IN ("23","26") TargetFilename IN ("*.cmd", "*.ini","*.gif",
"*.jpg", "*.jpeg", "*.db", "*.ps1", "*.doc", "*.docx", "*.xls", "*.xlsx", "*.ppt",
"*.pptx", "*.bmp","*.zip", "*.rar", "*.7z", "*.chm", "*.png", "*.log", "*.vbs",
"*.js", "*.vhd", "*.bak", "*.wbcat", "*.bkf" , "*.backup*", "*.dsk", "*.win") NOT TargetFilename IN ("*\\INetCache\\Content.Outlook\\*")
| stats count min(_time) as firstTime, max(_time) as lastTime values(file_path) as file_path values(file_hash) as file_hash values(file_name) as file_name values(file_modify_time) as file_modify_time values(process_name) as process_name values(process_path) as process_path values(process_guid) as process_guid values(process_id) as process_id values(process_exec) as process_exec
by action dest dvc signature signature_id user user_id vendor_product
| where count >= 100
| `security_content_ctime(firstTime)`
| `security_content_ctime(lastTime)`
| `windows_high_file_deletion_frequency_filter`
Figure 14: Windows High File Deletion Frequency Detection
The following analytic detects the use of Windows Curl.exe to upload a file to a remote destination. It identifies command-line arguments such as `-T`, `--upload-file`, `-d`, `--data`, and `-F` in process execution logs. This activity is significant because adversaries may use Curl to exfiltrate data or upload malicious payloads.
| tstats `security_content_summariesonly` count min(_time) as firstTime max(_time) as lastTime from datamodel=Endpoint.Processes where `process_curl` Processes.process IN ("*-T *","*--upload-file *", "*-d *", "*--data *", "*-F *")
by Processes.action Processes.dest Processes.original_file_name Processes.parent_process Processes.parent_process_exec Processes.parent_process_guid Processes.parent_process_id Processes.parent_process_name Processes.parent_process_path Processes.process Processes.process_exec Processes.process_guid Processes.process_hash Processes.process_id Processes.process_integrity_level Processes.process_name
Processes.process_path Processes.user Processes.user_id Processes.vendor_product
| `drop_dm_object_name(Processes)`
| `security_content_ctime(firstTime)`
| `security_content_ctime(lastTime)`
| `windows_curl_upload_to_remote_destination_filter`
Figure 15: Windows Curl Upload to Remote Destination Detection
Overall, the PromptLock Splunk Analytic Story consists of six detections.
| SHA256 Hashes | Description |
|---|---|
| 1458b6dc98a878f237bfb3c3f354ea6e12d76e340cefe55d6a1c9c7eb64c9aee | Promptlock |
| 1612ab799df51a7f1169d3f47ea129356b42c8ad81286d05b0256f80c17d4089 | Promptlock |
PromptLock, even as a PoC, is a telling sign of how GenAI can be used to enhance and extend malicious payloads. Why build multi-stage attacks that download several files when those files can be created, crafted, and customized to a degree that was not possible before? A local model fed current information about deployed defense technologies, their patch versions, task schedules, update cycles, and user patterns would produce far more effective payloads that are harder to detect and contain.
It is fundamental to start addressing “Shadow AI” and gain visibility into these frameworks, which are becoming increasingly popular and widespread as users rush to adopt AI.
This blog helps security analysts, blue teamers and Splunk customers identify PromptLock ransomware by enabling the community to discover related tactics, techniques, and procedures used by threat actors and adversaries. You can implement the detections in this blog using the Enterprise Security Content Updates app or the Splunk Security Essentials app. To view the Splunk Threat Research Team's complete security content repository, visit research.splunk.com.
Any feedback or requests? Feel free to file an issue on GitHub and we’ll follow up. Alternatively, join us on the Slack channel #security-research. Follow these instructions if you need an invitation to our Splunk user groups on Slack.
We would like to thank Teoderick Contreras and Rod Soto for authoring this post and the entire Splunk Threat Research Team for their contributions: Michael Haag, Nasreddine Bencherchali, Lou Stella, Bhavin Patel, Eric McGinnis, Patrick Bareiss, Raven Tait and Jose Hernandez.
The world’s leading organizations rely on Splunk, a Cisco company, to continuously strengthen digital resilience with our unified security and observability platform, powered by industry-leading AI.
Our customers trust Splunk’s award-winning security and observability solutions to secure and improve the reliability of their complex digital environments, at any scale.