Visualising a Space of JA3 Signatures With Splunk

One common misconception about machine learning methodologies is that they can completely remove the need for humans to understand the data they are working with. In reality, they often place a greater burden on an analyst or engineer to ensure that their data meets the requirements for cleanliness and standardization assumed by the methodologies used. However, when the complexity of the data becomes significant, how is a human supposed to keep up? One approach is to use ML to find ways to keep a human in the loop!

Dimensionality reduction methods such as PCA, tSNE and UMAP allow us to take complex, encoded datasets and reduce them down to diagrams that allow us to bring human intuition and understanding back into our processes.
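To make the idea concrete, here is a minimal sketch of PCA, the simplest of the three methods, written in plain Python so that nothing is hidden behind a library call. The function name `pca_2d` and its parameters are my own illustrative choices, and the power-iteration approach is just one easy way to find the top two principal components; in practice you would reach for a tested library implementation instead. It assumes every row has the same length and at least two features.

```python
import math
import random


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _deflate(vector, components):
    """Remove from `vector` any part lying along already-found components."""
    for u in components:
        scale = _dot(vector, u)
        vector = [x - scale * y for x, y in zip(vector, u)]
    return vector


def _normalise(vector):
    norm = math.sqrt(_dot(vector, vector))
    return [x / norm for x in vector] if norm > 0 else vector


def pca_2d(rows, iterations=200, seed=0):
    """Project equal-length numeric rows onto their top two principal components."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]

    # Sample covariance matrix (d x d) of the centered data.
    cov = [
        [sum(c[i] * c[j] for c in centered) / max(n - 1, 1) for j in range(d)]
        for i in range(d)
    ]

    rng = random.Random(seed)
    components = []
    for _ in range(2):
        # Start from a random direction orthogonal to the components found so far.
        v = _normalise(_deflate([rng.uniform(-1, 1) for _ in range(d)], components))
        for _ in range(iterations):
            # Power iteration: repeatedly multiply by the covariance matrix.
            w = _deflate([_dot(row, v) for row in cov], components)
            if math.sqrt(_dot(w, w)) < 1e-12:
                break  # no variance left in this direction
            v = _normalise(w)
        components.append(v)

    # Coordinates of each row in the 2D space spanned by the two components.
    return [[_dot(c, u) for u in components] for c in centered]
```

Calling `pca_2d(rows)` on a list of numeric feature vectors returns one `[x, y]` pair per row, ready to plot. PCA is linear, so it will not untangle structure the way tSNE or UMAP can, but it shows the basic move: many columns in, two columns out.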

In January at SANS CyberThreat2022(3), I will explain how these techniques can be applied to JA3 TLS signatures. Collecting TLS signatures can help you keep track of known, unknown and malicious software. In addition to this presentation, I'm working with the SURGe team at Splunk to build on our work investigating the use of JA3 signatures to mitigate supply chain attacks.

In short, these dimensionality reduction techniques allow us to take a set of JA3 hashes, along with some of the information that makes up these signatures, and turn them into a map showing the space of software communications in a dataset:

In applying tSNE to generate this Petri dish-like representation of JA3 signatures from the dataset available at ja3er.com, we see a number of structures that emerge when we plot these signatures in a 2D space. Every blue point in this diagram is a unique signature. Many signatures together form the clouds and clusters seen in this diagram. Signatures that are similar are close together and those that are different are forced apart, creating a simple and intuitive 2D representation of a very complicated dataset!
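Before any of these methods can place a signature on a map, the signature has to become a row of numbers. The sketch below shows one possible encoding, and it is an assumption on my part rather than the exact preprocessing used for the plots in this post. It relies on the standard JA3 layout: five comma-separated fields (TLS version, ciphers, extensions, elliptic curves, point formats), each a dash-separated list of integers, with the JA3 hash being the MD5 of that string. The helper names (`parse_ja3`, `build_vocabulary`, `encode`) are hypothetical.

```python
import hashlib

JA3_FIELDS = ("tls_version", "ciphers", "extensions", "curves", "point_formats")


def parse_ja3(ja3_string):
    """Split a raw JA3 string into its five fields, each a list of integers."""
    parts = ja3_string.split(",")
    if len(parts) != len(JA3_FIELDS):
        raise ValueError("expected 5 comma-separated fields")
    return {
        name: [int(value) for value in part.split("-") if value]
        for name, part in zip(JA3_FIELDS, parts)
    }


def ja3_hash(ja3_string):
    """The JA3 hash is the MD5 digest of the raw JA3 string."""
    return hashlib.md5(ja3_string.encode("ascii")).hexdigest()


def build_vocabulary(parsed_signatures):
    """Assign a column index to every (field, value) pair seen in the dataset."""
    tokens = sorted(
        {
            (name, value)
            for signature in parsed_signatures
            for name, values in signature.items()
            for value in values
        }
    )
    return {token: index for index, token in enumerate(tokens)}


def encode(parsed, vocabulary):
    """Multi-hot encode one parsed signature: 1 where a (field, value) is present."""
    vector = [0] * len(vocabulary)
    for name, values in parsed.items():
        for value in values:
            index = vocabulary.get((name, value))
            if index is not None:
                vector[index] = 1
    return vector


# Illustrative JA3 strings, made up for this example rather than taken from any dataset.
raw_signatures = [
    "771,4865-4866-4867,0-23-65281-10-11,29-23-24,0",
    "771,4865-49195,0-10-11,29-23,0",
]
parsed = [parse_ja3(raw) for raw in raw_signatures]
vocabulary = build_vocabulary(parsed)
features = [encode(p, vocabulary) for p in parsed]
```

Two signatures that share most of their ciphers and extensions end up with mostly identical vectors, which is what lets the reduction step put them close together on the map. Note that this encoding ignores the order of ciphers and extensions; keeping the order would need a different representation.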

By pulling in some labels for this space, we can start to identify regions of this map where malicious software congregates and use this as a visual aid when threat-hunting or observing new and recurring traffic in our environment. This diagram shows some labeled malicious JA3 signatures (red) against the ja3er.com dataset.

So, if we see lots of activity near these malicious points in the future, that might be worth examining, since those communications will share a lot of the same structure and features as these malicious communications.
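One rough way to turn "near these malicious points" into something you can alert on is to measure the distance from each observed signature to its closest labeled malicious neighbour in the 2D map. The sketch below assumes you already have 2D coordinates for both sets from a single embedding run (distances from separate tSNE runs are not comparable), and the function names and the `radius` threshold are hypothetical choices that would need tuning against your own data.

```python
import math


def nearest_malicious(point, malicious_points):
    """Return (distance, label) for the closest labeled malicious point."""
    return min(
        (math.dist(point, coords), label)
        for label, coords in malicious_points.items()
    )


def flag_nearby(observed, malicious_points, radius):
    """List observed signatures within `radius` of any malicious point, closest first."""
    flagged = []
    for name, point in observed.items():
        distance, label = nearest_malicious(point, malicious_points)
        if distance <= radius:
            flagged.append((name, label, distance))
    return sorted(flagged, key=lambda item: item[2])


# Hypothetical 2D coordinates, purely for illustration.
malicious = {"mal_a": (0.0, 0.0), "mal_b": (10.0, 10.0)}
observed = {"sig_1": (0.3, 0.4), "sig_2": (5.0, 5.0), "sig_3": (10.0, 9.0)}
for name, label, distance in flag_nearby(observed, malicious, radius=1.0):
    print(f"{name} is {distance:.2f} from {label}")
```

Treat anything this flags as a prompt to look closer rather than a verdict: closeness on the map means shared TLS features, not confirmed malice.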

It’s also possible to generate maps of smaller spaces where we compare and contrast the behaviors of multiple hosts. The following example uses UMAP to visualize the clusters of behaviors seen across five different hosts on a single day. Points in clusters or close to others represent either identical or very similar JA3 signatures, and we can clearly see anomalous behavior on the green host as it sits in its own separate cluster. Could it be that this host is using different, unpatched, out-of-date or malicious software? Time to investigate!
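If you want a quick numeric sanity check to sit alongside a plot like this, a much simpler technique than UMAP is to compare the sets of JA3 hashes seen on each host using Jaccard similarity. This is not how the map above was produced; it is a swapped-in shortcut that only catches exact hash overlap, and the host names, placeholder hashes and function names below are all hypothetical.

```python
def jaccard(a, b):
    """Jaccard similarity of two sets: size of intersection over size of union."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def most_isolated_host(host_signatures):
    """Return the host with the lowest mean similarity to all others, plus all scores."""
    scores = {}
    for host, signatures in host_signatures.items():
        similarities = [
            jaccard(signatures, other)
            for name, other in host_signatures.items()
            if name != host
        ]
        scores[host] = sum(similarities) / len(similarities) if similarities else 1.0
    return min(scores, key=scores.get), scores


# Placeholder values standing in for the JA3 hashes seen on each host in one day.
hosts = {
    "host_blue": {"ja3_a", "ja3_b", "ja3_c"},
    "host_red": {"ja3_a", "ja3_b", "ja3_d"},
    "host_orange": {"ja3_a", "ja3_b", "ja3_c"},
    "host_green": {"ja3_x", "ja3_y"},
}
outlier, scores = most_isolated_host(hosts)
print(outlier, scores[outlier])
```

Because it only counts identical hashes, this check treats two nearly identical signatures as completely different, which is exactly the gap that plotting the underlying features with UMAP or tSNE helps to close.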

OK, cool. But what can I do with this in Splunk?

I’ve implemented an example that uses JA3 signatures to classify host TLS behaviors in the latest version of Splunk’s App for Data Science and Deep Learning (DSDL). So feel free to grab it and take a look.

However, I believe that these sorts of advanced dimensionality reduction techniques are likely to be useful well beyond this simple example. We can hopefully take some of the more general but very complex datasets we see often in security and make them far more accessible. If you’d like to dig in further or just chat about what’s possible, please feel free to reach out to me on LinkedIn.
