
This blog post is part eighteen of the "Hunting with Splunk: The Basics" series. Quickly observing your surroundings, orientating yourself by those observations, and acting upon them is an essential skill. You should look up the OODA loop if you are interested. :-) John Stoner gives an excellent overview of some tips and tricks for finding your way around a new environment, whether it's in BOTS or a new job. – Ryan Kovar
Congratulations to you for signing up for Boss of the SOC (BOTS) at .conf18! Wait, are you saying that everyone reading this hasn't signed up for BOTS? Inconceivable!
Well, if you haven’t, that’s okay. This blog post isn't just for those coming to .conf18 and playing BOTS version 3 on Monday, October 1st. This post is for anyone who is interested in getting some tips and tricks around threat hunting; these concepts can be used by those heading to Orlando for .conf18, by users playing other versions of BOTS, or by those who are hunting on a daily basis.
This blog post is broken into six sections, starting with the thought process that goes into a hunt. We move to narrowing down to the time and data of interest, applying the appropriate searches, and then wrap up by using external information where appropriate to arrive at your answer:
- Process of a Hunt
- Focusing My Hunt
- Data Sources & Context
- Searching in Splunk
- Splunk Commands
- Open Source Intelligence
For the purpose of this post, we're operating under the assumption that the data is already in Splunk. There are many articles written about getting data into Splunk, so this is focused on the analyst getting information back OUT.
Process
At the start of a hunt, I want to establish an objective. In the BOTS competition, we provide this in the form of questions to answer that will guide what and where you should hunt. For those not doing BOTS, I recommend developing a hypothesis to guide your hunt. If you aren't sure what to hunt for, take a look at the MITRE ATT&CK framework as a way to establish guardrails around your hunt. If I hypothesize that PowerShell is running on my Windows systems, that gives me a technique to focus my hunt on, so I don't get caught looking at other bright shiny objects.
That said, if you do find other bright shiny objects during your hunt, take note of them and then use them to build hypotheses for subsequent hunts. Taking those notes is also crucial for BOTS, so bring a pen and notebook!
Focusing My Hunt
When I look at my Splunk console, I may have hundreds of data sources (sourcetypes) stretching over days, weeks, months or years. One of the first steps I need to take is to narrow that very broad field of data and time to something more specific. That doesn’t mean I won’t need to pivot back to a broader search, but to be effective, I need to start narrowing my focus.
How do we focus? Let’s start with time. In Splunk, on the right side of the search bar, a drop-down allows me to set the time range that the search will run within. Clicking on the drop-down returns a number of time presets, as well as the ability to search specific date and time ranges. Using the Time Picker in searches will be incredibly important during any hunt, including BOTS.
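The Time Picker is not the only way to bound a search. As a minimal sketch, the earliest and latest time modifiers set the window inside the search string itself, which makes a search easier to paste into your notes and rerun. The sourcetype and the date here are assumptions borrowed from the Sysmon example later in this post; swap in your own. An inline window like this takes precedence over whatever the Time Picker is set to.

```spl
sourcetype="xmlwineventlog:microsoft-windows-sysmon/operational" earliest="08/23/2017:00:00:00" latest="08/24/2017:00:00:00"
```

Relative values work too: earliest=-24h@h latest=now covers roughly the last day, snapped to the hour.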
To effectively focus on specific data sources—or sourcetypes—I need to understand what sourcetypes are available. To quickly determine the sourcetypes available, I can use the metadata command like this:
| metadata type=sourcetypes | sort - totalCount
My search provides a list of the sourcetypes, the number of events based on the time range and the first, last and recent time seen. For more information, check out this wonderful blog post by Mickey Perre, "MetaData > MetaLore."
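The first, last and recent times come back as epoch seconds, which are hard to read at a glance. One possible variation converts them to readable timestamps. The index=* argument is my addition, on the assumption that you want to look across every index you are allowed to search:

```spl
| metadata type=sourcetypes index=*
| convert ctime(firstTime) ctime(lastTime) ctime(recentTime)
| sort - totalCount
```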
Data Sources
Now that I have a hypothesis (or a question to answer), a time boundary and sourcetypes, I can start digging into the data. What kind of data should I focus on? It depends on the hypothesis or question being asked. In my earlier example, if I'm hunting PowerShell, I probably want to focus on host-based data sources like Microsoft Event Logs and/or Microsoft Sysmon. That isn’t to say that I won’t end up looking at network data sources, but it'll help me initially focus my hunt.
If, on the other hand, I'm hunting for indications of exfiltration using data compression, I might start by focusing on network data sources. Network data sources can help me determine what data was sent and in which direction. Whether data is flowing to my cloud provider or from my servers to my workstations is an important piece of information to gather. Network data sources can include firewalls and web proxies as well as wire data; wire data can be seen in the form of Splunk for Stream, which is broken out by network protocols including TCP, HTTP, SMTP, DNS and many more. Your organization may not be running Splunk for Stream, but you may have PCAP data or Bro, and these data sets can provide other valuable insight into the specific protocols operating on your network.
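To make the PowerShell example concrete, a broad first pass might look like the sketch below. It assumes Sysmon process creation events (Event ID 1) are being collected and that the add-on in use extracts fields named EventCode, host, user, ParentImage and Image. Field names vary by add-on version, so check Interesting Fields before relying on them.

```spl
sourcetype="xmlwineventlog:microsoft-windows-sysmon/operational" EventCode=1 powershell
| stats count by host, user, ParentImage, Image
| sort - count
```

Keep in mind that stats drops any event that is missing one of the by fields, so an empty table may mean a field name is wrong, not that PowerShell never ran.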
Context
In addition to log events, I want contextual data to better understand the network, systems and users. Asset and identity data provide context, including who owns specific systems and the departments users work in. That context may provide a clue that an individual’s workstation connecting to a specific server is suspect. Understanding where systems reside as well as their addressing is crucial for hunting. If I see activity in my workstation address space but don’t recognize that the source IP is part of my enterprise, I can waste precious time hunting for activity from a source that doesn’t pose a threat to my systems.
Threat intelligence can be helpful, particularly if I get external indicators that can be hunted for in the context of my environment. That said, if I find these indicators, it may indicate that my organization won't have a great day! Within Splunk Enterprise Security, I can save a copy of my network topology using Glass Tables so context is immediately at my fingertips when I need it.
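A quick way to apply that address context in a search is to label each source address as inside or outside the enterprise. This sketch assumes the enterprise uses 10.0.0.0/8 and that firewall events carry a src_ip field, as in the pan:traffic search later in this post; both are placeholders for your own network.

```spl
sourcetype="pan:traffic"
| eval src_scope=if(cidrmatch("10.0.0.0/8", src_ip), "enterprise", "external")
| stats count by src_scope, src_ip
| sort - count
```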
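If a threat intelligence report hands me a short list of indicators, a simple first check is to search for them directly. The addresses below are documentation-range placeholders, not real indicators, and the sourcetype and field names are assumptions carried over from the firewall search later in this post:

```spl
sourcetype="pan:traffic" dest_ip IN (203.0.113.10, 198.51.100.25)
| stats count by src_ip, dest_ip
| sort - count
```

For longer lists, a lookup file is usually easier to maintain than an inline list.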
Searching in Splunk
Now that we have data and context, and can narrow our time frame, let’s look at Splunk searches. I can execute unstructured or structured searches in Splunk and get results. When hunting, I want to make my search broad initially unless I know precisely what I'm looking for. I don’t want to write the most beautiful Splunk search (and if you've ever seen my searches, you know that won’t likely happen...) and not get any results back; I'd rather start broad and then refine my search to tighten my net. I can review my search results and use the Selected Fields and Interesting Fields on the left side of the screen to review specific field values as well as pivot on specific fields to refine my search.
In this example, we're searching for events on August 23, 2017, and searching our Microsoft Sysmon data.
sourcetype="xmlwineventlog:microsoft-windows-sysmon/operational"
My search returns over 40K events, but by using the fields available to me, I can narrow my search dramatically if I'm hunting for activity that Amber Turing is performing. I can do this on multiple fields just by pointing and clicking!
sourcetype="xmlwineventlog:microsoft-windows-sysmon/operational" user="FROTHLY\\amber.turing"
From my results, I can see that Amber seems to be running tor.exe on her system. Interesting. I can then start using the awesome Splunk transformational commands to finesse my data. What's a transformational command? These commands take the output of a search and transform it. Some are as simple as sort or tail; others, like stats, eval, transaction and rex, can perform calculations and comparisons.
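As a small illustration of transforming results, this sketch summarizes Amber's Sysmon events by process. It assumes the executable path is extracted into a field named Image, which depends on the add-on in use. The results are sorted in ascending order so rarely seen processes rise to the top, which is one way something like tor.exe could stand out from routine activity:

```spl
sourcetype="xmlwineventlog:microsoft-windows-sysmon/operational" user="FROTHLY\\amber.turing"
| stats count min(_time) AS first_seen max(_time) AS last_seen by Image
| convert ctime(first_seen) ctime(last_seen)
| sort count
```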
Handy Command References
Splunk publishes a helpful command reference, which I always keep near, that should be leveraged during BOTS and your hunts! If you aren’t familiar with the commands in Splunk and you generally use keywords for searching, no worries! The "Hunting with Splunk: The Basics" blog series has you covered. This series of 17 (and still growing) blogs covers hunting from a technique perspective, but also from a Splunk command viewpoint, and provides example syntax that can be leveraged. That said, if I were stuck on a desert island with only two Splunk commands, I would start with stats and eval because they're so powerful. Here's an example of using both of them in concert with one another.
sourcetype="pan:traffic" (src_ip=10.0.2.101 OR dest_ip=10.0.2.101)
| stats count AS event_count sum(bytes_in) AS bytes_in sum(bytes_out) AS bytes_out sum(bytes) AS bytes_total by src_ip dest_ip
| eval mb_in=round((bytes_in/1024/1024),2)
| eval mb_out=round((bytes_out/1024/1024),2)
| eval mb_total=round((bytes_total/1024/1024),2)
| fields - bytes*
| sort - mb_total
| head 10
In this example, I want to see what communication paths existed between Amber’s system and other systems. Because I have contextual information, I know her IP address is 10.0.2.101, so my initial search looks at the firewall data with her IP address as either the source or the destination. I use the stats command to sum the bytes_in, bytes_out and bytes fields, and to generate a count of events for each unique combination of source and destination addresses. The eval command then creates new fields that express each byte total in megabytes (bytes divided by 1024 twice), rounded to two decimal places.
I could stop there because I said those two commands were my favorites, but I'll throw a few extra commands in to show you what I can do from there. I use the fields command to exclude the original byte fields from my result set. I then sort on the mb_total field from largest to smallest and use head to return the top 10 results. With that, I have a top 10 talkers list between a system of interest and the rest of the world. Pretty cool, huh?
Open Source Intelligence (OSINT)
The last important component to keep in mind when competing in BOTS or going hunting is OSINT. My favorite OSINT site starts with the letter G. Anyone? Google.com. That’s right, Google is an often-underused weapon when hunting. I don’t know about you, but I just can’t seem to remember all 1000+ Windows Event codes, so being able to quickly search for this kind of information is invaluable.
After Google, sites like VirusTotal and RiskIQ can be very helpful for researching malware and passive DNS, respectively. While WHOIS data is undergoing some changes with the implementation of GDPR, there are still many other OSINT data sets that are helpful. One other site that may be useful is Censys.IO, particularly if I am trying to correlate SSL certificates to adversary infrastructure.
Wow, we covered a lot of ground in a short time on techniques for hunting with Splunk that you can use for BOTS as well as back at the office. If you're interested in hunting on some datasets to keep your skills sharp, try out some new techniques or just practice your Splunk search skills, you can head to the Splunk Security Dataset Project to register and hunt in your own sandbox. If you decide to put the BOTS v1 data into its own instance, the Boss of the SOC Investigation Workshop for Splunk app can be installed to practice some of the techniques used to answer BOTS questions. This app is already embedded in the dataset project.
I hope this was helpful and that these tips will help your daily threat hunting as well as your enjoyment of BOTS. Best of luck to you all competing at .conf18!
----------------------------------------------------
Thanks!
John Stoner