SECURITY

Finding NEW Evil: Detecting New Domains with Splunk


This blog post is part thirteen of the "Hunting with Splunk: The Basics" series. We might be abusing the term “basics” a titch more than usual, but I believe this subject is vital for organizations. I’m very proud of this piece that Andrew wrote, and I hope y'all like it as much as I do. – Ryan Kovar

The Problem

In this installment of “Hunting with Splunk: The Basics,” we’re going to look at how to detect suspicious and potentially malicious network traffic to “new” domains. First, let’s delve into what we mean by “new” domains and why you should make a habit of detecting this activity in the first place.

Users (and applications) are creatures of habit. If your organization is typical, the domains that are requested from your network today have a tremendous amount of overlap with yesterday’s requests. If I look at my browsing, I visit the same 20 or so websites on a daily basis. After all, I can’t miss today’s xkcd or Dilbert! Applications making network requests from my laptop, phone, and other systems generally hit the same domains day in and day out as well.

But what about the small percentage of Internet domains that were requested from my network today, but were not previous destinations for my systems? This is what I mean by “new” domains. Sure, there’s going to be some legitimate traffic going to a few domains today that haven’t been seen on the network before, but it’s likely to be a small percentage of the overall set of domains. The remainder of these new, never-seen-before domains represent a potential threat.

Why should you consider network activity to new domains to be suspicious and potentially malicious? Malware and malicious actors regularly use domains they own or control for a variety of nefarious purposes. For example, an attacker-controlled domain can be used as a hub for command and control communications, while another domain is used for data exfiltration. These domains can leverage dynamic domains, subdomains created with domain generation algorithms (DGAs), more legitimate-looking, human-readable domains, and other techniques. These are all standard techniques seen across a wide variety of modern attacks. By hunting for these new domains, we can increase the chances of finding threats and then quickly shift to investigation and mitigation.

Data Required

With this backdrop, let’s discuss what data is needed. The short answer: any data in Splunk with a field containing network requests to external domains will do. This could be data that neatly parses out a domain field; alternatively, we can extract domains from URLs ourselves.

Perhaps the best place to look for this data is in your web proxy logs. If you have this data in Splunk, you already have a massive repository of URLs being requested from your network. Use the free Splunkbase app URL Toolbox to extract domains from a URL.

Another good source of network traffic with domain requests is DNS data. You can get this from your outward-facing DNS servers, or use a wire-data collection tool like Splunk Stream to pull this data off the wire in JSON format.
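If you go the DNS route, the same pattern applies with the queried name in place of a URL. As a rough sketch (the sourcetype and field names here, such as stream:dns and query, are assumptions that will vary by environment):

sourcetype=stream:dns query=*
| eval list="mozilla"
| `ut_parse_extended(query,list)`
| stats earliest(_time) as earliest latest(_time) as latest by ut_domain

Whatever the source, once you have a ut_domain field, everything that follows works the same way.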

The Approach

With a tremendous amount of thanks to SPL guru David Veuve, let’s dive into some SPL ideas to hunt for these new domains. Make sure you’ve installed the free Splunk app URL Toolbox for these searches to work. Start by validating that you have data by pulling a list of domains with the earliest and latest times we’ve seen them from our proxy logs within the last 15 minutes:

Here’s the search; let’s go through how it works line by line:

tag=web url=*
| eval list="mozilla"
| `ut_parse_extended(url,list)`
| stats earliest(_time) as earliest latest(_time) as latest by ut_domain

The first line brings back our proxy data (checking for a value in the URL field). You may have to change this in your environment by limiting it to a specific index and/or sourcetype instead of—or in addition to—using the web tag. You can also optimize your search here by filtering noisy domains like content delivery networks, IPs and/or IP ranges.
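For example, a lightly filtered version of that first line might look like the following (the index name and excluded domains are purely illustrative; substitute the noisy destinations you actually see in your environment):

index=proxy tag=web url=* NOT url=*cloudfront.net* NOT url=*akamaiedge.net*
| eval list="mozilla"
| `ut_parse_extended(url,list)`
| stats earliest(_time) as earliest latest(_time) as latest by ut_domain

Every domain you can safely exclude up front reduces both the runtime of the search and the noise in your results.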

The eval command creates a new field which we will pass to the URL Toolbox macro on the following line. It tells the macro in what format to expect the URL. The macro itself takes two parameters—the name of the field containing a URL (in this case, “url”) and the format of the URL (in this case, “mozilla”). Note that as with all macros in Splunk, these are `back ticks` and not 'single quotes.'

The stats command simply creates a table with the most recent time (latest) and the first time (earliest) we’ve seen each domain in our dataset, grouped by the ut_domain value extracted when the macro executed.

With any luck, your data should look somewhat like mine at this point. To find today’s “new domains,” compare today’s domain requests to a baseline of the previous 6 days. This is easy enough to do by setting our timeframe for 7 days and expanding our previous search:

tag=web url=*
| eval list="mozilla"
| `ut_parse_extended(url,list)`
| stats earliest(_time) as earliest latest(_time) as latest by ut_domain
| eval isOutlier=if(earliest >= relative_time(now(), "-1d@d"), 1, 0)
| convert ctime(earliest) ctime(latest)

The first 4 lines are the same as our original search. The 5th line is where the magic happens. The eval command creates a new field called “isOutlier.” This command uses an if() function to determine if the earliest (first) time we’ve seen this domain in the dataset was within the last day (using the now() and relative_time() functions available in eval). The final line uses the convert command with the ctime() function to make the time field human readable.
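If the snap-to syntax is new to you, relative_time(now(), "-1d@d") subtracts one day from the current time and then snaps the result down to midnight (@d)—i.e., the start of yesterday. You can see the cutoff it produces with a quick standalone search (makeresults, relative_time, and strftime are all standard SPL):

| makeresults
| eval cutoff=relative_time(now(), "-1d@d")
| eval cutoff_readable=strftime(cutoff, "%Y-%m-%d %H:%M:%S")

Any domain whose earliest value is at or after that cutoff gets flagged with isOutlier=1.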

At this point, we can sort on the isOutlier field (click the column heading) to find our new domains. Alternatively, we can add | where isOutlier=1 to return only the new domains. If we wanted an alert, we could save the search after adding the where command and be notified when new domains are found.
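Putting that together, the alert-ready version of the search is simply:

tag=web url=*
| eval list="mozilla"
| `ut_parse_extended(url,list)`
| stats earliest(_time) as earliest latest(_time) as latest by ut_domain
| eval isOutlier=if(earliest >= relative_time(now(), "-1d@d"), 1, 0)
| convert ctime(earliest) ctime(latest)
| where isOutlier=1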

While this search does get the job done, it may not be optimal over the long term. First—as you might have noticed if you tried to run it—it can be slow. Second, we need to pull back 7 days' worth of data every time we run it, and even then, our view is limited because we are only comparing today to the previous six days. What if last week included a federal holiday? Or if everyone decided to attend .conf18? This might not be a large enough baseline to avoid false positives.

Operationalizing and Tuning

Luckily with the power of Splunk we can solve these problems in a couple of ways. Basically, we are going to use Splunk’s lookup functionality to create a cache of previously seen domains and then run the search for new domains across a much smaller set of data, comparing it to the cache and updating the cache at the same time.

We will start by using Splunk to create an initial baseline cache for the previous 30 days and write it to a CSV lookup file in Splunk. This search will likely take a while to run, but after you’ve run it once, you won’t have to do this again. Our baseline-populating search looks something like this:

tag=web url=*
| eval list="mozilla"
| `ut_parse_extended(url,list)`
| stats earliest(_time) as earliest latest(_time) as latest by ut_domain
| outputlookup previously_seen_domains.csv

This looks very similar to our very first search, but now we are going to write out a CSV file that we’re going to use in the next search.

Once we have the baseline, we can create a search that compares the domains requested in the previous 15 minutes to the baseline. The search will update the CSV file with the new data (updating earliest and latest times for previously seen domains, and adding rows for new ones), while at the same time flagging any outliers. It will look something like this:

tag=web url=* earliest=-15m
| eval list="mozilla"
| `ut_parse_extended(url,list)`
| stats earliest(_time) as earliest latest(_time) as latest by ut_domain
| inputlookup append=t previously_seen_domains.csv
| stats min(earliest) as earliest max(latest) as latest by ut_domain
| outputlookup previously_seen_domains.csv
| eval isOutlier=if(earliest >= relative_time(now(), "-1d@d"), 1, 0)
| convert ctime(earliest) ctime(latest)
| where isOutlier=1

The first 4 lines work as before, except this time we are limiting the search to the previous 15 minutes. The idea is that we will run this search as a correlation or alert search, and if Splunk returns any hits, we can use the Adaptive Response or alerting frameworks to take action, such as updating a risk score in Splunk Enterprise Security or kicking off a workflow.

The inputlookup command on the 5th line uses the append flag to retrieve the CSV file we created in our baseline step and add it to our data set from the last 15 minutes. We then use the stats command on the 6th line to take the “earliest earliest” and “latest latest” time for each domain across the combined data from the previous 15 minutes and the baseline domain list. When this stats command groups by ut_domain, domains already in the CSV get merged values (min() preserves the original first-seen time while max() refreshes the last-seen time), and domains not previously in the CSV are added with their earliest and latest seen times.

The remainder of the search simply writes the updated table to the same CSV lookup file, flags the outliers (new domains) as before, and cleans up the time formatting. Since we are alerting, the where command filters for just the outliers so you can take action on them.
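If you prefer to configure the schedule directly, a savedsearches.conf stanza for this alert might look something like this (the stanza name is a placeholder, and you would add whatever alert actions fit your workflow):

[New Domain Detected - Proxy]
enableSched = 1
cron_schedule = */15 * * * *
dispatch.earliest_time = -15m
dispatch.latest_time = now
counttype = number of events
relation = greater than
quantity = 0
search = tag=web url=* | eval list="mozilla" | `ut_parse_extended(url,list)` | stats earliest(_time) as earliest latest(_time) as latest by ut_domain | inputlookup append=t previously_seen_domains.csv | stats min(earliest) as earliest max(latest) as latest by ut_domain | outputlookup previously_seen_domains.csv | eval isOutlier=if(earliest >= relative_time(now(), "-1d@d"), 1, 0) | convert ctime(earliest) ctime(latest) | where isOutlier=1

Note that because outputlookup both writes the lookup and passes results along the pipeline, the cache is refreshed on every run even when no outliers fire.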

Let's TSTAT that Search

Another way to optimize this search is to apply CIM-compliant accelerated data models to the search. All of the same principles from our previous searches apply, but we’re going to take advantage of the speed of tstats (Not familiar with tstats? Check out this response on Splunk Answers).

Assuming you have CIM-compliant data and populated data models, you can test the search by manually running it across 7 days like this:

| tstats count from datamodel=Web by Web.url _time
| rename "Web.url" as "uri"
| eval list="mozilla"
| `ut_parse_extended(uri,list)`
| stats earliest(_time) as earliest latest(_time) as latest by ut_domain
| eval isOutlier=if(earliest >= relative_time(now(),"-1d@d"), 1, 0)
| convert ctime(earliest) ctime(latest)

The difference in using tstats versus our other search occurs in the first two lines of the search. The search uses the tstats command, which is very fast for accelerated data. We then rename the default “Web.url” field to “uri” before passing it to the macro. The rest is exactly the same, but it runs MUCH faster.

To operationalize it with lookups, as above, we just need to make a few changes. The initial lookup populating search will look like this:

| tstats count from datamodel=Web by Web.url _time
| rename "Web.url" as "uri"
| eval list="mozilla"
| `ut_parse_extended(uri,list)`
| stats earliest(_time) as earliest latest(_time) as latest by ut_domain
| outputlookup previously_seen_domains.csv

Similarly, an operationalized search would run every 15 minutes or so, using the lookup file to expand our time range and improve performance like this:

| tstats count from datamodel=Web by Web.url _time
| rename "Web.url" as "uri"
| eval list="mozilla"
| `ut_parse_extended(uri,list)`
| stats earliest(_time) as earliest latest(_time) as latest by ut_domain
| inputlookup append=t previously_seen_domains.csv
| stats min(earliest) as earliest max(latest) as latest by ut_domain
| outputlookup previously_seen_domains.csv
| eval isOutlier=if(earliest >= relative_time(now(), "-1d@d"), 1, 0)
| convert ctime(earliest) ctime(latest)
| where isOutlier=1

Hopefully, this gives you a number of methods to hunt for new domains and perhaps provides other ideas for more “first seen” threat hunting. For further exploration of these concepts, I’d strongly recommend checking out the awesome Splunk Security Essentials app on Splunkbase.

Posted by Andrew Dauria