Are all my Microsoft Servers being Splunked?

I recently got asked a question: how can I tell if all my Microsoft servers are being Splunked? It's an interesting question, and one that takes a little bit of effort to answer. But we have all the pieces, so let’s take a look at what it would take. First off, let’s assume that by “Is a server being Splunked?”, we mean that the server in question has a universal forwarder on it, is hooked into a deployment server, and is sending events to an indexer. For the searches below to work, the events from all of these pieces need to land within the same Splunk environment.

To answer this question, we need three pieces of information:

  • A list of all servers within the Microsoft Active Directory domain
  • The last time each server sent an event to an indexer
  • The last time each server checked into the deployment server

Let’s tackle each one in turn, starting with the last time each server sent us an event. We have a search command for that called metadata. This search command returns a table of information about each host in an index. If we use the _internal index, which contains Splunk's own logs, then we can get a good approximation of the last time each forwarder sent us anything. Our command is:

| metadata type=hosts index=_internal | table host,lastTime

We can also utilize the _internal logs for the deployment server. The deployment server writes out a log entry to a component called DeploymentMetrics every time a server checks in with it. We can use this to find out all sorts of useful information – most of it is available when you run splunk list deploy-clients on your deployment server. However, the data is indexed, so we just need a simple search and stats command:

index=_internal sourcetype=splunkd component=DeploymentMetrics | stats latest(_time) as lastPollTime,latest(status) as status,latest(ip) as ip, latest(build) as build by hostname

The status will normally be ok unless an error occurred. The build is a six-digit number that is specific to the build of Splunk being used and is embedded in the filename of the Splunk Universal Forwarder that you download.
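
For readers less familiar with stats, here is a minimal Python sketch of what "latest(...) by hostname" is doing: for each hostname it keeps the most recent value seen for each field. The sample events, the "failed" status value, the build number, and the latest_by_hostname helper are all made up for illustration; real DeploymentMetrics events carry more fields than this.

```python
def latest_by_hostname(events, fields=("status", "ip", "build")):
    """Keep, per hostname, the newest check-in time and newest value of each field."""
    summary = {}
    # Walk the events oldest-first so later events overwrite earlier ones.
    for event in sorted(events, key=lambda e: e["_time"]):
        row = summary.setdefault(event["hostname"], {})
        row["lastPollTime"] = event["_time"]
        for field in fields:
            if field in event:
                row[field] = event[field]
    return summary


# Hypothetical check-in events (times are epoch seconds).
events = [
    {"_time": 1000, "hostname": "WEB01", "status": "ok", "ip": "10.0.0.5", "build": "123456"},
    {"_time": 1600, "hostname": "WEB01", "status": "failed", "ip": "10.0.0.5", "build": "123456"},
    {"_time": 1300, "hostname": "DB01", "status": "ok", "ip": "10.0.0.9", "build": "123456"},
]

summary = latest_by_hostname(events)
for hostname, row in summary.items():
    print(hostname, row)
```

WEB01 ends up with the status from its most recent check-in, not its first one, which is the behaviour we rely on when looking for deployment client errors.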

Our final piece of information needed is the list of Microsoft servers. We have a Splunk Addon for querying Active Directory called SA-ldapsearch. Once configured, it can provide the results of any LDAP search that we want to execute against Active Directory. In this case, we want to get a list of all computers that have been bound to the domain and have Server in their operating system field:

|ldapsearch domain=SHELL search="(&(operatingSystem=*Server*)(objectCategory=computer))" attrs="CN,operatingSystem"

In this case, SHELL is my domain, so make sure you replace that with your domain. Now, let’s put it all together:

| ldapsearch domain=SHELL search="(&(operatingSystem=*Server*)(objectCategory=computer))" attrs="CN,operatingSystem" | join type=outer cn [| metadata type=hosts index=_internal | table host,lastTime | rename host as cn] | join type=outer cn [search index=_internal sourcetype=splunkd component=DeploymentMetrics | stats latest(status) as status,latest(ip) as ip,latest(build) as build by hostname | rename hostname as cn]
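
If the two outer joins are hard to picture, the following Python sketch shows the idea under simplified assumptions: the Active Directory list is the left-hand side, every server in it is kept, and the forwarder and deployment server fields are filled in only where a row with the same cn exists. The sample rows and the left_outer_join helper are hypothetical and exist only to illustrate the shape of the result.

```python
def left_outer_join(left_rows, right_rows, key="cn"):
    """Keep every left row; copy in fields from the matching right row, if any."""
    lookup = {row[key]: row for row in right_rows}
    joined = []
    for row in left_rows:
        merged = dict(row)
        match = lookup.get(row[key], {})
        merged.update({k: v for k, v in match.items() if k != key})
        joined.append(merged)
    return joined


# Hypothetical results of the three searches, already renamed to share "cn".
ad_servers = [
    {"cn": "WEB01", "operatingSystem": "Windows Server 2012 R2 Standard"},
    {"cn": "DB01", "operatingSystem": "Windows Server 2012 R2 Standard"},
    {"cn": "FILE01", "operatingSystem": "Windows Server 2008 R2 Standard"},
]
forwarders = [
    {"cn": "WEB01", "lastTime": 1700000000},
    {"cn": "DB01", "lastTime": 1699990000},
]
deployment = [
    {"cn": "WEB01", "status": "ok", "ip": "10.0.0.5", "build": "123456"},
]

coverage = left_outer_join(left_outer_join(ad_servers, forwarders), deployment)
for row in coverage:
    print(row)
```

In this sketch FILE01 comes back with no lastTime and no status, and DB01 has a lastTime but no status. Those gaps are exactly what the follow-up searches look for. Note that the sketch matches names exactly, so if host names differ in case between Active Directory and Splunk the rows would not line up; normalising both sides with lower() is one option.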

I normally run this over the last 24 hours, but a window as short as 15 minutes can give correct results. I also put this into a macro (SplunkServerCoverage in the examples below) so that I can run it easily. Note that when you put this search in a macro, you need to remove the first pipe from the macro (so your macro starts with ldapsearch...), then add the initial pipe back into the search command you enter, like those I’ve provided below.

What can we do with this? How about finding out which Microsoft servers do not have a Splunk Universal Forwarder installed? The combined search only fills in lastTime for hosts that have sent something to the _internal index, so a server without a forwarder has no lastTime:

| `SplunkServerCoverage` | where isnull(lastTime)

Microsoft servers with a Splunk universal forwarder that is not hooked into a deployment server?

| `SplunkServerCoverage` | where isnotnull(lastTime) AND isnull(status)

Microsoft servers that have an error on the deployment client?

|`SplunkServerCoverage` | where isnotnull(status) AND status!="ok"

Finally, servers with everything working, but no events in the last 15 minutes:

|`SplunkServerCoverage` | eval td=time()-lastTime | where td>900
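
To see how the four checks fit together, here is a small Python sketch that sorts one row of the combined result into a single bucket. It is an illustration, not a replacement for the searches: it treats a missing lastTime as the sign that no forwarder has reported (an assumption), it applies the checks in priority order so that each server gets exactly one label, and the classify function and its label names are invented for this example.

```python
import time


def classify(row, now=None, max_silence=900):
    """Return one coverage label for a row of the combined search result."""
    now = time.time() if now is None else now
    last_time = row.get("lastTime")
    status = row.get("status")
    if last_time is None:
        return "no_forwarder"            # nothing ever arrived in _internal
    if status is None:
        return "no_deployment_checkin"   # forwarder present, never polled
    if status != "ok":
        return "deployment_error"        # deployment client reported a problem
    if now - last_time > max_silence:
        return "silent"                  # same idea as td > 900 above
    return "covered"


# Hypothetical rows, evaluated against a fixed "now" of 2000 seconds.
rows = [
    {"cn": "FILE01"},
    {"cn": "DB01", "lastTime": 1900},
    {"cn": "APP01", "lastTime": 1900, "status": "failed"},
    {"cn": "OLD01", "lastTime": 1000, "status": "ok"},
    {"cn": "WEB01", "lastTime": 1900, "status": "ok"},
]
labels = {row["cn"]: classify(row, now=2000) for row in rows}
for cn, label in labels.items():
    print(cn, label)
```

The 900-second threshold matches the 15-minute window used in the last search; widen it if your forwarders report less often.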

By combining the power of Microsoft Active Directory and some simple Splunk search skills, you can manage your environment easily.
