Splunk users are familiar with real-time indexing, real-time search, and, as of release 4.2, real-time alerts. I’d like to take these real-time monitoring concepts one step further and provide the proactive status of an entity (e.g., IP address, URL, hostname) that is already in your index while a search is being processed. I call this real-time status. For instance, suppose you already have URLs indexed via Apache or IIS log files. Among the many things these events give you is the HTTP status code for the URL in each event. That only tells you the status code at the time of indexing. What used to be not found (code 404) could be OK (code 200) right now. Even worse, a URL that was OK in the last log event may produce a server error (code 500) when the next user hits it. Rather than wait for users to hit your URL so that it generates a log event that can be indexed into Splunk, analyzed, and alerted upon, it would be useful to simply query for the current status in real time.
With that in mind, I experimented by creating an httpstatus command that you can download from Splunkbase. It takes a url field as input from your existing events, queries the URL, and returns a status code as you run the search. If you do not have a field called url but do have URLs in your events, you can use the eval command to create a new field or the rename command to rename your field to url. Here’s sample usage:
index=sample sourcetype="sampleurl" myurl!=""|eval url=myurl|httpstatus|table url, httpstatus
Notice that it returns a new field called httpstatus, which gives you the real-time status of the URL in question. Since each lookup runs a query over the Internet, the search can take a long time; you may want to limit it to a few hundred URLs at a time, or at least dedup the url field.
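To make the idea concrete, here is a minimal Python sketch of what a status check with de-duplication could look like. It is not the code shipped with the httpstatus command; the function names fetch_status and add_httpstatus, the 5-second time-out, and the use of a HEAD request are my own assumptions for illustration.

```python
import http.client
import urllib.error
import urllib.request


def fetch_status(url, timeout=5.0):
    """Return the HTTP status code for url, or None if it is unreachable.

    Hypothetical helper: uses a HEAD request, which some servers answer
    differently from GET.
    """
    try:
        request = urllib.request.Request(url, method="HEAD")
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # 4xx/5xx responses are raised as exceptions but still carry a code.
        return err.code
    except (urllib.error.URLError, http.client.HTTPException, OSError, ValueError):
        # DNS failure, refused connection, time-out, or malformed URL.
        return None


def add_httpstatus(events, fetch=fetch_status):
    """Fill in an 'httpstatus' field, querying each distinct URL only once."""
    cache = {}
    for event in events:
        url = event.get("url")
        if not url:
            continue  # nothing to check for this event
        if url not in cache:
            cache[url] = fetch(url)
        event["httpstatus"] = cache[url]
    return events
```

Caching by URL has the same effect as running dedup on the url field first: a URL that appears in a thousand events is still only requested once.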
We won’t stop here. Just because you received an OK status for a URL does not mean the web site is functioning properly. With that in mind, I created another command on Splunkbase called httpget, which returns the first 1000 bytes of the URL in a new field called httpget, so that you can compare it with what you expect to get. In the example below, the data that the first URL actually returns in real time suggests that it has issues.
I ended up creating a series of these real-time status commands, all of which can be downloaded from Splunkbase. Each comes with a sample log for testing (which happens to be indexed into an index called sample), README instructions, sample usage, and an explanation of time-outs when the entity is unreachable. The commands are:
- httpstatus – returns the HTTP status code
- httpget – returns first 1000 bytes of a URL
- pingstatus – returns whether a machine can be pinged
- telnetstatus – returns a status if a telnet server with login is present
- fingerstatus – returns the result of a finger if a finger server is running
- ftpstatus – returns whether an FTP server is running
- traceroute – returns a simple traceroute from the Splunk indexer to the address
- ip2decimal (OK, it is not a status command, but it is here for completeness)
Note that the pingstatus and traceroute commands require that Splunk be running as Administrator or root, or be started with sudo. The ip2decimal command does not use the Internet and does use a cache. Its main purpose is to enable decimal-style comparisons and sorting of IP addresses.
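To show why a decimal form helps with sorting, here is a small Python sketch of the conversion. It is only an illustration built on the standard ipaddress module, assuming IPv4 addresses; the function name mirrors the command, but this is not the command's actual code.

```python
import ipaddress


def ip2decimal(ip):
    """Convert a dotted-quad IPv4 address to its 32-bit integer value."""
    return int(ipaddress.IPv4Address(ip))


addresses = ["10.0.0.10", "10.0.0.9", "192.168.1.1"]

# A plain string sort puts "10.0.0.10" before "10.0.0.9";
# sorting on the decimal value restores numeric order.
numeric_order = sorted(addresses, key=ip2decimal)
```

Here ip2decimal("192.168.1.1") is 3232235777, and numeric_order places 10.0.0.9 ahead of 10.0.0.10.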
Lookup vs. Command
Since each command returns its results in a new field per event, it would seem reasonable to implement each one as a dynamic Splunk lookup. I deliberately chose commands instead, because a Splunk command lets the user modify it and add side effects to the results, such as changing the view of the raw data upon completion. You are welcome to change each command’s included Python code, including its output, to customize it for your environment.
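For comparison, here is a rough Python sketch of how the lookup route might look. It assumes the usual external-lookup convention of receiving CSV rows on standard input and writing the completed rows to standard output. The run_lookup function, the resolve callback, and the url and httpstatus column names are hypothetical, so check the Splunk documentation before relying on any of it.

```python
import csv
import io


def run_lookup(infile, outfile, resolve):
    """Read CSV rows, fill in empty 'httpstatus' cells, write CSV rows back.

    resolve is any callable that maps a URL to a status value (assumption:
    in a real script it would perform the live HTTP check).
    """
    reader = csv.DictReader(infile)
    fieldnames = list(reader.fieldnames or [])
    if "httpstatus" not in fieldnames:
        fieldnames.append("httpstatus")

    writer = csv.DictWriter(outfile, fieldnames=fieldnames)
    writer.writeheader()
    for row in reader:
        # Only resolve rows that have a URL and no status yet.
        if row.get("url") and not row.get("httpstatus"):
            row["httpstatus"] = resolve(row["url"])
        writer.writerow(row)


# Example run against in-memory data with a stand-in resolver.
source = io.StringIO("url,httpstatus\r\nhttp://example.test/a,\r\n")
target = io.StringIO()
run_lookup(source, target, resolve=lambda url: 200)
```

A script like this can only fill in columns. That is the trade-off described above: a command also gets to reshape the results themselves.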
Hopefully, this article offers some insight into one way Splunk can provide real-time status for entities that have already been indexed in your events.