Searching Smarter and Faster with Splunk 4

Hi Splunkers, Dave here from the Search and Index team at Splunk. Coming from an engineering perspective, I’m excited about Splunk 4 because it represents a monumental improvement in search power. Not only is search about ten times faster than the previous release, but we have added several new features that empower users to search smarter and faster. This blog post is going to highlight just a few of these new features.

Asynchronous Search

Let’s start with a basic search for “Not Found” errors in web access logs via the UI:

status=404

The first thing you’ll notice is that you get events right away, with the timeline marching back as you get more results. Search is now asynchronous, meaning there’s no more waiting for a search to complete to get results. You’ll be able to find answers and start troubleshooting faster than ever.

Lazy Key-Value Extraction

The results from the above search will keep all the information about fields we extracted from the data, such as status code, number of bytes, and the referrer’s URI. But let’s say that the only thing you care about in the results is the client’s IP. Then you can use the fields command to make your search even faster:

status=404 | fields clientip

Using this technique, you’re telling the search that you only need the field clientip, which now limits what Splunk extracts from the data. This saves a lot of processing time, and on my laptop the search ran almost TWICE as fast. This is an extremely powerful technique to use when you know what fields you care about in the results.
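To make the idea concrete, here is a rough Python sketch of eager versus lazy key-value extraction. It is only an illustration of the concept and not how Splunk implements it: the key=value event layout, the sample values, and the helper names extract_all and extract_only are all assumptions made up for this example.

```python
import re

# Hypothetical key=value event; real web access logs are laid out differently.
EVENT = "clientip=66.249.71.10 status=404 bytes=512 referer=http://example.com/"

PAIR = re.compile(r"(\w+)=(\S+)")

def extract_all(event: str) -> dict:
    """Eagerly extract every key=value pair in the event."""
    return dict(PAIR.findall(event))

def extract_only(event: str, wanted: set) -> dict:
    """Extract only the requested fields, stopping once all are found."""
    found = {}
    for match in PAIR.finditer(event):
        key, value = match.groups()
        if key in wanted:
            found[key] = value
            if len(found) == len(wanted):
                break  # nothing left to look for, so skip the rest of the event
    return found

print(extract_all(EVENT))                 # every field in the event
print(extract_only(EVENT, {"clientip"}))  # only the field we asked for
```

The lazy version does less work per event because it can stop scanning as soon as it has the fields it was asked for, which is the same intuition behind piping to fields clientip.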

CIDR Subnet Matching

So now that we’ve found which client IPs have received 404 errors, let’s limit those IPs to a particular subnet. If you’re not familiar with CIDR (Classless Inter-Domain Routing), its subnet notation is simply a way of describing which IP addresses fall into a particular network: the number after the slash says how many leading bits of the address are fixed.

In Splunk 3.x, if you wanted to specify a subnet of 64.0.0.0/6, you would need a search like this:

status=404 (clientip=64.*.*.* OR clientip=65.*.*.* OR clientip=66.*.*.* OR clientip=67.*.*.*)

Not only is this search ugly, but it’s slower than it needs to be. Keep in mind this is just a simple example where the first octet of the IP address has only 4 possible values. In the worst case, a subnet as wide as a /1, you would need 128 different clientip comparisons!

For Splunk 4 we added automatic CIDR subnet detection when comparing a field, which is cleaner and faster. The above search simply becomes:

status=404 clientip=64.0.0.0/6
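If you want to check for yourself which addresses 64.0.0.0/6 covers, here is a small Python sketch using the standard ipaddress module. It only illustrates the CIDR arithmetic and says nothing about how Splunk performs the match internally; the helper name in_subnet and the sample addresses are made up for this example.

```python
import ipaddress

# The subnet from the search above.
subnet = ipaddress.ip_network("64.0.0.0/6")

def in_subnet(ip: str) -> bool:
    """Return True if the dotted-quad address falls inside the subnet."""
    return ipaddress.ip_address(ip) in subnet

# A /6 prefix fixes the top 6 bits and leaves 2 free bits in the first octet,
# so exactly 2**2 = 4 first-octet values can match.
first_octets = sorted(
    int(str(net.network_address).split(".")[0])
    for net in subnet.subnets(new_prefix=8)
)

print(subnet.network_address, subnet.broadcast_address)  # 64.0.0.0 67.255.255.255
print(first_octets)                                      # [64, 65, 66, 67]
print(in_subnet("66.249.71.10"), in_subnet("68.0.0.1"))  # True False
```

The four first octets it prints are exactly the four wildcard clauses the 3.x search had to spell out by hand.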

Multi-Index Search

In previous versions, users were limited to searching one index at a time. For Splunk 4 we overhauled the search system to allow searching over any number of indexes at the same time.

Here’s an example that searches for errors over all accessible public indexes:

error OR fatal index=*

It’s that simple. Admins who have access to internal indexes can retrieve all events, public and internal, with the following search:

index=* OR index=_*

Perhaps the most useful application of multi-index search is the ability to partition different searches to different indexes. For example, if you wanted to search for 404 errors in your web index and undeliverable messages in your mail index, you could use the following search:

(index=web status=404) OR (index=mail undeliverable)

These partitioned searches are fast. They are highly optimized, and each index is searched only for the terms that apply to it.

Furthermore, index search options are easily customized under “Roles” in the Manager page. Roles control which indexes users are allowed to access, as well as which indexes they search by default. For example, as an admin I sometimes find it useful to set my defaults to all internal and public indexes, so that my searches cover everything without an explicit index clause.

These are just a few of the powerful new features in Splunk 4. I encourage you all to try them out to help you find results faster and more easily.

Happy Splunking!

Dave Marquardt