
Extract and Alias Field Names in Splunk 4.0 Now

I’ve had this topic come up in several technical conversations lately, so I thought I would blog about it now.

Situation: You have two different source types containing common key field values, but the actual name of the field itself is different within each of the source types.

Question: How do you produce a report within Splunk that correlates all of these field values together under one normalized field name?

Answer: Use the new FIELDALIAS and EXTRACT features included with Splunk 4.0 to normalize the field name at search-time.

Example: Let’s suppose you have two different types of call detail records, each containing a number that represents the total duration in seconds that someone is on a phone call.

One CDR event looks like this:

TELCOE,2.1,7e197787-655330a9-7a458301-70845177@12.13.20.20,,0,,H,,S,,sip:7622550@127.10.15.17:5050, sip:5558889999@120.10.20.20:55555,TELCO:Dallas,TX,0,sip:7622555@110.130.52.25:5050,NORTH:NORTH,200,0,1,0,1,0,08/02/2009:05:03:21,08/02/2009:02:03:22,92,UNKNOWN,0,0

and the other CDR record looks like this:

TIME=20090802104826865|CHAN:332|SESSIONID:100102345|CALLDURATION:93|CALLINGNUM:5558431297|CALLEDNUM:5559903894|UNIQID:8948373827100002938847889873474893

Now, let’s take a look at the Splunk configuration files to index these source types and extract the call duration values out into fields.

inputs.conf
[monitor:///$SPLUNK_HOME/etc/apps/cdr/logs/CDR.txt]
sourcetype = cdr_log

[monitor:///$SPLUNK_HOME/etc/apps/cdr/logs/cdr2.txt]
sourcetype = cdr2_log

props.conf
[cdr_log]
EXTRACT-calldur = ^.*?:\d\d:\d\d:\d\d,(?<callDuration>\d+),\w+,\d+\.\d+\.\d+\.\d+,

[cdr2_log]
REPORT-cdr2 = cdr2-kvpairs

transforms.conf
[cdr2-kvpairs]
DELIMS = "|", ":"

Notice that the call duration extraction for our example cdr_log sourcetype above uses the new EXTRACT option in props.conf to explicitly pull out and name the field we want. All we do is specify a regular expression that pattern-matches the call duration number (in this case it’s “92” in our event) and names it “callDuration”.
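
To see what that regular expression is doing, here is a minimal Python sketch; it is only an illustration, not anything Splunk itself runs. Two assumptions to flag: Python spells named groups `(?P<name>...)` rather than `(?<name>...)`, and the sample event as printed above ends at `UNKNOWN,0,0`, so the trailing `\w+,\d+\.\d+\.\d+\.\d+,` part of the pattern has no IP address to match in it. Presumably the real records carry more fields than the sample shows. The sketch therefore also tries a shortened pattern (my own, purely for illustration) that stops right after the captured number.

```python
import re

# The sample cdr_log event from above, joined back onto a single line.
event = (
    "TELCOE,2.1,7e197787-655330a9-7a458301-70845177@12.13.20.20,,0,,H,,S,,"
    "sip:7622550@127.10.15.17:5050, sip:5558889999@120.10.20.20:55555,"
    "TELCO:Dallas,TX,0,sip:7622555@110.130.52.25:5050,NORTH:NORTH,200,0"
    ",1,0,1,0,08/02/2009:05:03:21,08/02/2009:02:03:22,92,UNKNOWN,0,0"
)

# The EXTRACT pattern from props.conf, with the named group in Python syntax.
full_pattern = re.compile(
    r"^.*?:\d\d:\d\d:\d\d,(?P<callDuration>\d+),\w+,\d+\.\d+\.\d+\.\d+,"
)

# Hypothetical shortened pattern: stops right after the captured number.
short_pattern = re.compile(r"^.*?:\d\d:\d\d:\d\d,(?P<callDuration>\d+),")

full_match = full_pattern.search(event)
short_match = short_pattern.search(event)
fields = short_match.groupdict() if short_match else {}

print(full_match)  # None: the printed sample has no IP address after UNKNOWN
print(fields)      # {'callDuration': '92'}
```

The lazy `^.*?` first stops at the earlier timestamp, fails because `08/` is not a number followed by a comma, and then moves on to the second timestamp, which is why “92” is the value that ends up captured.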

However, the extraction of the same type of field from cdr2_log uses the DELIMS option in transforms.conf and, therefore, the field name will be CALLDURATION in this case, which is in all caps. Doh!
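
If it helps to picture why the field comes out named CALLDURATION, here is a rough Python sketch of delimiter-based key/value splitting using the same two delimiters. The `split_kv` helper is hypothetical, and skipping tokens that have no `:` in them (such as `TIME=...`) is an assumption of this sketch, not a statement about how Splunk treats them.

```python
# The sample cdr2_log event from above, joined back onto a single line.
event = (
    "TIME=20090802104826865|CHAN:332|SESSIONID:100102345|CALLDURATION:93|"
    "CALLINGNUM:5558431297|CALLEDNUM:5559903894|"
    "UNIQID:8948373827100002938847889873474893"
)


def split_kv(raw, pair_delim="|", kv_delim=":"):
    """Split raw into pairs on pair_delim, then each pair into key and value."""
    fields = {}
    for token in raw.split(pair_delim):
        key, sep, value = token.partition(kv_delim)
        if sep:  # assumption: ignore tokens without a key/value delimiter
            fields[key] = value
    return fields


fields = split_kv(event)
print(fields["CALLDURATION"])  # 93
```

The key is taken verbatim from the event text, so whatever capitalization the log uses is the capitalization the field name gets.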

What we want, though, is a report that includes values from BOTH fields AND that also refers to them by some common normalized field name. Using the new FIELDALIAS option, we can accomplish this. All we need to do is simply add an extra option to our props.conf file, called FIELDALIAS, to alias the field name CALLDURATION as callDuration, like this:

props.conf
[cdr_log]
EXTRACT-calldur = ^.*?:\d\d:\d\d:\d\d,(?<callDuration>\d+),\w+,\d+\.\d+\.\d+\.\d+,

[cdr2_log]
REPORT-cdr2 = cdr2-kvpairs
FIELDALIAS-callduration = CALLDURATION AS callDuration
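
Conceptually, the alias just gives the existing field a second name at search time while leaving the original in place. A small Python sketch of that idea, with a hypothetical `apply_alias` helper and hand-built event dictionaries standing in for the extracted fields:

```python
def apply_alias(fields, source, alias):
    """Return a copy of fields with alias added as a second name for source."""
    aliased = dict(fields)
    if source in aliased:
        aliased[alias] = aliased[source]
    return aliased


# Fields as they would come out of the two extractions above.
cdr_event = {"callDuration": "92"}
cdr2_event = {"CALLDURATION": "93", "CHAN": "332"}

normalized = [
    cdr_event,
    apply_alias(cdr2_event, "CALLDURATION", "callDuration"),
]

# Both source types can now be read through the one common name.
durations = [e["callDuration"] for e in normalized]
print(durations)  # ['92', '93']
```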

Then we can perform the following search within the Splunk GUI, which refers to our normalized field name using the now common alias name, callDuration:

sourcetype=cdr*_log | top callDuration limit=10 | sort - callDuration

and produces our desired pie chart.

So there you have it. By using the EXTRACT and FIELDALIAS features available with Splunk 4.0, you can now normalize and correlate your field values across source types quickly, easily, and effectively with as little effort as possible.

…and any true Splunk user will tell you, that’s what Splunk 4.0 is all about.

BTW, if you haven’t experienced it yet, you should download Splunk and see for yourself why it’s the first thing that comes to mind when you need real-time, ad-hoc searching to determine root cause, thwart those pesky security threats, and gain visibility into all of your application, server, and network log files at once, from one central web interface.

----------------------------------------------------
Thanks!
Eric Gardner

