
Quick N’ Dirty: Delimited Data, Sourcetypes, and You

Sometimes you have data.  It’s great data, it’s consistent data, and it would just be a heck of a lot more useful if Splunk knew each and every field.

You could always do it old school and use Splunk’s built-in Interactive Field Extractor (also known as IFX).  Upside: it’s easy.  Downside: you’ll need to extract each field individually.  And if your data has, like, twenty columns, that’s a lot of extracting.  There’s a faster way.

If your data is delimited, there’s an easier way to teach Splunk to understand it. As long as your data is consistently delimited…say with a space, comma, or tab…you can teach Splunk how to separate the data and how to label each field.

For example, consider the following data:

Sondra Russell,srussell@splunk.com,Sales Engineer
Blondra Blussell,brussell@splunk.com,Senior Sales Engineer

This data is comma delimited and the fields are: name, email, role.   So, here’s what you do:

  1. Define the delimiter/fields combo in transforms.conf. Open (or, if it doesn’t exist, create) the file $SPLUNK_HOME/etc/system/local/transforms.conf. This is the file where fields and transformations are defined. Now, tell transforms.conf what your delimiter is and what the fields are, listed in the order they appear in each line.  See the transforms.conf snippet below for what that looks like for my sample data.  Now that you’ve defined this particular delimiter/fields combo, it’s time to link it to a particular sourcetype…
  2. Link the delimiter/fields combo to your sourcetype in props.conf. Open (or, if it doesn’t exist, create) the file $SPLUNK_HOME/etc/system/local/props.conf. This is the file that, among other things, defines sourcetypes.  Have you already indexed the data you want to define?  Find its sourcetype stanza; the sourcetype name is the word between the brackets. Are you starting from scratch?  Create a new sourcetype and name it anything you want.  Now, link your sourcetype with the delimiter/fields combo you defined in transforms.conf with the simple line “REPORT-getfields = addressbook_fields”.  In this case “getfields” could be anything you like (it’s just creating a new namespace), and “addressbook_fields” has to be the name of the stanza you created in transforms.conf.
  3. Do a couple of housekeeping things.  If you’re creating a new sourcetype, you may want to add a couple of other lines in props.conf.  “SHOULD_LINEMERGE = False” will force Splunk to read each new line of your raw data as a new event, and “pulldown_type = 1” will put your new sourcetype in the list of available sourcetypes on the “Add Data” form.
  4. Reload props.conf and transforms.conf. In order for this to kick in, you will need to do a sort of soft reboot.  If you’re only changing props.conf and transforms.conf (as you are in this case), just open a new window in Splunk and run the following search:
     | extract reload=T
  5. Index your data and give yourself a high five. Now that you’ve taught Splunk the delimiter/fields combo for your data, you’re ready to enjoy the sweet field/value fruits of your labor.  If you’ve already indexed your data and have merely modified your sourcetype, you’re done!  Do a search and look for the new fields in the left nav.  If you haven’t already indexed your data, be sure to choose your new sourcetype in the pulldown on the “Add Data” form (hint: if you don’t see that option, check “More Settings”).

transforms.conf

[addressbook_fields]
DELIMS = ","
FIELDS = "name","email","role"
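
If it helps to picture what that stanza asks Splunk to do, here is a rough Python sketch of the idea: split each line on the delimiter, then label the pieces by position. This only mimics the effect of DELIMS and FIELDS on the sample data; it is not how Splunk implements the extraction, and the function name extract_fields is made up for illustration.

```python
# Conceptual sketch only: mimics the effect of DELIMS/FIELDS on the sample
# data. It is NOT Splunk's implementation; extract_fields is a hypothetical name.
DELIMS = ","
FIELDS = ["name", "email", "role"]


def extract_fields(raw_event, delim=DELIMS, field_names=FIELDS):
    """Split one raw event on the delimiter and label the values by position."""
    values = raw_event.split(delim)
    # zip pairs the first value with the first field name, and so on;
    # any leftover values or leftover names are simply dropped.
    return dict(zip(field_names, values))


sample = [
    "Sondra Russell,srussell@splunk.com,Sales Engineer",
    "Blondra Blussell,brussell@splunk.com,Senior Sales Engineer",
]

events = [extract_fields(line) for line in sample]
for event in events:
    print(event)
```

The takeaway is that the order of the names in FIELDS has to match the order of the columns in your data.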

props.conf

[addressbook]
SHOULD_LINEMERGE = False
pulldown_type = 1
REPORT-getfields = addressbook_fields
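
The easiest mistake in this setup is a typo in the stanza name, so that the REPORT line in props.conf points at nothing in transforms.conf. The Python sketch below checks for that using the standard library's configparser. It assumes the two files are simple enough to parse as INI text (real Splunk configs can be layered across several directories, which this ignores), and the helper names load and missing_report_targets are hypothetical.

```python
import configparser

# The two snippets from this post, inlined as strings for the sketch.
# In practice you would read the files from $SPLUNK_HOME/etc/system/local/.
TRANSFORMS = """
[addressbook_fields]
DELIMS = ","
FIELDS = "name","email","role"
"""

PROPS = """
[addressbook]
SHOULD_LINEMERGE = False
pulldown_type = 1
REPORT-getfields = addressbook_fields
"""


def load(text):
    """Parse .conf text as INI (an assumption; not Splunk's own parser)."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # keep key case, e.g. "REPORT-getfields"
    parser.read_string(text)
    return parser


def missing_report_targets(props, transforms):
    """Return (sourcetype, key, name) for REPORT- lines naming a missing stanza."""
    missing = []
    for sourcetype in props.sections():
        for key, value in props[sourcetype].items():
            if not key.startswith("REPORT-"):
                continue
            # A REPORT line may list several transforms separated by commas.
            for name in value.split(","):
                name = name.strip()
                if name and not transforms.has_section(name):
                    missing.append((sourcetype, key, name))
    return missing


print(missing_report_targets(load(PROPS), load(TRANSFORMS)))
```

An empty list means every REPORT line found a matching stanza.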

For more documentation on this process:

  • Transforms.conf.  Detailed instructions on modifying transforms.conf.
  • Props.conf. Detailed instructions on modifying props.conf.

Posted by Sondra Russell