This documentation applies to the following versions of Splunk: 4.0, 4.0.1, 4.0.2, 4.0.3, 4.0.4, 4.0.5, 4.0.6, 4.0.7, 4.0.8, 4.0.9, 4.0.10
As you use Splunk, you will likely encounter situations that require new fields beyond the set that Splunk automatically finds for you at index time and search time. As a knowledge manager, you may be responsible for managing field extractions for the rest of your team. For example, some Splunk knowledge managers use field extractions as part of an event data normalization strategy, redefining existing fields and creating new ones to reduce redundancies and increase the overall usability of the fields available to other Splunk users on their team.
If you need to create fields in addition to the ones that Splunk automatically identifies, you have several options. Splunk Web provides a number of field extraction methods, but you can also add and manage extracted fields through Splunk's back end by editing configuration files.
This topic provides a relatively brief overview of Splunk Web field extraction methods, and goes into greater detail about managing field extractions through configuration files.
For a detailed discussion of search time field addition using methods based in Splunk Web, see "Extract and add new fields" in the User manual. We'll just summarize the methods in this subtopic.
You can create custom search fields dynamically using the interactive field extraction (IFX) feature of Splunk Web. IFX enables you to turn any search into one or more fields. You use IFX on the local indexer. For more information about using IFX, see "Extract fields interactively in Splunk Web" in the User Guide.
Note: IFX is especially useful if you are not familiar with regular expression syntax and usage, because it will generate field extraction regexes for you (and enable you to test them).
To access IFX, run a search and then select "Extract fields" from the dropdown that appears beneath timestamps in the field results. IFX enables you to extract only one field at a time (although you can edit the regex it generates later to extract multiple fields).
Splunk offers a variety of search commands that facilitate the extraction of fields at search time in different ways. Here is a list of these commands; for details of how they're used along with examples, see either the Search Reference or the "Extract and add new fields" topic in the User manual.
The extract (or kv, for "key/value") search command forces field/value extraction on the result set. If you use extract without specifying any arguments, Splunk extracts fields using field extraction stanzas that have been added to props.conf. You can use extract to test any field extractions that you add manually through conf files.
$SPLUNK_HOME/etc/system/form/, or your own custom application directory in $SPLUNK_HOME/etc/apps/. For example, if form=sales_order, Splunk would look for a sales_order.form, and Splunk would match all processed events against that form, trying to extract values.
Splunk only accepts field names that contain alphanumeric characters or an underscore.
Splunk applies the following rules to all extracted fields, whether they are extracted at index-time or search-time, by default or through a custom configuration:
1. All characters that are not in a-z, A-Z, and 0-9 ranges are replaced with an underscore (_).
2. All leading underscores are removed. Leading 0-9 characters can cause errors.
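The two cleaning rules above can be sketched in a few lines of Python. This is only an illustration of the rules as stated here, not Splunk's own implementation; the function name clean_field_name and the sample names are hypothetical.

```python
import re

def clean_field_name(raw: str) -> str:
    """Apply the two cleaning rules described above to a candidate field name."""
    # Rule 1: replace every character outside a-z, A-Z, 0-9 with an underscore.
    cleaned = re.sub(r"[^a-zA-Z0-9]", "_", raw)
    # Rule 2: remove all leading underscores.
    return cleaned.lstrip("_")

if __name__ == "__main__":
    # Hypothetical candidate names, chosen only to exercise both rules.
    for name in ["user-id", "_internal.flag", "http status"]:
        print(name, "->", clean_field_name(name))
```

The sketch does not guard against names that begin with 0-9, which the rules above flag as a possible source of errors.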
Many knowledge managers find it easier to manage their custom fields through configuration files, which can be used to add, maintain, and review libraries of custom field additions for their teams.
You add your search-time field extractions to props.conf, which you edit in $SPLUNK_HOME/etc/system/local/, or your own custom application directory in $SPLUNK_HOME/etc/apps/. (We recommend using the latter directory if you want to make it easy to transfer your data customizations to other search servers.)
Note: Do not edit files in $SPLUNK_HOME/etc/system/default/.
For more information on configuration files in general, see "About configuration files" in the Admin manual.
As you may already know, Splunk uses regular expressions, or regexes, to extract fields from event data. When you use IFX, Splunk attempts to generate regexes for you, but you can only perform one field extraction at a time. On the other hand, when you set up field extractions manually through configuration files, you have to provide the regex yourself--but you can set up regexes that extract two or more fields at once if necessary.
Important: The capturing groups in your regex must identify field names that contain only alphanumeric characters or an underscore.
1. All extraction configurations in props.conf are restricted by a specific source, sourcetype, or host. Start by identifying the sourcetype, source, or host that provides the events from which you want to extract your field.
Note: For more information about overriding hosts and sourcetypes, see the "Work with hosts" and "Work with source types" chapters of this manual.
2. Determine a pattern to identify the field in the event.
3. Write a regular expression to extract the field from the event. For a primer on regular expression syntax and usage, see Regular-Expressions.info. You can test your regex by using it in a search with the rex search command. Splunk also maintains a list of useful third-party tools for writing and testing regular expressions.
4. Add your regex to props.conf, and link it to the source, source type, or host that you identified in the first step.
5. If your field value is a portion of a word, you must also add an entry to fields.conf. See the example "create a field from a subtoken" below.
Edit the props.conf file in $SPLUNK_HOME/etc/system/local/, or your own custom application directory in $SPLUNK_HOME/etc/apps/.
Note: Do not edit files in $SPLUNK_HOME/etc/system/default/.
6. Restart Splunk for your changes to take effect.
Follow this format when adding a field extraction stanza to props.conf:
[<spec>]
EXTRACT-<class> = <your_regex>
<spec> can be:
<sourcetype>, the source type of an event.
host::<host>, where <host> is the host for an event.
source::<source>, where <source> is the source for an event.
<class> is the extraction class. Precedence rules for classes:
If a particular class is specified for a source and a sourcetype, the class for source wins out.
If a particular class is specified in ../local/ for a <spec>, it overrides that class in ../default/.
<your_regex> is a regex that recognizes your custom field value. The regex must have named capturing groups; each group represents a different extracted field.
Note: Unlike the procedure for adding to the set of indexed fields that Splunk extracts at index time, transforms.conf requires no DEST_KEY since nothing is being written to the index during search-time field extraction. Fields extracted at search time are not persisted in the index as keys.
Note: For search-time field extraction, props.conf uses EXTRACT-<class>, as opposed to TRANSFORMS-<value>, which is used for configuring index-time field extraction.
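Because the format above requires named capturing groups, it can help to check which field names a regex would produce before adding it to props.conf. The following is a minimal sketch in Python, assuming the regex uses only syntax that Python's re module also understands. Python spells named groups (?P<name>...) rather than (?<name>...), so the sketch rewrites them first. The helper names to_python_regex and extracted_field_names are hypothetical.

```python
import re

def to_python_regex(splunk_regex: str) -> str:
    """Rewrite PCRE-style (?<name>...) groups as Python's (?P<name>...)."""
    # Only rewrite when "<" is followed by a group name, so lookbehinds such
    # as (?<=...) and (?<!...) are left untouched.
    return re.sub(r"\(\?<([A-Za-z_]\w*)>", r"(?P<\1>", splunk_regex)

def extracted_field_names(splunk_regex: str) -> list:
    """Return the named groups in the order they open; empty if none exist."""
    groups = re.compile(to_python_regex(splunk_regex)).groupindex
    return sorted(groups, key=groups.get)

if __name__ == "__main__":
    # Hypothetical regex, used only to show the check.
    print(extracted_field_names(r"user=(?<user>\w+)\s+code=(?<code>\d+)"))
```

An empty result means the regex has no named groups and so would not define any fields under the format described above.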
Here is a set of examples of manual field extraction, set up through configuration files.
This example shows how to create a new "error code" field. The field can be identified by the occurrence of device_id= followed by a word within brackets and a text string terminating with a colon. The field should be extracted from events related to the testlog sourcetype.
In props.conf, add:
[testlog]
EXTRACT-errors = device_id=\[\w+\](?<err_code>[^:]+)
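To see what this extraction would capture, you can try the regex outside Splunk. The sketch below uses Python, which writes named groups as (?P<name>...) instead of (?<name>...), and it reads the bracketed "word" as \w+. The sample event is hypothetical and invented for illustration only.

```python
import re

# Python spells named groups (?P<name>...); props.conf uses (?<name>...).
ERR_CODE_RE = re.compile(r"device_id=\[\w+\](?P<err_code>[^:]+)")

# Hypothetical event, invented for illustration only.
sample_event = "2010-03-01 12:00:00 device_id=[w1001]E4042: connection reset"

match = ERR_CODE_RE.search(sample_event)
# groupdict() maps each named group to the text it captured.
fields = match.groupdict() if match else {}
print(fields)
```

For this invented event, the err_code field holds the text between the closing bracket and the next colon.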
This is an example of a field extraction that pulls out five separate fields. You can then use these fields in concert with some event types to help you find port flapping events and report on them.
Here's a sample of the event data that the fields are being extracted from:
#%LINEPROTO-5-UPDOWN: Line protocol on Interface GigabitEthernet9/16, changed state to down
The stanza in props.conf for the extraction looks like this:
[syslog]
EXTRACT-port_flapping = Interface\s(?<interface>(?<media>[^\d]+)(?<slot>\d+)\/(?<port>\d+))\,\schanged\sstate\sto\s(?<port_status>up|down)
Note that five separate fields are extracted as named groups: interface, media, slot, port, and port_status.
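As a quick check, the same pattern can be run against the sample event outside Splunk. This Python sketch rewrites the named groups in Python's (?P<name>...) form and uses a single \s between "changed" and "state", which is what the sample event requires. It illustrates the regex only and is not how Splunk applies it.

```python
import re

# Same pattern as the stanza, split across two string literals for readability.
PORT_FLAPPING_RE = re.compile(
    r"Interface\s(?P<interface>(?P<media>[^\d]+)(?P<slot>\d+)/(?P<port>\d+))"
    r",\schanged\sstate\sto\s(?P<port_status>up|down)"
)

# The sample event shown above.
sample_event = (
    "#%LINEPROTO-5-UPDOWN: Line protocol on Interface GigabitEthernet9/16, "
    "changed state to down"
)

match = PORT_FLAPPING_RE.search(sample_event)
fields = match.groupdict() if match else {}
for name, value in fields.items():
    print(f"{name} = {value}")
```

Note that interface is an outer group that wraps media, slot, and port, so its value is the concatenation of the three plus the slash.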
The following two steps aren't required for field extraction--they show you what you might do with the extracted fields to find port flapping events and then report on them.
Use tags to define a couple of event types in eventtypes.conf:
[cisco_ios_port_down]
search = "changed state to down"
tags = cisco ios port check status report success down

[cisco_ios_port_up]
search = "changed state to up"
tags = cisco ios port check status report success up
Finally, create a saved search in savedsearches.conf that ties much of the above together to find port flapping and report on the results:
[port flapping]
search = eventtype=cisco_ios_port_down OR eventtype=cisco_ios_port_up starthoursago=3 | stats count by interface,host,port_status | sort -count
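The reporting half of that search (stats count by interface,host,port_status followed by sort -count) amounts to counting events per combination of field values and listing the largest counts first. The Python sketch below mimics that idea on a handful of hypothetical, already-extracted events. The host names and values are invented, and the code does not reproduce Splunk's search pipeline.

```python
from collections import Counter

# Hypothetical events with fields already extracted (all values invented).
events = [
    {"host": "sw1", "interface": "GigabitEthernet9/16", "port_status": "down"},
    {"host": "sw1", "interface": "GigabitEthernet9/16", "port_status": "up"},
    {"host": "sw1", "interface": "GigabitEthernet9/16", "port_status": "down"},
    {"host": "sw2", "interface": "GigabitEthernet1/2", "port_status": "down"},
]

def count_by(events, keys):
    """Count events per distinct combination of the given field values."""
    return Counter(tuple(event[key] for key in keys) for event in events)

counts = count_by(events, ("interface", "host", "port_status"))
# most_common() lists the highest counts first, similar in spirit to "sort -count".
report = counts.most_common()
for combination, count in report:
    print(count, combination)
```

An interface that appears near the top with both up and down rows is a candidate for port flapping.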
If your field value is a smaller part of a token, you must add an entry to fields.conf. For example, your field's value might be "123" while it occurs as "foo123" in your event.
Configure props.conf as explained above. Then, add an entry to fields.conf:
[<fieldname>]
INDEXED = False
INDEXED_VALUE = False
Replace <fieldname> with the name of your field; for example, [url] if you've configured a field named "url".
Set INDEXED and INDEXED_VALUE to false.
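The subtoken case can be pictured with a rough sketch: the extracted value exists only inside a larger token, so it never appears as a token of its own. The Python below uses a deliberately naive tokenizer (splitting on non-word characters) as a stand-in. Splunk's actual segmentation rules are not reproduced here, and the event and the function naive_tokens are hypothetical.

```python
import re

def naive_tokens(event: str) -> set:
    """Split on non-word characters; a simplified stand-in for index tokens."""
    return {token for token in re.split(r"\W+", event) if token}

# Hypothetical event in which the value "123" occurs only inside "foo123".
event = "status=ok id=foo123"

# Extract the numeric part that follows "foo" (Python named-group syntax).
value = re.search(r"foo(?P<num>\d+)", event).group("num")

print(value)                         # the extracted field value
print(value in naive_tokens(event))  # whether that value is a whole token
```

In this simplified picture the whole token is "foo123", so a lookup for the extracted value alone finds nothing among the tokens.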
You can disable search-time field extraction for specific sources, sourcetypes, or hosts through edits in props.conf. Add KV_MODE = none for the appropriate [<spec>] in props.conf.
[<spec>]
KV_MODE = none
<spec> can be:
<sourcetype>, the source type of an event.
host::<host>, where <host> is the host for an event.
source::<source>, where <source> is the source for an event.