Documentation: 3.4.1
This page last updated: 11/26/08 01:11pm

Extract

Use extracting commands to extract data from raw events so you can analyze it and build reports in a meaningful way.

addinfo

Summary indexing uses the addinfo command to add fields containing general information about the current search to events going into a summary index. You can also use | addinfo in any search to add general information (about the current search) to the search results. This is useful if you want to build and test searches and reports on search results before using summary indexing.

Currently, addinfo adds the following fields to each result:

  • info_min_time: The earliest time bound of the search.
  • info_max_time: The latest time bound of the search.
  • info_search_id: The query_ID of the search that generated the event.
  • info_search_time: The search execution time.

Note: The fields that addinfo adds are defined in savedsearches.conf. Currently, you can't customize the fields addinfo adds.
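As a rough illustration of what the four fields listed above look like on a result, the following Python sketch attaches them to each result in a list. It is not Splunk's implementation: the function name add_info, its parameters, and the sample values are all hypothetical, and only the four field names come from this page.

```python
import time


def add_info(results, earliest, latest, search_id, search_time=None):
    """Return copies of the results with addinfo-style fields attached.

    Hypothetical helper: only the four info_* field names are taken
    from the documentation; everything else is an assumption.
    """
    if search_time is None:
        search_time = time.time()
    info = {
        "info_min_time": earliest,
        "info_max_time": latest,
        "info_search_id": search_id,
        "info_search_time": search_time,
    }
    # Every result receives the same four values, because they describe
    # the search as a whole rather than any single event.
    return [{**result, **info} for result in results]


# Made-up results and search bounds, for illustration only.
results = [{"clientip": "192.0.2.10"}, {"clientip": "192.0.2.11"}]
annotated = add_info(results, earliest=1227657600, latest=1227744000,
                     search_id="example_search_1", search_time=1227744060)
for row in annotated:
    print(row)
```

Because every row carries the same info_search_id, results from different searches can later be told apart by sorting or grouping on that field.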

Syntax

addinfo

Arguments

None.

Examples

Splunk Web:
This example searches Web server data and builds a report based on client IPs. It then adds fields containing general search information to the search results and returns a list sorted by the number of unique IP addresses and by the search that each event came from (query_ID).

eventtype=banner_access NOT eventtypetag=bot | stats distinct_count(clientip) as uniqueIPs, max(_time), min(_time) | eval site="update_banners" | addinfo | sort uniqueIPs, info_search_id

This example searches Web server data for raw downloads and adds global data to the search results.

"eventtypetag=download" NOT eventtypetag=bot NOT eventtypetag=internal | addinfo

extract (kv)

This data-processing command extracts key/value pairs from search results. It takes the key/value pairs that are present in the search string and inserts them as reportable fields into the event. Use extract to extract data from your search results using transform stanza names you've created in transforms.conf.

Note: Use extract to test new regular expression rules you add in transforms.conf.

Syntax

extract [extract-options] transform_stanza_names

Note: You can use kv in place of extract.

Arguments

extract-options: auto | reload | limit | maxchars | kvdelim | pairdelim
Options to tune how your key/value extraction performs.

  • auto: auto=T | F (default: T). If set, specifies automatic '='-based extraction.
  • reload: reload=T | F (default: F). If set, forces the reloading of props.conf and transforms.conf.
  • limit: limit=integer (default: 50). Specifies the number of key/value pairs to extract.
  • maxchars: maxchars=integer (default: 10240). Specifies the maximum number of characters to look into a single event.
  • kvdelim: kvdelim=string. A comma-separated list of character delimiters that will be used to separate keys from values.
  • pairdelim: pairdelim=string. A comma-separated list of character delimiters that will be used to separate key/value pairs from one another.

transform_stanza_names: name of stanza(s)
A stanza in transforms.conf. Specify a transform that's configured in props.conf.

Examples

Splunk Web:

Search all events, and extract key/value pairs while reloading settings from disk.

* | extract reload=true

Search the local host for all events and extract key/value pairs in which pairs are separated from one another by "|" or ";" and keys are separated from values by "=" or ":". Return a report of the top occurring values of field1.

host=localhost | kv pairdelim="|;", kvdelim="=:", auto=f | top field1

CLI:

Search all indexed data and reload the extracted field settings to apply configuration changes in configuration files.

./splunk search "* | extract reload=true"
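To make the roles of pairdelim and kvdelim more concrete, here is a minimal Python sketch of delimiter-based key/value extraction. It is not how Splunk implements extract. It follows the Splunk Web example above in treating each character of the delimiter string as a separate delimiter, and the function name extract_pairs and the sample event are made up for illustration.

```python
import re


def extract_pairs(text, pairdelim="|;", kvdelim="=:", limit=50):
    """Split text into key/value pairs using single-character delimiters.

    Assumption: each character in pairdelim / kvdelim is one delimiter.
    """
    fields = {}
    # Any character in pairdelim ends one pair and starts the next.
    for chunk in re.split("[" + re.escape(pairdelim) + "]", text):
        # The first kvdelim character in the chunk separates key from value.
        parts = re.split("[" + re.escape(kvdelim) + "]", chunk, maxsplit=1)
        if len(parts) != 2:
            continue  # no key/value delimiter in this chunk
        key, value = parts[0].strip(), parts[1].strip()
        if key:
            fields[key] = value
        if len(fields) >= limit:
            break  # mirrors the idea of the limit option
    return fields


# Made-up event text that mixes both kinds of delimiter.
event = "user=alice|status:200;bytes=512"
print(extract_pairs(event))
```

Splitting on the pair delimiters first and the key/value delimiters second keeps a value such as a timestamp intact when it contains a ":" of its own, because only the first key/value delimiter in each chunk is used.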

iplocation

This data-processing command searches for IP addresses in the raw event data. It then looks up the physical location of each IP address in the "hostip.info" database and outputs the IP addresses with the associated city and country from that database.

Syntax

iplocation [max-inputs]

Arguments

  • max-inputs: maxinputs=integer. Sets the maximum number of events that iplocation will process.

Examples

Splunk Web:

This example searches for 404 errors on the host webserver1, takes the first 20 results, and, for each result that contains an IP address, outputs the address with its location data.

404 host=webserver1 | head 20 | iplocation

NOTE: You need internet access for the lookup to occur.
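The two steps described for iplocation, finding an IP address in the raw text and then looking up its location, can be sketched in Python as follows. This is only an illustration: the real command queries the hostip.info database, whereas this sketch uses a small in-memory dictionary with placeholder values. The names ip_location and LOCATIONS are hypothetical, and the behaviour for events beyond maxinputs (passed through untouched) is an assumption.

```python
import re

# Simple IPv4 pattern; it does not validate that each octet is <= 255.
IPV4 = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")

# Hypothetical stand-in for the remote location database (placeholder data).
LOCATIONS = {
    "192.0.2.44": {"city": "Example City", "country": "Example Country"},
}


def ip_location(events, lookup, maxinputs=None):
    """Attach location fields to events whose raw text contains a known IP."""
    out = []
    for n, raw in enumerate(events):
        event = {"_raw": raw}
        # Assumption: events past maxinputs are returned without a lookup.
        if maxinputs is None or n < maxinputs:
            for ip in IPV4.findall(raw):
                if ip in lookup:
                    event.update(ip=ip, **lookup[ip])
                    break  # use the first address that resolves
        out.append(event)
    return out


# Made-up raw events for illustration.
events = ['192.0.2.44 - - "GET /missing" 404', "no address here 404"]
for event in ip_location(events, LOCATIONS):
    print(event)
```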

multikv

This data-processing command extracts key/value pairs from multi-line events. multikv extracts key/value pairs just like extract does, but handles events that span multiple lines or are in tabular format.

For tabular-formatted events, a new event is created for each table row. Field names are derived from the title row of the table.

Syntax

multikv [multikv-option]...

Arguments

multikv-option: copyattrs | fields | filter | forceheader | multitable | noheader | rmorig | maxnewresults
Options available for multikv processing.

  • copyattrs: copyattrs=T | F (default: T). If set, turns on the copying of non-field attributes from the original event to extracted events.
  • fields: fields field1,field2,... Space- or comma-separated list of fields to include in the events that multikv extracts. Fields not included are filtered out.
  • filter: filter field1,field2,... Space- or comma-separated list of fields. A table row must contain one of the fields in the list in order to be extracted into an event during multikv processing.
  • forceheader: forceheader=line number (integer). Allows you to specify a line number to be the table's header.
  • multitable: multitable=T | F (default: T). If set, allows multiple tables in a single _raw entry.
  • noheader: noheader=T | F (default: F). If set, allows tables with no header. The fields of a table without a header are named column1, column2, ...
  • rmorig: rmorig=T | F (default: T). If set, removes the original events from the result set.
  • maxnewresults: maxnewresults=integer (default: 50000). Sets the limit for the number of single-line results that multikv produces from the multi-line events passed to it.

Examples

Splunk Web:

This example extracts the COMMAND field only when it occurs in rows that contain "splunkd".

multikv fields COMMAND filter splunkd

CLI:

This example is the CLI version of the example above.

./splunk search "* | multikv fields COMMAND filter splunkd"
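The row-per-event behaviour of multikv, together with the fields and filter options used in the examples above, can be pictured with a short Python sketch. It is an approximation under stated assumptions, not Splunk's algorithm: columns are assumed to be separated by whitespace, the first non-empty line is assumed to be the header, and filter is read as "the row must contain one of the given terms". The function name and the sample table are made up.

```python
def multikv(raw, fields=None, row_filter=None):
    """Turn a whitespace-aligned table into one dict per data row."""
    lines = [line for line in raw.splitlines() if line.strip()]
    if not lines:
        return []
    # Field names come from the title row of the table.
    header = lines[0].split()
    rows = []
    for line in lines[1:]:
        # Keep only rows that mention at least one of the filter terms.
        if row_filter and not any(term in line for term in row_filter):
            continue
        # Split into at most len(header) columns so that the last
        # column may itself contain spaces.
        values = line.split(None, len(header) - 1)
        row = dict(zip(header, values))
        if fields:
            row = {name: row[name] for name in fields if name in row}
        rows.append(row)
    return rows


# Made-up process table for illustration.
ps_output = """\
  PID USER     COMMAND
 1201 root     splunkd -p 8089 start
 1350 www      httpd -k start
"""
print(multikv(ps_output, fields=["COMMAND"], row_filter=["splunkd"]))
```

Called without fields or row_filter, the same function returns one dictionary per table row with all three columns.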

rex

This data-processing command uses Perl regular expression named groups to extract fields while you search. Use rex to extract fields that aren't extracted at index time. You can also use rex to experiment with extracted fields before you choose to index them.

Syntax

rex field regular expression

Arguments

  • field: field=field (default: _raw). The field to apply the regular expression to.
  • regular expression: "string" | string. A PCRE (Perl Compatible Regular Expression), as supported by the pcre library, to match field values against.

Examples

Splunk Web:

This example searches for all events with the sourcetype "mailserver", then extracts two fields from the field _raw (_raw = all event data) using two named groups. The first named group, <from>, matches the text in each result that follows "From:" and stores it in the field "from". The second named group, <to>, matches the text that follows "To:" and stores it in the field "to". If _raw was "From: Susan To: Bob", "Susan" and "Bob" would be extracted into the "from" and "to" fields.

sourcetype=mailserver | rex field=_raw "From: (?<from>.*) To: (?<to>.*)"
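To experiment with the named groups outside Splunk, the same pattern can be tried in Python. This is only a sketch of the regular expression, not of the rex command itself; the helper name rex is borrowed for readability. Note that Python's re module spells named groups (?P<name>...), whereas the search above uses the PCRE form (?<name>...).

```python
import re

# Python spelling of the two named groups from the search above.
PATTERN = re.compile(r"From: (?P<from>.*) To: (?P<to>.*)")


def rex(raw, pattern=PATTERN):
    """Return the named groups captured from raw, or {} if nothing matches."""
    match = pattern.search(raw)
    return match.groupdict() if match else {}


print(rex("From: Susan To: Bob"))
print(rex("no headers in this event"))
```

Each named group becomes one key in the returned dictionary, which corresponds to one extracted field per group in the search results.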

This example uses rex to extract fields out of strace data to help see what calls are being made, how long they are taking, and what the return values are for each. Piping to stats produces a report that shows what was called, how many times, and shows the longest and median times it took to make the calls. In the rex portion of the search, the fields rv, width, and syscall are extracted.

sourcetype="strace" | rex "(?<syscall>\S+)\(.*=(?<rv>.*) <(?<width>\S+)>" | stats max(width) median(width) count by syscall
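The report described above (maximum and median duration plus a call count per system call) can be approximated in Python. The sketch assumes strace lines of the shape name(args) = rv <seconds>. The pattern is an assumed Python rendering of the three named groups syscall, rv, and width, and the sample lines are made up.

```python
import re
from collections import defaultdict
from statistics import median

# Assumed line shape: name(args) = rv <seconds>
LINE = re.compile(r"(?P<syscall>\S+)\(.*=(?P<rv>.*) <(?P<width>\S+)>")


def summarize(lines):
    """Group call durations by syscall and report max, median, and count."""
    widths = defaultdict(list)
    for line in lines:
        match = LINE.search(line)
        if match:
            widths[match.group("syscall")].append(float(match.group("width")))
    return {
        name: {"max": max(vals), "median": median(vals), "count": len(vals)}
        for name, vals in widths.items()
    }


# Made-up strace-style lines for illustration.
sample = [
    'open("/etc/hosts", O_RDONLY) = 3 <0.000020>',
    'read(3, "127.0.0.1 localhost", 4096) = 19 <0.000010>',
    'read(3, "", 4096) = 0 <0.000030>',
    "close(3) = 0 <0.000005>",
]
for name, stats in summarize(sample).items():
    print(name, stats)
```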

Note: If you are using rex within a subsearch, the outer search will not get the created field. You need to convert the fields into extracted fields.

typer

This data-processing command calculates the eventtype field for search results that match a known event type. You do not have to use this command in Splunk Web. Splunk Web automatically calculates eventtype fields for search results.

Syntax

typer

Arguments

None.

Examples

Splunk Web:

This example searches all events, returns the 10 most common values of field1, applies event types based on those defined in eventtypes.conf, and displays the results in Splunk Web.

* | top limit=10 field1 | typer

CLI:
This example is the CLI version of the above example. outputraw tells Splunk to output the raw events to the CLI screen.

./splunk search "* | top limit=10 field1 | typer | outputraw"

xmlkv

This data-processing command finds all key/value pairs of the form <foo>bar</foo> in the raw data, where foo is the key and bar is the value. This is useful for finding key/value pairs in XML-formatted data (such as transactions from webpages). In general, the pairs look like this:

<key>value</key>

Syntax

xmlkv

Arguments

None.

Examples

Splunk Web:

This example searches for incomplete orders in the index "metaevents", then extracts key/value pairs that are in XML format. For each pair, the tag name becomes the key and the text between the tags becomes the value.

NOT Completed orderId=* index="metaevents" | xmlkv

CLI:

This example is the CLI version of the example above. The search is wrapped in single quotes so that the double quotes around the index name reach Splunk intact.

./splunk search '(NOT Completed) orderId=* index="metaevents" | xmlkv'
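The <key>value</key> matching that xmlkv performs can be imitated with a small regular expression in Python. This is a simplified sketch, not Splunk's implementation: it handles only tags without attributes whose content contains no nested tags, and the sample event is made up.

```python
import re

# <tag>text</tag>, where the closing tag must repeat the opening tag name.
PAIR = re.compile(r"<(\w+)>([^<]*)</\1>")


def xmlkv(raw):
    """Return {tag: text} for every <tag>text</tag> pair found in raw."""
    return {key: value for key, value in PAIR.findall(raw)}


# Made-up XML-formatted event for illustration.
event = "<order><orderId>1234</orderId><status>Pending</status></order>"
print(xmlkv(event))
```

Because the content pattern excludes "<", the outer order element is skipped and only the innermost pairs are returned.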
