Use transforming commands to mine your data by transforming values, manipulating fields, or by creating new results from existing data in your results.
Use reporting commands to produce reports and summarize your search results.
associate
This data-processing command identifies relationships between pairs of fields. It compares an event's field/value pair with a reference field/value pair (user-specified).
Note: You must be in the report mode of Splunk Web for associate to render correctly.
Syntax
associate [associate-option]...
Arguments
associate-option
| associate-option | action-option | supcnt-option | supfreq-option | improv-option | Associate command options. |
| supcnt-option | supcnt=integer(100) | Specifies the minimum number of times a reference field/value pair must appear to be considered an associate. |
| supfreq-option | supfreq=number(0.1) | Specifies the minimum frequency of reference key/value combinations, expressed as a fraction of the total number of results. |
| improv-option | improv=number(0.5) | Sets the value of the reference key/value pair that other pairs must be greater than to be associated. |
Splunk Web:
This example searches the access source types and displays events that are associated with each other and have at least 3 references to each other.
chart
This data-processing command returns events in a tabular output suitable for charting (it does not have the x-axis designated as "time"). Chart creates a table with an arbitrary field as the x-axis (this is different from timechart, which generates a chart with _time as the x-axis). Chart fields are automatically converted to numerical values if necessary. Chart is automatically called during report on specific stat specifiers.
Syntax
chart [stat-operator]... by x-axis-field [bucketing options]
Arguments
| x-axis-field | field,field,... | Specified fields for the x-axis. |
stat-operator
| stat-operator | count | distinct_count | first | last | sum | min | max | avg | mean | mode | median | stdev | var | percXX | Specifies the statistical operation to perform. |
| count | c | count|c(field) | Find the count of values in the specified field(s). |
| distinct_count | dc | distinct_count|dc(field) | Find the count of distinct values in the specified field(s). |
| first | first | Show the first "seen" value of a field. |
| last | last | Show the last "seen" value of a field. |
| sum | sum[(field)] | Produce the sum of the values of the field. |
| min | min(field) | Find the minimum value of values in the specified field(s). |
| max | max(field) | Find the maximum value of values in the specified field(s). |
| avg | avg(field) | Find the average value of values in the specified field(s). |
| mean | mean(field) | Find the mean value of values in the specified field(s). |
| mode | mode(field) | Find the mode value of values in the specified field(s). |
| median | median(field) | Find the median value of values in the specified field(s). |
| stdev | stdev(field) | Find the standard deviation of values in the specified field(s). |
| var | var(field) | Find the variance of values in the specified field(s). |
| percXX | percXX | Percentile, integer between 1 and 99 |
bucketing-option
| bucketing-option | bins | span | type | fixedrange | cont | start | end | length | Discretization options. |
| bins | bins=integer(20) | Sets the maximum number of discrete bins to build. If using the _time field, the default=300. |
| span | span=integer span-length | Sets the size of each bucket. Examples: span=10, span=2d, or span=5m. |
| type | type=(TIME | INT | NUM | CAT | AUTO) (AUTO) | Specifies the type of value in the field that is being discretized. Manually specify how sets are discretized.TIME = Time-based discretization. INT = Integer number discretization. NUM = Arbitrary number discretization. CAT = Categorical discretization. AUTO = Automatically diagnosed discretization. |
| fixedrange | fixedrange=T | F(T) | Applicable if bucketing by time. Setting to T causes the search-time boundaries to be used. |
| cont | cont=T | F (T) | When set, causes empty continuity bins to be added to the x-axis to make it uniform. |
| start | start=integer | Sets the minimum for numerical buckets. |
| end | end=integer | Sets the maximum for numerical buckets. |
| length | length=integer span-length | If using a timescale, specifies the time range. If not, specifies the absolute bucket length. |
span-length
| span-length | ts-sec | ts-min | ts-hr | ts-day | ts-month | Time scale units |
| ts-sec | s | sec | secs | second | seconds | Time scale in seconds. |
| ts-min | m | min | mins | minute | minutes | Time scale in minutes. |
| ts-hr | h | hr | hrs | hour | hours | Time scale in hours. |
| ts-day | d | day | days | Time scale in days. |
| ts-month | mon | month | months | Time scale in months. |
Splunk Web:
This example searches all events, then returns a chart that is the average of all the sizes plotted against the name of the host.
This example searches for hits referred by google, then charts the count by hour of the day on the x-axis and day of the week as series.
sourcetype=access_combined referer_domain=http://www.google.com/ | chart count by date_hour, date_wday
CLI:
This example gets the average (mean) size for each distinct host.
./splunk search "* | chart avg(size) by host"
This example gets the maximum delay by size, where size is broken down into up to 10 equal-sized buckets.
./splunk search "* | chart max(delay) by size bins=10"
cluster
This data-processing command clusters events together based on their similarity to each other and represents each cluster with a single event. Use cluster to reduce a search with a large number of similar events to fewer clusters that are much more manageable to view. This is useful if you want to find the most common or rarest events in your data.
How cluster works:
Splunk creates clusters by comparing events using the data in a field that you specify. Specify a field to compare using the field option (default field = _raw). Data in the field is broken into chunks for comparison. You can change how data is broken up by specifying delimiters (by default, every character except 0-9, A-Z, a-z, and '_' is a delimiter). Splunk evaluates the chunks of data in each event, and then compares them with a representative event from each cluster. If an event matches an existing cluster, it becomes a part of that cluster. If an event doesn't match a cluster, it starts a new cluster and becomes the representative event for that cluster.
You can change the threshold Splunk uses to determine how similar events must be to match in a cluster (from 0.0 to 1.0, default = 0.8). Set a higher threshold to create more clusters, and a lower threshold to create fewer. The higher the threshold, the more closely events must match to be part of the same cluster.
When you apply cluster, your search results are reduced to display a single representative event for each cluster of events. If you want to retain your original event data and only label what cluster events belong to, set the labelonly option to TRUE (T).
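The matching procedure described above can be sketched in Python. This is an illustration only, not Splunk's implementation: the text does not say how similarity is scored, so the Jaccard overlap of token sets used here is an assumption, as are the function names `tokenize`, `similarity`, and `cluster`.

```python
import re

def tokenize(text):
    # Assumption: every character except 0-9, A-Z, a-z and '_' is a delimiter.
    return {tok for tok in re.split(r"[^0-9A-Za-z_]+", text) if tok}

def similarity(a, b):
    # Assumption: Jaccard overlap of token sets stands in for the real score.
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)

def cluster(events, threshold=0.8):
    # Each entry is [representative_tokens, representative_event, count].
    clusters = []
    for event in events:
        tokens = tokenize(event)
        for entry in clusters:
            if similarity(tokens, entry[0]) >= threshold:
                entry[2] += 1  # the event joins an existing cluster
                break
        else:
            # No cluster matched: the event starts a new cluster
            # and becomes its representative.
            clusters.append([tokens, event, 1])
    return [(rep, count) for _, rep, count in clusters]

events = [
    "login ok user=alice host=web1",
    "login ok user=alice host=web2",
    "disk full on volume data",
]
loose = cluster(events, threshold=0.5)   # the two login events share a cluster
strict = cluster(events, threshold=0.9)  # every event gets its own cluster
```

With the sample events, lowering the threshold merges the two login events into one cluster, while raising it keeps all three events separate, which matches the "higher threshold, more clusters" rule above.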
Syntax
cluster [cluster-options]...
Arguments
| cluster-options | threshold | delimiters | showcount | countfield | labelfield | field | labelonly | Options to configure clustering. |
| threshold | T=number 0.0-1.0 (0.8) | Set the threshold to specify how closely events must match in order to be clustered. Setting closer to 1 means that events have to be more similar to be in the same cluster. |
| delimiters | delims=character list | Specify the delimiters used to separate tokens when clustering. By default, every character except 0-9, A-Z, a-z, and '_' is a delimiter. Specify a space-delimited list of delimiters to override the default setting. |
| showcount | showcount=(T | F) (T) | Specify whether to show the size of each cluster. Default is TRUE (T). If labelonly is set to TRUE, then the size will not be shown. |
| countfield | countfield=field name (cluster_count) | Specify the name of a field to write the cluster size to. |
| labelfield | labelfield=field name (cluster_label) | Specify the name of a field to write the cluster number to. |
| field | field=field name (_raw) | Specify the name of a field to analyze for clustering. The default is _raw. |
| labelonly | labelonly=(T | F) (F) | If set to true, does not reduce clusters to a single event per cluster. Instead, keeps the original event data and labels each event with its cluster number. |
Splunk Web:
This example returns the 20 most common clusters of events. First, it searches for syslog events that don't have the term "juniper". Then it clusters the events and sorts the clusters by cluster_count. The results returned are the first 20 events, which represent the 20 largest clusters (in data size).
contingency
This data-processing command builds a contingency table for two fields. Contingency tables are useful to record and analyze the relationship between two or more variables (in Splunk's case, fields). Useful statistical analysis such as calculation of the phi coefficient or Cramer's V is possible from a contingency table.
Syntax
contingency [contingency-options]... field field
Arguments
contingency-options
| contingency-options | maxopts | mincover | usetotal | totalstr | Options for specifying a contingency table. |
| maxopts | (maxrows= | maxcols=)integer(0) | Specifies the maximum number of rows or columns. If the number of distinct values exceeds the specified maximum, then the least common values are ignored. Specifying a value of 0 sets the maximum to unlimited. |
| mincover | (mincolcover= | minrowcover=)number(1.0) | Specifies the percentage of values for a row or column to cover. |
| usetotal | usetotal=(T | F)(T) | If set, adds the row and column totals together. |
| totalstr | totalstr=field("Total") | Specify the field to place the row and column totals. |
Splunk Web:
This example searches all events and builds a contingency table for datafield1 and datafield2. It sets the maximum number of rows and columns to 5, and does not allow the rows and columns to add together.
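To make the idea of a contingency table concrete, here is a minimal Python sketch that counts how often each pair of values occurs for two fields. It is not Splunk's implementation: the function name `contingency`, the dictionary layout, and the sample field names are assumptions, and the maxrows, maxcols, and cover options are not modeled.

```python
from collections import Counter

def contingency(rows, row_field, col_field, usetotal=True, totalstr="Total"):
    # Count each (row value, column value) pair among results that have both fields.
    counts = Counter(
        (r[row_field], r[col_field])
        for r in rows
        if row_field in r and col_field in r
    )
    row_vals = sorted({k[0] for k in counts})
    col_vals = sorted({k[1] for k in counts})
    table = {rv: {cv: counts[(rv, cv)] for cv in col_vals} for rv in row_vals}
    if usetotal:
        # Append a total column to every row, then a total row.
        for rv in row_vals:
            table[rv][totalstr] = sum(table[rv][cv] for cv in col_vals)
        table[totalstr] = {
            cv: sum(table[rv][cv] for rv in row_vals) for cv in col_vals
        }
        table[totalstr][totalstr] = sum(counts.values())
    return table

rows = [
    {"method": "GET", "status": "200"},
    {"method": "GET", "status": "404"},
    {"method": "POST", "status": "200"},
    {"method": "GET", "status": "200"},
]
table = contingency(rows, "method", "status")
no_totals = contingency(rows, "method", "status", usetotal=False)
```

Here `table["GET"]["200"]` holds the number of results with both method=GET and status=200, and the `Total` entries hold the row and column sums.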
correlate
This data-processing command calculates the correlation between different fields.
Syntax
correlate [correlate-type]...
Arguments
| correlate-type | type=cocur | Specifies the type of correlation to calculate. Currently only the co-occurrence calculation is supported. Co-occurrence is the percentage of times that two fields exist in the same results. |
Splunk Web:
This example searches all events and calculates the co-occurrence correlation between all fields.
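A rough Python sketch of a co-occurrence score between field pairs follows. The exact denominator Splunk uses is not stated here, so this sketch assumes the score is the share of results containing both fields among the results containing at least one of them; the function name `cooccurrence` and the sample data are hypothetical.

```python
from itertools import combinations

def cooccurrence(results):
    # Collect every field name that appears in any result.
    fields = sorted({f for r in results for f in r})
    scores = {}
    for a, b in combinations(fields, 2):
        either = sum(1 for r in results if a in r or b in r)
        both = sum(1 for r in results if a in r and b in r)
        # Assumption: normalize by the results that contain either field.
        scores[(a, b)] = both / either if either else 0.0
    return scores

results = [
    {"host": "web1", "status": "200"},
    {"host": "web2", "status": "404", "user": "alice"},
    {"host": "web3"},
    {"user": "bob"},
]
scores = cooccurrence(results)
```

In the sample, host and user appear together in one of the four results that contain either field, so their score is 0.25.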
diff
This data-processing command compares the data of two search results and returns a single result that is the difference between the values compared. You can compare values of specific fields of results by using the attribute argument (by default the value of the _raw field is compared).
Syntax
diff result1 result2 [attribute] [header] [context]
Arguments
| result1 | pos1=integer(default = 1st result) | Number of the first search result to compare. |
| result2 | pos2=integer(default = 2nd result) | Number of the second search result to compare. |
| attribute | attribute=field name(none=_raw) | Specify a specific field value to compare (if left blank, compares the _raw field). |
| header | header=(T | F)(default=F) | If set, displays a legend for the output of diff. |
| context | context=(T | F)(default=F) | If set, displays context lines around the diff result. |
Splunk Web:
This example compares the raw text of result 45 and result 2 (because result2 is blank).
CLI:
This example compares the host values of the first and third results.
./splunk search "* | diff 1 3 attribute=host"
format
This data-processing command takes the results of a subsearch and formats them into a single result (with a _query attribute) that is a query built from the input search results. The query can then be applied to another search (useful for subsearches). Six strings define the row prefix, column prefix, column separator, column end, row separator, and row end. If no argument is specified, the default values are used.
Syntax
format row-prefix column-prefix column-separator column-end row-separator row-end
Arguments
| row-prefix | character( ( ) | Specifies the character used for the row prefix. |
| column-prefix | character( ( ) | Specifies the character used for the column prefix. |
| column-separator | character( AND ) | Specifies the character used for the column separator. |
| column-end | character( ) ) | Specifies the character used for the column end. |
| row-separator | character( OR ) | Specifies the character used for the row separator. |
| row-end | character( ) ) | Specifies the character used for the row end. |
Splunk Web:
This example gets results that contain "/doc" and creates a search from their host, source and source type. Using a hypothetical set of data, this will return:
This can also be used in a subsearch as follows:
This subsearch finds all events that contain "will" from the source type and host of each.
CLI:
This is the CLI version of the first example.
./splunk search "/doc | fields + source, sourcetype, host | format | outputraw"
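The way the six strings combine into one query can be sketched in Python. This is a simplified illustration, not the command's actual output: the function name `format_query`, the spacing, and the `field="value"` quoting are assumptions, and the sample rows are hypothetical.

```python
def format_query(rows, row_prefix="(", col_prefix="(", col_sep="AND",
                 col_end=")", row_sep="OR", row_end=")"):
    parts = []
    for row in rows:
        # Join the field/value pairs of one result with the column separator.
        cols = f" {col_sep} ".join(f'{k}="{v}"' for k, v in row.items())
        parts.append(f"{col_prefix} {cols} {col_end}")
    # Join the per-result groups with the row separator.
    return f"{row_prefix} " + f" {row_sep} ".join(parts) + f" {row_end}"

rows = [
    {"host": "web1", "sourcetype": "syslog"},
    {"host": "web2", "sourcetype": "syslog"},
]
query = format_query(rows)
# query == '( ( host="web1" AND sourcetype="syslog" ) OR ( host="web2" AND sourcetype="syslog" ) )'
```

Each input result becomes one parenthesized group of AND-ed terms, and the groups are OR-ed together, which is why the resulting string can be dropped into an outer search.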
highlight
This data-processing command allows you to highlight one or more strings of text in your search results by specifying those strings in a list.
Syntax
highlight string,[string],...
Arguments
| string | string | string,...,string | Specify a comma- or space-delimited list of strings you want to highlight. |
Splunk Web:
This example searches all sources that are a webserver sourcetype, and highlights the terms "login" and "logout".
rare
This data-processing command displays the least common values of a field, along with a count and percentage.
Syntax
rare [option]... field list
Arguments
option
| option | showcount | showperc | rare | limit | Options for rare. |
| showcount | showcount=(T | F) (default=T) | If set, creates a field called "count" that holds the count. |
| showperc | showperc=(T | F) (default=T) | If set, creates a field called "percent" that holds the percentage of prevalence of values. |
| limit | limit=integer (default=10) | Specifies how many values appear. Setting to "0" causes all values to be returned. |
| field list | field,field,... | Comma-separated list of fields to include. |
Splunk Web:
This example displays the least common values of the url field.
CLI:
This example displays the 20 least common values for the url field.
./splunk search "* | rare limit=20 url"
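As a rough sketch of what rare computes, the Python below counts the values of one field and returns the least common ones with a count and percentage. The function name `rare`, the alphabetical tie-breaking, and the sample data are assumptions; sorting in the opposite order would give the behavior of a most-common listing instead.

```python
from collections import Counter

def rare(results, field, limit=10):
    counts = Counter(r[field] for r in results if field in r)
    total = sum(counts.values())
    # Least common first; ties broken alphabetically (an assumption).
    ordered = sorted(counts.items(), key=lambda kv: (kv[1], kv[0]))
    if limit:  # limit=0 returns all values
        ordered = ordered[:limit]
    return [
        {field: value, "count": count, "percent": 100.0 * count / total}
        for value, count in ordered
    ]

results = [{"url": "/a"}, {"url": "/b"}, {"url": "/a"}, {"url": "/c"}]
least_common = rare(results, "url", limit=2)
everything = rare(results, "url", limit=0)
```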
stats
This data-processing command provides summary statistics, grouped optionally by field. Returns one result for each aggregated group. If there is no "by" argument, there will be only one returned result. If there is a "by" argument with a single field, there will be a returned result for every distinct value of the field. If there is a "by" argument with several fields, there will be a returned result for every distinct tuple of values for the fields. Each result contains all the "by" fields, as well as a field for each aggregator argument.
Syntax
stats [stat-operator [as new-field-name] ]... [by groupby-field(s)]
Arguments
| groupby-fields | field,field,... | Specifies the fields to group events by. One result is returned per distinct combination of values of the fields. |
stat-operator
| stat-operator | count | distinct_count | first | last | sum | min | max | avg | mean | mode | median | stdev | var | percXX | Specifies the statistical operation to perform. |
| count | c | count|c(field) | Find the count of values in the specified field(s). |
| distinct_count | dc | distinct_count|dc(field) | Find the count of distinct values in the specified field(s). |
| first | first | Show the first "seen" value of a field. |
| last | last | Show the last "seen" value of a field. |
| sum | sum[(field)] | Produce the sum of the values of the field. |
| min | min(field) | Find the minimum value of values in the specified field(s). |
| max | max(field) | Find the maximum value of values in the specified field(s). |
| avg | avg(field) | Find the average value of values in the specified field(s). |
| mean | mean(field) | Find the mean value of values in the specified field(s). |
| mode | mode(field) | Find the mode value of values in the specified field(s). |
| median | median(field) | Find the median value of values in the specified field(s). |
| stdev | stdev(field) | Find the standard deviation of values in the specified field(s). |
| var | var(field) | Find the variance of values in the specified field(s). |
| percXX | percXX | Percentile, integer between 1 and 99 |
Splunk Web:
This example searches the access logs, and reports the count of the number of hits from the top 100 referer domains.
CLI:
For each unique time, this example gives you the average of any unique field that ends with the string 'lay' (e.g. delay, xdelay, relay, etc).
./splunk search "* | stats avg(*lay) BY _time"
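The grouping rule for the "by" argument (one result per distinct tuple of values, or a single result when there is no "by") can be sketched in Python. The function name `stats_avg` and the sample data are hypothetical, and skipping events that lack a group-by field is an assumption.

```python
from collections import defaultdict

def stats_avg(results, field, by=()):
    groups = defaultdict(list)
    for r in results:
        # Assumption: events missing the field or a group-by field are skipped.
        if field in r and all(b in r for b in by):
            groups[tuple(r[b] for b in by)].append(r[field])
    out = []
    for key, values in sorted(groups.items()):
        row = dict(zip(by, key))  # the "by" fields of this group
        row[f"avg({field})"] = sum(values) / len(values)
        out.append(row)
    return out

data = [
    {"host": "a", "delay": 2},
    {"host": "a", "delay": 4},
    {"host": "b", "delay": 9},
]
overall = stats_avg(data, "delay")              # no "by": one result
per_host = stats_avg(data, "delay", ("host",))  # one result per host
```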
strcat
This data-processing command allows you to combine any number of field values and strings of text together to create more meaningful data from your search results. For example, you can use strcat to combine the source and destination IP address fields in your search results to create a chart of IP address pairings.
Syntax
strcat [required] sources destination
Arguments
| required | allrequired=(T | F) (F) | If set to true (T), requires that all of the source fields exist for a given event to write out the destination field. By default it is set to false (F). |
| sources | ("string" | field name) ... ("string" | field name) | A space-delimited list of strings or fields to combine together. Strings are combined in the same order they are listed. |
| destination | field name | Name of the field to store the combined value in. This is the last field listed in a strcat declaration. |
Splunk Web:
This example searches for all data from "access" sourcetype, then combines the host field with "::" and the port field. The combined strings are stored in the last field listed: address (values will be: host::port).
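A small Python sketch of the combining rule follows. The function name `strcat` and the tagged-tuple representation of sources are hypothetical, and skipping a missing field when allrequired is false is an assumption about behavior that the text does not spell out.

```python
def strcat(event, sources, destination, allrequired=False):
    # sources is a list of ("field", name) or ("string", literal) pairs,
    # combined in the order they are listed.
    pieces = []
    for kind, value in sources:
        if kind == "string":
            pieces.append(value)
        elif value in event:
            pieces.append(str(event[value]))
        elif allrequired:
            # A source field is missing: do not write the destination field.
            return dict(event)
    result = dict(event)
    result[destination] = "".join(pieces)
    return result

sources = [("field", "host"), ("string", "::"), ("field", "port")]
full = strcat({"host": "web1", "port": 8080}, sources, "address")
partial = strcat({"host": "web1"}, sources, "address")
strict = strcat({"host": "web1"}, sources, "address", allrequired=True)
```

For the first event, the address field becomes `web1::8080`, mirroring the host::port example above.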
timechart
This data-processing command is used to create a chart for a statistical aggregation applied to a specified field (using time as the x-axis). Optionally split data by a field so that each distinct value of a split-by field is a series.
When called without specifying a bucketing-option, timechart assumes that bins=300.
Syntax
timechart [bucketing-option]... stat-operator [ timechart-option (where-clause)]
Arguments
bucketing-option
| bucketing-option | bins | span | type | fixedrange | cont | start | end | length | Discretization options. |
| bins | bins=integer(20) | Sets the maximum number of discrete bins to build. If using the _time field, the default=300. |
| span | span=integer span-length | Sets the size of each bucket. Examples: span=10, span=2d, or span=5m. |
| type | type=(TIME | INT | NUM | CAT | AUTO) (AUTO) | Specifies the type of value in the field that is being discretized. Manually specify how sets are discretized.TIME = Time-based discretization. INT = Integer number discretization. NUM = Arbitrary number discretization. CAT = Categorical discretization. AUTO = Automatically diagnosed discretization. |
| fixedrange | fixedrange=T | F(T) | Applicable if bucketing by time. Setting to T causes the search-time boundaries to be used. |
| cont | cont=T | F (T) | When set, causes empty continuity bins to be added to the x-axis to make it uniform. |
| start | start=integer | Sets the minimum for numerical buckets. |
| end | end=integer | Sets the maximum for numerical buckets. |
| length | length=integer span-length | If using a timescale, specifies the time range. If not, specifies the absolute bucket length. |
stat-operator
| stat-operator | count | distinct_count | first | last | sum | min | max | avg | mean | mode | median | stdev | var | percXX | Specifies the statistical operation to perform. |
| count | c | count|c(field) | Find the count of values in the specified field(s). |
| distinct_count | dc | distinct_count|dc(field) | Find the count of distinct values in the specified field(s). |
| first | first | Show the first "seen" value of a field. |
| last | last | Show the last "seen" value of a field. |
| sum | sum[(field)] | Produce the sum of the values of the field. |
| min | min(field) | Find the minimum value of values in the specified field(s). |
| max | max(field) | Find the maximum value of values in the specified field(s). |
| avg | avg(field) | Find the average value of values in the specified field(s). |
| mean | mean(field) | Find the mean value of values in the specified field(s). |
| mode | mode(field) | Find the mode value of values in the specified field(s). |
| median | median(field) | Find the median value of values in the specified field(s). |
| stdev | stdev(field) | Find the standard deviation of values in the specified field(s). |
| var | var(field) | Find the variance of values in the specified field(s). |
| percXX | percXX | Percentile, integer between 1 and 99 |
timechart-option
| timechart-option | bucketing-option | usenull | useother | nullstr | otherstr | These options change the behavior of timechart when splitting by a field. |
| usenull | usenull=T | F(T) | If set, usenull will create a series for events that do not contain the specified split-by field. The series created is labeled by the value of the nullstr option (the default label is "NULL"). |
| useother | useother=T | F(F) | If set, useother causes a series to be added for data not included in the timechart. |
| nullstr | nullstr=string | Specifies the value of the label of the null string. |
| otherstr | otherstr=string | Specifies the value of the label of the other string. |
where-clause
| where-clause | ||
| where-comparison | (in | notin) (top | bottom) integer | Specifies the criteria for including a data series when a field is given in the timechart-option clause |
| Examples of where-comparison usage: | in top5 | in bottom10 | notin top2 |
span-length
| span-length | ts-sec | ts-min | ts-hr | ts-day | ts-month | Time scale units |
| ts-sec | s | sec | secs | second | seconds | Time scale in seconds. |
| ts-min | m | min | mins | minute | minutes | Time scale in minutes. |
| ts-hr | h | hr | hrs | hour | hours | Time scale in hours. |
| ts-day | d | day | days | Time scale in days. |
| ts-month | mon | month | months | Time scale in months. |
Splunk Web:
This example gets all output from sources that are the ps sourcetype, uses the multikv command to convert the tabular output into one event per line and extract fields based on the headers, and calculates the average of CPU for each 1 minute time span for each host.
CLI:
This example graphs the average thruput over time for each host, with time divided into 5-minute spans.
./splunk search "* | timechart span=5m avg(thruput) by host"
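The bucketing behind a search like the one above can be sketched in Python: each event's timestamp is rounded down to the start of its span, and the aggregation is computed per (bucket, split-by value). The function name `timechart_avg`, the epoch-second `_time` values, and the sample data are assumptions made for illustration.

```python
from collections import defaultdict

def timechart_avg(events, field, span_seconds, by):
    buckets = defaultdict(list)
    for e in events:
        # Round the timestamp down to the start of its span.
        start = e["_time"] - e["_time"] % span_seconds
        buckets[(start, e[by])].append(e[field])
    # One average per (bucket start, series) pair.
    return {key: sum(vals) / len(vals) for key, vals in sorted(buckets.items())}

events = [
    {"_time": 0, "host": "a", "thruput": 10},
    {"_time": 60, "host": "b", "thruput": 5},
    {"_time": 120, "host": "a", "thruput": 20},
    {"_time": 310, "host": "a", "thruput": 40},
]
chart = timechart_avg(events, "thruput", span_seconds=300, by="host")
```

With a 300-second span, the first three events fall into the bucket starting at 0 and the last into the bucket starting at 300, and each host forms its own series.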
top
This data-processing command displays the most common values of a field, along with a count and percentage.
Syntax
top [option]... field list
Arguments
option
| showcount | showcount=T | F (T) | If set, creates a field called "count" that holds the count. |
| showperc | showperc=T | F (T) | If set, creates a field called "percent" that holds the percentage of prevalence of values. |
| limit | limit=number(10) | Specifies how many values appear. Setting to "0" causes all values to be returned. |
| field list | field1,field2,... | Comma-separated list of fields to include. |
Splunk Web:
This example displays the most common 10 values of the url field.
CLI:
This example displays the most common 20 values of the url field.
./splunk search "* | top limit=20 url"
transaction
This data-processing command takes the results of a search and groups related events into transactions. This allows you to apply a pre-defined transaction to your search, or define specifications to create transactions during your search. You can use transaction with any search.
Transactions that are returned consist of: the raw text of each event, the shared event types, and the field values.
Use macro search with transactions to create transactions with macro substitution.
Syntax
transaction [name] [transaction-options]...
Arguments
| name | string | Name of the transaction definition as defined in transactions.conf. If you specify a transaction name, you can override any attribute/value pairs you have set for that transaction by explicitly listing them in your search. |
transaction-options
| transaction-options | maxspan | maxpause | fields | aliases | pattern | match | start | end | Optional constraints to specify for transaction processing. |
| aliases | aliases=(A | B | C) | Specify a list of aliases to use with the pattern option. Defaults: A=login, B=purchase, C=logout. You can't use start and end options when using aliases. |
| fields | fields=[field], [field],...(" ") | Specifies a list of fields that each transaction must have the same value of. For example: if "host" is a constraint, then a search result that has "host=mylaptop" can never be in the same transaction as a search result with "host=myserver". A search result that has no "host" value can be in a transaction with a result that has "host=mylaptop". |
| match | match=closest(closest) | Specifies the matching type to use with a transaction definition. The only value supported currently is: closest. |
| maxspan | maxspan=integer[s | m | h | d] | Specifies the constraint for the maximum span that a transaction can be. |
| maxpause | maxpause=integer[s | m | h | d] | Specifies the maximum pause between transactions. |
| pattern | pattern=regular expression | Defines a pattern of event types to be included in a transaction. |
| start | startswith="string" | Specify a SQLite expression that must be true to begin a transaction. Strings must be quoted with " ". You can use SQLite wildcards (%) and use single quotes(' ') to specify a literal term. |
| end | endswith="string" | Specify a SQLite expression that must be true to end a transaction. Strings must be quoted with " ". You can use SQLite wildcards (%) and use single quotes(' ') to specify a literal term. |
Note: Use escaped quotes (\") when you specify values that contain quotes for the start and end options.
For a start value of attr="value":
Note: The transaction command should not be used when you want to compute aggregate statistics over transactions defined by a unique identifier. For example, if you want to find the "longest" transactions, where the field "trade_id" defines the transactions, the following search is far more efficient:
* | stats min(_time) as earliest max(_time) as latest by trade_id | eval duration = latest - earliest | sort -duration
Splunk Web:
This example searches for transactions that have a maximum span of 30 seconds, have a pause between transactions no greater than 5 seconds, and have matching from fields. For example, this search will return all events from the same sender occurring within 30 seconds of each other.
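The effect of the fields, maxspan, and maxpause constraints can be approximated with a short Python sketch. It is a simplification rather than Splunk's algorithm: the function name `transactions` and the sample data are hypothetical, times are plain seconds, and events without the field are grouped under their own key instead of being allowed to join any transaction.

```python
def transactions(events, field, maxspan, maxpause):
    open_tx = {}  # field value -> transaction currently being built
    done = []
    for e in sorted(events, key=lambda ev: ev["_time"]):
        key = e.get(field)
        tx = open_tx.get(key)
        if tx and (e["_time"] - tx[-1]["_time"] <= maxpause
                   and e["_time"] - tx[0]["_time"] <= maxspan):
            tx.append(e)  # within both limits: extend the transaction
        else:
            if tx:
                done.append(tx)  # close the previous transaction
            open_tx[key] = [e]   # start a new one for this field value
    done.extend(open_tx.values())
    return done

events = [
    {"_time": 0, "from": "alice"},
    {"_time": 1, "from": "bob"},
    {"_time": 3, "from": "alice"},
    {"_time": 6, "from": "alice"},
    {"_time": 20, "from": "alice"},
]
result = transactions(events, "from", maxspan=30, maxpause=5)
```

In the sample, the first three alice events form one transaction; the event at 20 seconds starts a new one because the pause since the previous event exceeds 5 seconds.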
typelearner
This data-processing command generates a list of queries based on search results, to use as event types. It will create a "search=..." field in your results that contains a search for keywords and a punctuation pattern associated with that event.
Syntax
typelearner
Arguments
None.
Examples
Splunk Web:
This example searches all events, takes the last 20 events, and applies the event type learner. The event type learner will add a field to the results that contains a search that searches for keywords and the punctuation type for each event.
xmlunescape
This data-processing command unescapes the XML entity references (for: &, >, and <) back to their corresponding characters in your search results. You can specify how many search results to unescape XML from by using the max-inputs argument.
Syntax
xmlunescape [max-inputs]
Arguments
| max-inputs | maxinputs=integer (100) | Sets how many results (starting from the top) are passed to xmlunescape. |
Splunk Web:
This example searches for all events from the source "xml_escaped", then unescapes XML characters for &, >, and < in all search results.
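The unescaping step itself can be illustrated with Python's standard library, whose `xml.sax.saxutils.unescape` converts the same three entity references. The wrapper name `xmlunescape` is hypothetical, and passing results beyond the limit through unchanged is an assumption about how the maxinputs limit behaves.

```python
from xml.sax.saxutils import unescape

def xmlunescape(results, maxinputs=100):
    # Unescape &amp;, &lt; and &gt; in the first `maxinputs` results.
    head = [unescape(r) for r in results[:maxinputs]]
    # Assumption: results past the limit are left as they are.
    return head + list(results[maxinputs:])

cleaned = xmlunescape(["&lt;a&gt; &amp; b"])
limited = xmlunescape(["&lt;", "&gt;"], maxinputs=1)
```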