Configure rule-based source type recognition

This documentation applies to the following versions of Splunk: 4.0, 4.0.1, 4.0.2, 4.0.3, 4.0.4, 4.0.5, 4.0.6, 4.0.7, 4.0.8, 4.0.9, 4.0.10

Configure rule-based source type recognition to expand the range of source types that Splunk recognizes. Splunk automatically assigns rule-based source types based on regular expressions you specify in props.conf.

You can create two kinds of rules in props.conf: rules and delayed rules. The only difference between the two is the point at which Splunk checks them during the source typing process. As it processes each string of event data, Splunk uses several methods to determine source types:

  • After checking for explicit source type definitions based on the event data input or source, Splunk looks at the rule:: stanzas defined in props.conf and tries to match source types to the event data based on the classification rules specified in those stanzas.
  • If Splunk is unable to find a matching source type using the available rule:: stanzas, it tries to use automatic source type matching, where it tries to identify patterns similar to source types it has learned in the past.
  • When that method fails, Splunk then checks the delayedrule:: stanzas in props.conf, and tries to match the event data to source types using the rules in those stanzas.

You might set your system up so that delayedrule:: stanzas contain classification rules for generic source types, while rule:: stanzas contain classification rules for more specialized ones. For example, you could use rule:: stanzas to catch event data with specific syslog source types, such as "sendmail_syslog" or "cisco_syslog" and then have a delayedrule:: stanza apply the generic "syslog" source type to remaining syslog event data.
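
As a minimal sketch of that layout, the following props.conf fragment pairs two specialized rule:: stanzas with one generic delayedrule:: stanza. The rule names, regular expressions, and percentage thresholds are illustrative assumptions, not Splunk's shipped defaults; test and tune them against your own data before relying on them.

```
# Hypothetical specialized rules, checked before automatic source type matching
[rule::my_sendmail_syslog]
sourcetype = sendmail_syslog
# Assumed pattern: syslog timestamp, host name, then a sendmail[pid]: tag
MORE_THAN_80 = ^\w{3} +\d+ \d\d:\d\d:\d\d \S+ sendmail\[\d+\]:

[rule::my_cisco_syslog]
sourcetype = cisco_syslog
# Assumed pattern: a Cisco-style %FACILITY-SEVERITY-MNEMONIC: message tag
MORE_THAN_80 = %[A-Z0-9_]+-\d-[A-Z0-9_]+:

# Hypothetical generic fallback, checked only after the earlier methods fail
[delayedrule::my_generic_syslog]
sourcetype = syslog
# Assumed pattern: any line that starts with a syslog-style timestamp and host
MORE_THAN_80 = ^\w{3} +\d+ \d\d:\d\d:\d\d \S+
```

Because the generic stanza is a delayed rule, it only gets a chance to classify syslog data that neither of the specialized rules nor automatic source type matching has already claimed.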


Configuration

To set source typing rules, edit props.conf in $SPLUNK_HOME/etc/system/local/, or your own custom application directory in $SPLUNK_HOME/etc/apps/. For more information on configuration files in general, see "About configuration files" in the Admin manual.

Create a rule by adding a rule:: or delayedrule:: stanza to props.conf. Provide a name for the rule in the stanza header, and declare the source type name in the body of the stanza. After the source type declaration, list the source type assignment rules. These rules use one or more MORE_THAN and LESS_THAN statements to test what percentage of lines in the event data match given regular expressions.

Note: You can specify any number of MORE_THAN and LESS_THAN statements in a source typing rule stanza. All of the statements must match a percentage of event data lines before those lines can be assigned the source type in question. For example, you could define a rule that assigns a specific source type value to event data where more than 10% match one regular expression and less than 10% match another regular expression.

Add the following to props.conf:

[rule::$RULE_NAME] OR [delayedrule::$RULE_NAME]
sourcetype=$SOURCETYPE
MORE_THAN_[0-100] = $REGEX
LESS_THAN_[0-100] = $REGEX

The numerical values in MORE_THAN and LESS_THAN statements refer to the percentage of lines that contain a match for the regular expression. A MORE_THAN statement is satisfied when more than that percentage of lines match; a LESS_THAN statement is satisfied when fewer than that percentage of lines match.
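
To make the template concrete, here is a minimal sketch of a filled-in stanza. The rule name, source type name, thresholds, and both regular expressions are hypothetical and chosen only for illustration.

```
# Hypothetical rule: the name, source type, and patterns are placeholders
[rule::my_app_log]
sourcetype = my_app_log
# Satisfied when more than 60% of lines start with an ISO-style date
MORE_THAN_60 = ^\d{4}-\d{2}-\d{2}[ T]
# Satisfied when fewer than 5% of lines start with a syslog-style timestamp
LESS_THAN_5 = ^\w{3} +\d+ \d\d:\d\d:\d\d
```

Under this sketch, event data in which 70 of 100 lines begin with an ISO-style date and 2 of 100 begin with a syslog-style timestamp satisfies both statements, so it would be assigned my_app_log. If 8 of 100 lines began with a syslog-style timestamp instead, the LESS_THAN_5 statement would fail and the rule would not apply.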

Note: For a primer on regular expression syntax and usage, see Regular-Expressions.info. You can test regexes by using them in searches with the rex search command. Splunk also maintains a list of useful third-party tools for writing and testing regular expressions.

Examples

The following examples come from $SPLUNK_HOME/etc/system/default.

Postfix syslog files

# postfix_syslog sourcetype rule
[rule::postfix_syslog]
sourcetype = postfix_syslog
# If 80% of lines match this regex, then it must be this type
MORE_THAN_80=^\w{3} +\d+ \d\d:\d\d:\d\d .* postfix(/\w+)?\[\d+\]:

Delayed rule for breakable text

# breaks text on ascii art and blank lines if more than 10% of lines have
# ascii art or blank lines, and less than 10% have timestamps
[delayedrule::breakable_text]
sourcetype = breakable_text
MORE_THAN_10 = (^(?:---|===|\*\*\*|___|=+=))|^\s*$
LESS_THAN_10 = [: ][012]?[0-9]:[0-5][0-9]