Documentation: 3.2.1
This page last updated: 05/06/08 12:05pm

props.conf

props.conf controls what parameters apply to events during indexing based on settings tied to each event's source, host, or sourcetype.

NOTE: You can use wildcards in your props.conf <spec>, much the same as in inputs.conf. Wildcards are only allowed in host and source specs. Use ... for paths and * for files.

  • ... recurses through directories until the match is met. This means that /foo/.../bar will match /foo/bar, /foo/1/bar, /foo/1/2/bar, etc., but only if bar is a file.
    • To recurse through a subdirectory, use another ... as in this example: /foo/.../bar/...
  • * matches anything in that specific path segment. It cannot be used inside of a directory path; it must be used in the last segment of the path. For example /foo/*.log matches /foo/bar.log but not /foo/bar.txt or /foo/bar/test.log.
  • Use * and ... together for more powerful matches:
    • /foo/.../bar/* matches any file in a bar directory at any depth under /foo.
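
As a rough illustration of the wildcard rules above, the stanzas below sketch how a source-based <spec> might look in props.conf. The paths are hypothetical and the attributes were chosen arbitrarily; only the wildcard syntax is the point.

```ini
# Hypothetical paths; adjust to your own layout.

# Any file named access.log at any depth under /var/log/httpd
# (... recurses through directories)
[source::/var/log/httpd/.../access.log]
SHOULD_LINEMERGE = false

# Any .log file directly inside /opt/myapp/logs, but not in its subdirectories
# (* only matches within the last path segment)
[source::/opt/myapp/logs/*.log]
MAX_EVENTS = 512

# Any file in any directory named archive somewhere under /data
[source::/data/.../archive/*]
TRUNCATE = 0
```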

props.conf.spec

# This file contains possible attribute/value pairs for configuring Splunk's processing properties
# via props.conf.
#
# There is a props.conf in $SPLUNK_HOME/etc/bundles/default/.  To set custom configurations, 
# place a props.conf in your own custom bundle directory.
#
# For help creating a bundle directory, or to learn more about bundles (including bundle precedence)
# please see the documentation located at http://www.splunk.com/doc/latest/admin/bundleconfig.

[<spec>]
    * This stanza enables properties for a given <spec>.
    * A props.conf file can contain multiple stanzas with different <spec> values.
    * Follow this stanza name with any number of the following attribute/value pairs.
    * If you do not set an attribute for a given <spec>, the default is used.

<spec> can be:
1. <sourcetype>, the sourcetype of an event.
2. host::<host>, where <host> is the host for an event.
3. eventtype::<eventtype> where <eventtype> is any valid event type that is either pre-defined in 
    Splunk or defined in eventtypes.conf.
    NOTE: eventtype can only be used as a spec for creating extracted fields with REPORT-<class>.
4. source::<source>, where <source> is the source for an event.
5. rule::<rulename>, where <rulename> is a unique name of a sourcetype classification rule.
6. delayedrule::<rulename>, where <rulename> is a unique name of a
   delayed sourcetype classification rule.  These are only considered
   as a last resort before generating a new sourcetype based on the
   source seen.

NOTE: When specifying a <spec>, you can use the following regex-type syntax:

... = will recurse through directories until the match is met.

* = matches anything but / 0 or more times.

| = or 

( ) = used to limit scope of |.

Example: [source::....(?<!tar.)(gz|tgz)] 
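
To make the <spec> forms and the regex-type syntax more concrete, here is a minimal sketch of a few stanza headers. All names and paths are hypothetical, and the attributes under each header are placeholders taken from the list that follows.

```ini
# Hypothetical names and paths; each header shows one form of <spec>.

# A sourcetype
[my_sourcetype]
MAX_EVENTS = 100

# A host, using the * wildcard
[host::web*]
SHOULD_LINEMERGE = false

# A source, using ( ) to limit the scope of | and * for the file name
[source::/var/log/(app|web)/*.log]
TRUNCATE = 20000
```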

#******************************************************************************
# The possible attributes/value pairs for props.conf, and their default values, are:
#******************************************************************************

# International characters

CHARSET = <string>
      * When set, Splunk will assume the input from the given <spec> is in the specified encoding.  
      * A list of valid encodings can be retrieved using the command "iconv -l" on most *nix systems.  
      * If an invalid encoding is specified, a warning is logged during initial configuration 
      and further input from that <spec> is discarded.  
      * If the source encoding is valid, but some characters from the <spec> are not valid in the
      specified encoding, then the characters will be escaped as hex (e.g. "\xF3").
      * Defaults to ASCII.
      * When set to "AUTO", Splunk will attempt to automatically determine the character encoding and 
      convert text from that encoding to UTF-8.  
      * For a complete list of the roughly 20 character sets that Splunk detects automatically, see the 
      online documentation.
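
The following sketch shows the two ways CHARSET might be set. The sourcetype and host names are hypothetical, and it is an assumption that your system lists ISO-8859-1 under that name in the output of "iconv -l".

```ini
# Hypothetical sourcetype and host names.

# Input known to be Latin-1; the encoding name is assumed to appear in "iconv -l"
[my_legacy_app]
CHARSET = ISO-8859-1

# Let Splunk attempt to detect the encoding and convert the text to UTF-8
[host::intl-*]
CHARSET = AUTO
```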

#******************************************************************************
# Line breaking
#******************************************************************************

TRUNCATE = <non-negative integer>
      * Changes the default maximum line length.
      * Set to 0 to disable truncation entirely (very long lines are, however, often a sign of
      garbage data).
      * Defaults to 10000.

LINE_BREAKER = <regular expression>
      * If not set, the raw stream will be broken into an event for each line delimited by \r or \n. 
    * If set, the given regex will be used to break the raw stream into events.
    * The regex must contain a matching group. 
    * Wherever the regex matches, the start of the first matched group is considered the first text NOT in the
    previous event. 
    * The end of the first matched group is considered the end of the delimiter and the next 
    character is considered the beginning of the next event. 
    * For example, "LINE_BREAKER = ([\r\n]+)" is equivalent to the default rule. 
    * The contents of the first matching group will not occur in either the previous or next events.
    * NOTE: Using LINE_BREAKER to delimit multiline events is significantly faster than using 
    line merging to reassemble individual lines into events.
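
As a sketch of the LINE_BREAKER rules above, the stanza below breaks a raw stream into events wherever one or more newlines are followed by something that looks like an ISO-style date. The sourcetype name and the date layout are assumptions for illustration.

```ini
# Hypothetical sourcetype. Assumes every event begins with a date such as 2008-05-06.
[my_multiline_app]
# Turn off line merging so that LINE_BREAKER alone decides event boundaries
SHOULD_LINEMERGE = false
# The first group ([\r\n]+) is the delimiter and is dropped from both events;
# the date that follows it becomes the first text of the next event.
LINE_BREAKER = ([\r\n]+)\d{4}-\d{2}-\d{2}\s
```

Because only the first group is treated as the delimiter, continuation lines that do not start with a date (stack traces, for example) should stay inside the preceding event.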

LINE_BREAKER_LOOKBEHIND = <integer>
      * Changes the default lookbehind for the regex-based line breaker.
      * When there is leftover data from a previous raw chunk, this is how far before the end of
      the raw chunk (with the next chunk concatenated) Splunk begins applying the regex.
      * Defaults to 100.

# Multiline events

SHOULD_LINEMERGE = <true/false>
      * When set to true, Splunk combines several input lines into a single event, based on the 
      following configuration attributes.
      * Defaults to true.

# The following are used only when SHOULD_LINEMERGE = true

AUTO_LINEMERGE = <true/false>
    * Directs Splunk to use automatic learning methods to determine where to break lines in events.
    * Defaults to true.

BREAK_ONLY_BEFORE_DATE = <true/false>
      * When set to true, Splunk will create a new event if and only if it encounters
    a new line with a date.
    * Defaults to false.

BREAK_ONLY_BEFORE = <regular expression>
     * When set, Splunk will create a new event if and only if it encounters
    a new line that matches the regular expression.
    * Defaults to empty.

MUST_BREAK_AFTER = <regular expression>
      * When set, and the regular expression matches the current line,
    Splunk is guaranteed to create a new event for the next input line.
    * Splunk may still break before the current line if another rule matches.
    * Defaults to empty.

MUST_NOT_BREAK_AFTER = <regular expression>
    * When set and the current line matches the regular expression, Splunk will
    not break on any subsequent lines until the MUST_BREAK_AFTER expression
    matches.
    * Defaults to empty.

MUST_NOT_BREAK_BEFORE = <regular expression>
      * When set and the current line matches the regular expression, Splunk will not break the last 
      event before the current line.
      * Defaults to empty.

MAX_EVENTS = <integer>
      * Specifies the maximum number of input lines that will be added to any event. 
      * Splunk will break after the specified number of lines are read.
      * Defaults to 256.
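
The line-merging attributes above can be combined roughly as follows. This is a sketch only: the sourcetype name, the date pattern, and the line limit are hypothetical values.

```ini
# Hypothetical sourcetype for an application that logs multi-line stack traces.
[my_java_app]
# Line merging is the default, but state it explicitly
SHOULD_LINEMERGE = true
# Start a new event only at a line that begins with a date such as 2008-05-06
BREAK_ONLY_BEFORE = ^\d{4}-\d{2}-\d{2}
# Allow long stack traces; Splunk breaks after this many lines regardless
MAX_EVENTS = 500
```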

#******************************************************************************
# Timestamp extraction configuration
#******************************************************************************

DATETIME_CONFIG = <filename relative to $SPLUNK_HOME>
    * Specifies the file to configure the timestamp extractor.
    * This configuration may also be set to "NONE" to prevent the timestamp extractor from running 
    or "CURRENT" to assign the current system time to each event.
    * Defaults to /etc/datetime.xml (that is, $SPLUNK_HOME/etc/datetime.xml).

MAX_TIMESTAMP_LOOKAHEAD = <integer>
    * Specifies how far (in characters) into an event Splunk should look for a timestamp.
      * Defaults to 150.

TIME_PREFIX = <regular expression>
     * Specifies a regex that must match before a timestamp can be extracted.
     * The timestamping algorithm only looks for a timestamp after the first regex match.
     * Defaults to empty.

TIME_FORMAT = <strptime-style format>
    * Specifies a strptime format string to extract the date. 
    * This method of date extraction does not support in-event timezones. 
    * TIME_FORMAT starts reading after the TIME_PREFIX. 
    * The <strptime-style format> must contain the hour, minute, month, and day.
    * Defaults to empty.

TZ = <posix timezone string>
     * The algorithm for determining the time zone for a particular event is as follows:
      - If the event has a timezone in its raw text (e.g., UTC, -08:00), use that.
      - If TZ is set to a valid timezone string, use that.
      - Otherwise, use the timezone of the system that is running splunkd.
    * Defaults to empty.
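
Here is a minimal sketch of how the timestamp attributes above might work together. The sourcetype name and the sample event are hypothetical. The TZ value is an assumption as well, so check which timezone strings your installation accepts.

```ini
# Sample event this sketch assumes:
#   id=42 ts=2008-05-06 12:05:33 level=INFO msg="started"
[my_app_log]
# Only look for a timestamp after the literal text "ts="
TIME_PREFIX = ts=
# strptime format; contains the required hour, minute, month, and day
TIME_FORMAT = %Y-%m-%d %H:%M:%S
# The timestamp ends well within the first 40 characters of the event
MAX_TIMESTAMP_LOOKAHEAD = 40
# TIME_FORMAT does not read in-event timezones, so set one here (assumed value)
TZ = PST8PDT
```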

MAX_DAYS_AGO = <integer>
      * Specifies the maximum number of days past, from the current date, for an extracted date to be valid.  
      * If set to 10, for example, dates that are older than 10 days ago are ignored.
    * Defaults to 1000.
    * IMPORTANT: If your data is older than 1000 days, you must change this setting.

MAX_DAYS_HENCE = <integer>
     * Specifies the maximum number of days in the future, from the current date, 
     for an extracted date to be valid.  
     * If set to 3, for example, dates that are more than 3 days in the future will be ignored.  
     * False positives are less likely with a tighter window.
     * The default value allows dates that are tomorrow.  
     * If your machines have the wrong date set or are in a timezone that is one day ahead, 
     increase this value to at least 3.
    * Defaults to 2.
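
For example, the date window might be widened along these lines. The sourcetype and host names are hypothetical, and the values are illustrative only.

```ini
# Hypothetical: importing archived logs that go back roughly ten years
[my_archive_import]
MAX_DAYS_AGO = 3650

# Hypothetical: hosts whose clocks may run up to a day ahead
[host::kiosk-*]
MAX_DAYS_HENCE = 3
```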

#******************************************************************************
# Transform configuration
#******************************************************************************

# You can use TRANSFORMS or REPORT to create extracted fields.
# Please note that TRANSFORMS should only be used when speed is of the essence
# and the extraction rule will not change.  REPORT allows for more flexibility.
# For more information, see documentation at: http://www.splunk.com/doc/latest/admin/ExtractFields

TRANSFORMS-<class> = <"transform name","transform name 2",...> {see transforms.conf.spec}
      * Splunk configures classes of regular expressions for each event.  
      * For each class, Splunk takes the configuration from the highest precedence configuration block
      (see precedence rules at the beginning of this file).
      * If a particular class is specified for a source, it will override the same class if it is 
      specified for a sourcetype. 
      * Similarly, if a particular class is specified in the local bundle for a sourcetype, it will 
      override that class for the default bundle for that sourcetype.
 
    * The following is an example TRANSFORMS class in the default bundle for all sourcetypes:

        TRANSFORMS-annotation = filetype,loglevel,os,browser,language,ip,email,url

# Report configuration

REPORT-<class> = <"transform name","transform name 2",...> {see transforms.conf.spec}
      * Like TRANSFORMS, this configures extractions, but runs them at search time. 
      * TRANSFORMS are not run at search time, only at index time.
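
A stanza using both attributes might look like the sketch below. The sourcetype, class names, and transform names are all hypothetical, and each transform name is assumed to be defined as a stanza in transforms.conf. The hyphenated attribute form follows the TRANSFORMS-annotation example above.

```ini
# Hypothetical sourcetype, class names, and transform names.
[my_app_log]
# Applied at index time; best kept for rules that will not change
TRANSFORMS-tagging = my_index_time_transform
# Applied at search time; can list several transforms, separated by commas
REPORT-fields = my_user_extraction,my_status_extraction
```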

KV_MODE = <none/auto/multi>
     * Specifies the key/value extraction mode for the data.
     * Set KV_MODE to one of the following:
       - "none" if you want no key/value extraction to take place.
       - "auto" to extract key/value pairs separated by equal signs.
       - "multi" to invoke multikv to expand a tabular event into multiple events.
     * Defaults to auto.
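
As a short sketch, KV_MODE could be overridden per sourcetype like this. Both sourcetype names are hypothetical.

```ini
# Hypothetical: tabular command output, one event per table row
[my_tabular_output]
KV_MODE = multi

# Hypothetical: free-form text where key=value extraction would only add noise
[my_freeform_text]
KV_MODE = none
```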

Comments

  1. here's a trick for making log4j's SyslogAppender work remotely with multi-line events. On the sender side:

    log4j.appender.MySyslog=org.apache.log4j.net.SyslogAppender
    log4j.appender.MySyslog.SyslogHost=<splunk host>
    log4j.appender.MySyslog.layout=org.apache.log4j.PatternLayout
    log4j.appender.MySyslog.layout.ConversionPattern=<uniquely identifiable pattern>

    for the uniquely identifiable pattern, you could use something blatant like: "LOG4J_EVENT_COMING_RIGHT_UP" or a more useful, but obvious pattern like %p %t - %d{ISO8601} - %m%n

    On the splunk side, create a stanza where BREAK_ONLY_BEFORE is set to break on your <uniquely identifiable pattern>, for example:

    [my_log4j]
    pulldown_type = true
    maxDist = 75
    # get rid of syslog headers
    TRANSFORMS = syslog-header-stripper-ts-host
    # even though it's a default, make sure you set linemerge in case something attempts to override
    SHOULD_LINEMERGE = true
    BREAK_ONLY_BEFORE = <regex that'll match your uniquely identifiable pattern>

    the regex could be .*LOG4J_EVENT_COMING_RIGHT_UP.* in the overt case or ^.*-\s\d\d\d\d-\d\d-\d\d\s\d\d:\d\d:\d\d,\d\d\d\s-.*$ in the ISO date case
