Documentation: 3.1.5
This page last updated: 01/22/08 01:01pm

props.conf

props.conf controls what parameters apply to events during indexing based on settings tied to each event's source, host, or sourcetype.

IMPORTANT: You can use wildcards in your props.conf <spec> much the same as in inputs.conf. You can only use wildcards for host or source.

  • Use ... for paths and * for files:
    • ... will recurse through directories until the match is met. This means that /foo/.../bar will match /foo/bar, /foo/1/bar, /foo/1/2/bar, etc.
    • * will match anything in that specific path segment. It cannot be used inside a directory path; it must be used in the last segment of the path. For example, /foo/*.log will match /foo/bar.log but will not match /foo/bar.txt or /foo/bar/test.log.
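
The wildcard behavior described above can be approximated in a few lines of Python. This is only an illustrative sketch, not Splunk's own matching code: the function names spec_to_regex and spec_matches are hypothetical, and the translation to regular expressions is an assumption based on the examples given in the list.

```python
import re

def spec_to_regex(pattern):
    """Translate a wildcard pattern into a compiled regular expression."""
    # Split on the wildcard tokens, keeping them in the result list.
    tokens = re.split(r"(/\.\.\./|\.\.\.|\*)", pattern)
    parts = []
    for token in tokens:
        if token == "/.../":
            # Zero or more whole directories between two slashes.
            parts.append("/(?:.*/)?")
        elif token == "...":
            # Any run of characters, slashes included.
            parts.append(".*")
        elif token == "*":
            # Anything except a slash, so the match stays in one path segment.
            parts.append("[^/]*")
        else:
            # Literal text is matched exactly.
            parts.append(re.escape(token))
    return re.compile("".join(parts))

def spec_matches(pattern, path):
    """Return True if the whole path matches the wildcard pattern."""
    return spec_to_regex(pattern).fullmatch(path) is not None

print(spec_matches("/foo/.../bar", "/foo/1/2/bar"))
print(spec_matches("/foo/*.log", "/foo/bar/test.log"))
```

Under these assumptions, the first call reports a match and the second does not, which agrees with the examples in the list.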

props.conf.spec

# Copyright (C) 2005-2007 Splunk Inc.  All Rights Reserved.  Version 3.0 
#

# This file contains possible attribute/value pairs for a "props.conf" file.
#
# The processing properties of Splunk are configured through the files
# $SPLUNK_HOME/etc/bundles/<bundle name>/props.conf

# There is a props.conf in $SPLUNK_HOME/etc/bundles/default/.  To set custom configurations, 
# place a props.conf in $SPLUNK_HOME/etc/bundles/local/ or your own custom bundle directory.

# Here is an example props.conf stanza:

[<spec>]
attribute1 = val1
attribute2 = val2
...

# A props.conf file can contain multiple stanzas with different <specs>.

<spec> can be:
1. <sourcetype>, the sourcetype of an event.
2. host::<host>, where <host> is the host for an event.
3. reportinghost::<host>, where <host> is the host reporting an event.
4. source::<source>, where <source> is the source for an event.
5. rule::<rulename>, where <rulename> is a unique name of a sourcetype classification rule.
6. delayedrule::<rulename>, where <rulename> is a unique name of a
   delayed sourcetype classification rule.  These are only considered
   as a last resort before generating a new sourcetype based on the
   source seen.

If the same <spec> is found in two bundle directories, the following precedence rules apply:

Attributes in $SPLUNK_HOME/etc/bundles/local are read first. 
Attributes in $SPLUNK_HOME/etc/bundles/default are read last. 
Attributes in other directories are loaded in alphabetical order by name.

Overriding is performed attribute by attribute, so if a specific attribute is not specified in 
"local", but in another bundle, it will be taken from that other bundle.

**************
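
The bundle precedence rules above (local first, default last, attribute-by-attribute overriding) can be pictured with the following Python sketch. It is a simplified model, not Splunk's loader: the function name resolve_stanza is hypothetical, and it assumes that among the "other" bundles an alphabetically earlier name takes precedence over a later one.

```python
def resolve_stanza(bundles):
    """Merge one stanza's attributes from several bundles.

    bundles maps a bundle directory name to that bundle's attributes
    for a single <spec>.
    """
    # Assumed order: local wins, then other bundles alphabetically, then default.
    others = sorted(name for name in bundles if name not in ("local", "default"))
    order = ["local"] + others + ["default"]
    merged = {}
    for name in order:
        for key, value in bundles.get(name, {}).items():
            # Keep the first value seen, so higher-precedence bundles win
            # while missing attributes fall through to later bundles.
            merged.setdefault(key, value)
    return merged

example = {
    "default": {"TRUNCATE": "10000", "MAX_EVENTS": "256"},
    "local": {"TRUNCATE": "256"},
    "custom": {"MAX_EVENTS": "100", "KV_MODE": "none"},
}
print(resolve_stanza(example))
```

In this example TRUNCATE comes from local, while MAX_EVENTS and KV_MODE are taken from the hypothetical "custom" bundle because local does not set them.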

The possible attributes/value pairs for props.conf, and their default values, are:

# International characters

CHARSET = <string>
        * When set, Splunk will assume the input from the given <spec> is in the specified encoding.  
        * A list of valid encodings can be retrieved using the command "iconv -l" on most *nix systems.  
        * If an invalid encoding is specified, a warning will be logged during initial configuration 
        and further input from that <spec> will be discarded.  
        * If the source encoding is valid, but some characters from the <spec> are not valid in the
        specified encoding, then the characters will be escaped as hex (e.g. "\xF3").
        * Defaults to ASCII.

# Line breaking

TRUNCATE = <non-negative integer>
        * Change the default maximum line length.  
        * Set to 0 if you do not want truncation ever (very long lines are, however, often a sign of 
        garbage data).
        * Defaults to 10000.
    
    

# Multiline events

SHOULD_LINEMERGE = <true/false>
        * When set to true, Splunk combines several input lines into a single event, based on the 
        following configuration attributes.
        * Defaults to true.
        
        

# The following are used only when SHOULD_LINEMERGE = True

AUTO_LINEMERGE = <true/false>
        * Directs Splunk to use automatic learning methods to determine where to break lines into events.
        * Defaults to true.

BREAK_ONLY_BEFORE_DATE = <true/false>
        * When set to true, Splunk will create a new event if and only if it encounters
        a new line with a date.
        * Defaults to false.

BREAK_ONLY_BEFORE = <regular expression>
        * When set, Splunk will create a new event if and only if it encounters
        a new line that matches the regular expression.
        * Defaults to empty.

MUST_BREAK_AFTER = <regular expression>
        * When set, and the regular expression matches the current line,
        Splunk is guaranteed to create a new event for the next input line.
        * Splunk may still break before the current line if another rule matches.
        * Defaults to empty.

MUST_NOT_BREAK_AFTER = <regular expression>
        * When set and the current line matches the regular expression, Splunk will
        not break on any subsequent lines until the MUST_BREAK_AFTER expression
        matches.
        * Defaults to empty.

MUST_NOT_BREAK_BEFORE = <regular expression>
        * When set and the current line matches the regular expression, Splunk will not break the last 
        event before the current line.
        * Defaults to empty.

MAX_EVENTS = <integer>
        * Specifies the maximum number of input lines that will be added to any event. 
        * Splunk will break after the specified number of lines are read.
        * Defaults to 256.
        
     

# Timestamp extraction configuration

DATETIME_CONFIG = <filename relative to $SPLUNK_HOME>
        * Specifies the file to configure the timestamp extractor.
        * This configuration may also be set to "NONE" to prevent the timestamp extractor from running 
        or "CURRENT" to assign the current system time to each event.
        * Defaults to /etc/datetime.xml (e.g., $SPLUNK_HOME/etc/datetime.xml).

MAX_TIMESTAMP_LOOKAHEAD = <integer>
        * Specifies how far (in characters) into an event Splunk should look for a timestamp.
        * Defaults to 150.

TIME_PREFIX = <regular expression>
        * Specifies the necessary condition for a timestamp to be extracted. 
        * The timestamping algorithm will only look for a timestamp after the prefix in the event.
        * Defaults to empty.

TIME_FORMAT = <strptime-style format>
        * Specifies a strptime format string to extract the date. 
        * This method of date extraction does not support in-event timezones. 
        * TIME_FORMAT starts reading after the TIME_PREFIX. 
        * The <strptime-style format> must contain the hour, minute, month, and day.
        * Defaults to empty.

TZ = <posix timezone string>
        * The algorithm for determining the time zone for a particular event is as follows:
          - If the event has a timezone in its raw text (e.g., UTC, -08:00), use that as the timezone.
          - If TZ is set to a valid timezone string, use that as the timezone for the event.
          - Otherwise, use the timezone of the system that is running splunkd.
        * Defaults to empty.
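
The three-step time zone fallback listed for TZ can be sketched in Python as follows. The function name choose_timezone and its arguments are hypothetical, and the sketch does not model how Splunk validates a TZ string; an empty or missing value simply means "not available".

```python
def choose_timezone(event_tz, tz_setting, system_tz):
    """Pick a time zone using the fallback order described for TZ."""
    if event_tz:
        # A timezone found in the event's raw text wins.
        return event_tz
    if tz_setting:
        # Otherwise use the TZ attribute from props.conf, if set.
        return tz_setting
    # Finally fall back to the zone of the system running splunkd.
    return system_tz

print(choose_timezone(None, "EST-5EDT,M3.2.0,M11.1.0", "UTC"))
```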

MAX_DAYS_AGO = <integer>
        * Specifies the maximum number of days in the past, from the current date, for an extracted date to be valid.
        * If set to 10, for example, dates that are more than 10 days old are ignored.
        * Defaults to 1000.
        * PLEASE NOTE:  if your data is older than 1000 days, you must change this setting.

MAX_DAYS_HENCE = <integer>
        * Specifies the maximum number of days in the future, from the current date, 
        for an extracted date to be valid.  
        * If set to 3, for example, dates that are more than 3 days in the future will be ignored.  
        * False positives are less likely with a tighter window.
        * The default value allows dates that are tomorrow.  
        * If your machines have the wrong date set or are in a timezone that is one day ahead, 
        increase this value to at least 3.
        * Defaults to 2.
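
As a rough illustration of how MAX_DAYS_AGO and MAX_DAYS_HENCE bound the acceptable range for an extracted date, consider the Python sketch below. The function name is_date_in_window is hypothetical, and treating the limits as an inclusive window around the current time is an assumption; Splunk's exact boundary handling is not stated here.

```python
from datetime import datetime, timedelta

def is_date_in_window(extracted, now, max_days_ago=1000, max_days_hence=2):
    """Return True if the extracted date falls inside the allowed window."""
    oldest = now - timedelta(days=max_days_ago)      # MAX_DAYS_AGO limit
    newest = now + timedelta(days=max_days_hence)    # MAX_DAYS_HENCE limit
    return oldest <= extracted <= newest

now = datetime(2008, 1, 22, 13, 1)
print(is_date_in_window(datetime(2008, 1, 23, 13, 1), now))  # tomorrow
print(is_date_in_window(datetime(2004, 1, 1), now))          # over 1000 days old
```

With the default values, tomorrow's date is accepted and a date from 2004 is rejected, which is why data older than 1000 days requires a larger MAX_DAYS_AGO.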

# Transform configuration

# You can use TRANSFORMS or REPORT to create extracted fields.
# Please note that TRANSFORMS should only be used when speed is of the essence
# and the extraction rule will not change.  REPORT allows for more flexibility.
# For more information, see documentation at: http://www.splunk.com/doc/3.1.5/admin/ExtractFields

TRANSFORMS-<class> = <"transform name","transform name",...> {see transforms.conf.spec}
        * Splunk configures classes of regular expressions for each event.  
        * For each class, Splunk takes the configuration from the highest precedence configuration block
        (see precedence rules at the beginning of this file).
        * If a particular class is specified for a source, it will override the same class if it is 
        specified for a sourcetype. 
        * Similarly, if a particular class is specified in the local bundle for a sourcetype, it will 
        override that class for the default bundle for that sourcetype.
 
        * The following is an example TRANSFORMS class in the default bundle for
        all sourcetypes:

                TRANSFORMS-annotation = filetype,loglevel,os,browser,language,ip,email,url

# Report configuration

REPORT-<class> = <"transform name","transform name",...> {see transforms.conf.spec}
        * Like TRANSFORMS, this configures extractions, but only those which should be run at report time. 
        * TRANSFORMS are not run at report time, only at index time.

KV_MODE = <none/auto/multi>
        * Specifies the key/value extraction mode for the data. 
        * Set KV_MODE to:
          -- "none" if you want no key/value extraction to take place.
          -- "auto" to extract key/value pairs separated by equal signs.
          -- "multi" to invoke multikv to expand a tabular event into multiple events.
        * Defaults to auto.

# Sourcetype configuration

sourcetype = <string>
        * Can only be set for a [source::<source>] stanza.
        * Anything from that <source> is assigned the specified sourcetype.
        * Defaults to empty.
    

# The following attribute/value pairs can only be set for a stanza that begins with [<sourcetype>]:

invalid_cause = <string>
        * Can only be set for a [sourcetype] stanza.
        * Splunk will not index any data with invalid_cause set.
        * Set <string> to "archive" to send the file to the archive processor (specified in unarchive_cmd).
        * Set to any other string to throw an error in the splunkd.log if running Splunklogger in debug mode.
        * Defaults to empty.
        
is_valid = <true/false>
        * Automatically set by invalid_cause.
        * DO NOT SET THIS.
        * Defaults to true.

unarchive_cmd = <string>
        * Only called if invalid_cause is set to "archive".
        * <string> specifies the shell command to run to extract an archived source.
        * Must be a shell command that takes input on stdin and produces output on stdout.
        * DOES NOT WORK ON BATCH PROCESSED FILES. Use preprocessing_script.
        * Defaults to empty.

preprocessing_script = <string>
        * Can only be set for a [sourcetype] stanza.
        * For batch processing, run a preprocessing script on the data stream using the binary found in
        $SPLUNK_HOME/bin.
        * DOES NOT WORK ON TAILING.  Use unarchive_cmd.
        * Defaults to empty.

LEARN_MODEL = <true/false>
        * For known sourcetypes, the fileclassifier will add a model file to the learned bundle.
        * To disable this behavior for diverse sourcetypes (such as sourcecode, where there is no good
        exemplar to make a sourcetype) set LEARN_MODEL = false.
        * Defaults to empty.

maxDist = <int>
        * Determines how different a sourcetype model may be from the current file.  
        * The larger the value, the more forgiving.
        * For example, if the value is very small (e.g., 10), then files of the specified 
        sourcetype should not vary much.
        * A larger value indicates that files of the given sourcetype vary quite a bit.
        * Defaults to 300.
    

# rule:: and delayedrule:: configuration

MORE_THAN<optional_unique_value>_<number> = <regular expression> (empty)
LESS_THAN<optional_unique_value>_<number> = <regular expression> (empty)

An example attribute value would be:

           [rule::bar_some]
           sourcetype = source_with_lots_of_bars
           # if more than 80% of lines have "----", but less than 70% have "####"
           # declare this a "source_with_lots_of_bars"
           MORE_THAN_80 = ----
           LESS_THAN_70 = ####

     A rule can have many MORE_THAN and LESS_THAN patterns, and all
     are required for the rule to match.
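
     The way MORE_THAN and LESS_THAN patterns combine can be sketched in Python.
     This is only a model of the rule described above, not Splunk's classifier:
     the function names percent_matching and rule_matches are hypothetical, and
     the sketch assumes the percentages are computed over all lines supplied
     and compared strictly.

```python
import re

def percent_matching(lines, pattern):
    """Return the percentage of lines in which the regex is found."""
    if not lines:
        return 0.0
    regex = re.compile(pattern)
    hits = sum(1 for line in lines if regex.search(line))
    return 100.0 * hits / len(lines)

def rule_matches(lines, more_than, less_than):
    """Check every (threshold, pattern) pair; all must hold for a match."""
    for threshold, pattern in more_than:
        if not percent_matching(lines, pattern) > threshold:
            return False
    for threshold, pattern in less_than:
        if not percent_matching(lines, pattern) < threshold:
            return False
    return True

# Mirrors [rule::bar_some]: MORE_THAN_80 = ---- and LESS_THAN_70 = ####
sample = ["---- header"] * 9 + ["plain text"]
print(rule_matches(sample, [(80, "----")], [(70, "####")]))
```

     Here 90% of the sample lines contain "----" and none contain "####", so
     both conditions hold and the rule matches.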

# Segmentation configuration

SEGMENTATION = <string>
        * Specifies the segmenter from segmenters.conf to use at index time.
        * You can set segmentation for any of the <spec> outlined at the top of this file.

SEGMENTATION-<segment selection> = <string>
        * Specifies that SplunkWeb should use a specific segmenter for the given <segment selection>
        choice. 
        * Example segment selection choices are: all, inner, outer, raw.
        
        

# Binary file configuration

NO_BINARY_CHECK = <bool>
        * When set to true, Splunk will process binary files.
        * By default, binary files are ignored.
        * Defaults to false.
    
    

# File checksum configuration

CHECK_METHOD = <entire_md5, modtime>
        * By default, if the checksums of the first and last 256 bytes of a file match existing stored 
        checksums, Splunk lists the file as already indexed and thus ignores it.
        * Set this to "entire_md5" to use the checksum of the entire file.
        * Alternatively, set this to "modtime" to check only the modification time of the file.
        * Defaults to endpoint_md5.
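
The difference between the default endpoint check and entire_md5 can be illustrated with a short Python sketch. It is not Splunk's implementation: the function names are hypothetical, and hashing the first and last 256 bytes separately with MD5 is an assumption suggested by the name endpoint_md5.

```python
import hashlib

def endpoint_checksums(data, size=256):
    """Return MD5 digests of the first and last `size` bytes of the data."""
    head = hashlib.md5(data[:size]).hexdigest()
    tail = hashlib.md5(data[-size:]).hexdigest()
    return head, tail

def entire_checksum(data):
    """Return the MD5 digest of the whole content."""
    return hashlib.md5(data).hexdigest()

# Two files that differ only in the middle.
file_a = b"A" * 256 + b"x" * 100 + b"B" * 256
file_b = b"A" * 256 + b"y" * 100 + b"B" * 256
print(endpoint_checksums(file_a) == endpoint_checksums(file_b))
print(entire_checksum(file_a) == entire_checksum(file_b))
```

Under these assumptions the two files look identical to the endpoint check but differ under a whole-file checksum, which is the case entire_md5 is meant to catch.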
    
    

    
     
# Internal settings

# NOT YOURS.  DO NOT SET.

_actions = <string> ("new,edit,delete")
   * Internal field used for user-interface control of objects.
   * Defaults to "new,edit,delete".

pulldown_type = <bool>
   * Internal field used for user-interface control of sourcetypes.
   * Defaults to empty.

props.conf.example

# Copyright (C) 2005-2007 Splunk Inc.  All Rights Reserved.  Version 3.0 
#

# The following are example props.conf configurations.
# To use one or more of these configurations, copy the configuration block into
# props.conf in $SPLUNK_HOME/etc/bundles/local/ (or your own custom bundle).

########
# Line merging settings
########

# The following example will linemerge source data into multi-line events for apache_error sourcetype.

[apache_error]
SHOULD_LINEMERGE = True

########
# Settings for tuning
########

# The following example limits the number of characters indexed per event from host::small_events.

[host::small_events]
TRUNCATE = 256

# The following example turns off DATETIME_CONFIG (which can speed up indexing) for any source path
# that ends in /mylogs/*.log.

[source::.../mylogs/*.log]
DATETIME_CONFIG = NONE

  
########
# Timestamp extraction configuration
########

# The following example sets Eastern Time Zone if host matches nyc*.

[host::nyc*]
# from 2007 onward
TZ = EST-5EDT,M3.2.0,M11.1.0
# 2006 and before:
# TZ = EST-5EDT,M4.1.0/02:00:00,M10.5.0/02:00:00

# The following example uses a custom datetime.xml that has been created and placed in a custom bundle.
# This will set all events coming in from hosts starting with LA to use this custom file.

[host::LA*]
DATETIME_CONFIG = /etc/bundles/custom_time/datetime.xml

########
# Transform configuration
########

# The following example will create a search field for host::foo if tied to a stanza in transforms.conf.

[host::foo]
TRANSFORMS-foo=foobar

# The following example will create an extracted field for sourcetype access_combined
# if tied to a stanza in transforms.conf.

[access_combined]
REPORT-baz = foobaz

########
# Sourcetype configuration
########

# The following example sets a sourcetype for the file web_access.log.

[source::.../web_access.log]
sourcetype = splunk_web_access 

# The following example will decompress gzipped syslog files.

[syslog]
invalid_cause = archive
unarchive_cmd = gzip -cd -
        

# The following example learns a custom sourcetype and limits the range between different examples
# with a smaller than default maxDist.

[custom_sourcetype]
LEARN_MODEL = true
maxDist = 30

# rule:: and delayedrule:: configuration
# The following examples create sourcetype rules for custom sourcetypes with custom regex.

[rule::bar_some]
sourcetype = source_with_lots_of_bars
MORE_THAN_80 = ----

[delayedrule::baz_some]
sourcetype = my_sourcetype
LESS_THAN_70 = ####

########        
# File configuration
########

# Binary file configuration
# The following example will process binary files from host::sourcecode.

[host::sourcecode]
NO_BINARY_CHECK = true 
    

# File checksum configuration
# The following example will check the entirety of every file in the web_access dir rather than 
# skipping files that appear to be the same.

[source::.../web_access/*]
CHECK_METHOD = entire_md5