Documentation:
3.2.3
props.conf controls what parameters apply to events during indexing based on settings tied to each event's source, host, or sourcetype.
NOTE: You can use wildcards in your props.conf <spec>, much the same as in inputs.conf, but only for host or source. Use ... for paths and * for files.
# This file contains possible attribute/value pairs for configuring Splunk's processing properties
# via props.conf.
#
# There is a props.conf in $SPLUNK_HOME/etc/bundles/default/. To set custom configurations,
# place a props.conf in your own custom bundle directory.
#
# For help creating a bundle directory, or to learn more about bundles (including bundle precedence)
# please see the documentation located at http://www.splunk.com/doc/latest/admin/bundleconfig.
[<spec>]
* This stanza enables properties for a given <spec>.
* A props.conf file can contain multiple stanzas with different <specs>.
* Follow this stanza name with any number of the following attribute/value pairs.
* If you do not set an attribute for a given <spec>, the default is used.
<spec> can be:
1. <sourcetype>, the sourcetype of an event.
2. host::<host>, where <host> is the host for an event.
3. eventtype::<eventtype> where <eventtype> is any valid event type that is either pre-defined in
Splunk or defined in eventtypes.conf.
NOTE: eventtype can only be used as a spec for creating extracted fields with REPORT<class>.
4. source::<source>, where <source> is the source for an event.
5. rule::<rulename>, where <rulename> is a unique name of a sourcetype classification rule.
6. delayedrule::<rulename>, where <rulename> is a unique name of a
delayed sourcetype classification rule. These are only considered
as a last resort before generating a new sourcetype based on the
source seen.
NOTE: When specifying a <spec>, you can use the following regex-type syntax:
... = will recurse through directories until the match is met.
* = matches anything but / 0 or more times.
| = or
( ) = used to limit scope of |.
Example: [source::....(?<!tar.)(gz|tgz)]
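As an illustration of the syntax above, the following sketch shows two stanzas that use ... and *. The paths and attribute values are hypothetical, chosen only to show where each wildcard fits; they are not taken from any default bundle.

```ini
# Hypothetical: files such as access.log that sit directly inside a
# directory named "myapp", wherever that directory is located.
# "..." recurses through directories; "*" does not cross a "/".
[source::.../myapp/*.log]
SHOULD_LINEMERGE = false

# Hypothetical: every source under /opt/app/logs, however deeply nested.
[source::/opt/app/logs/...]
TRUNCATE = 20000
```

Read with ordinary regex rules, the Example line above matches sources ending in gz or tgz unless the text immediately before is "tar" plus one character, which appears intended to skip .tar.gz archives.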
#******************************************************************************
# The possible attributes/value pairs for props.conf, and their default values, are:
#******************************************************************************
# International characters
CHARSET = <string>
* When set, Splunk will assume the input from the given <spec> is in the specified encoding.
* A list of valid encodings can be retrieved using the command "iconv -l" on most *nix systems.
* If an invalid encoding is specified, a warning is logged during initial configuration
and further input from that <spec> is discarded.
* If the source encoding is valid, but some characters from the <spec> are not valid in the
specified encoding, then the characters will be escaped as hex (e.g. "\xF3").
* Defaults to ASCII.
* When set to "AUTO", Splunk will attempt to automatically determine the character encoding and
convert text from that encoding to UTF-8.
* For a complete list of the roughly 20 character sets that Splunk detects automatically, see the
online documentation.
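A minimal sketch of the two ways CHARSET can be set, assuming a hypothetical source path and host naming scheme. ISO-8859-1 is used only as an example of a name that "iconv -l" typically lists.

```ini
# Hypothetical: legacy files known to be written in Latin-1.
[source::.../legacy/*.log]
CHARSET = ISO-8859-1

# Hypothetical: hosts whose encoding is not known in advance.
# Splunk attempts detection and converts the text to UTF-8.
[host::intl-*]
CHARSET = AUTO
```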
#******************************************************************************
# Line breaking
#******************************************************************************
TRUNCATE = <non-negative integer>
* Sets the maximum line length; longer lines are truncated.
* Set to 0 to disable truncation entirely. Very long lines, however, are often a sign of
garbage data.
* Defaults to 10000.
LINE_BREAKER = <regular expression>
* If not set, the raw stream will be broken into an event for each line delimited by \r or \n.
* If set, the given regex will be used to break the raw stream into events.
* The regex must contain a capturing group.
* Wherever the regex matches, the start of the first group marks the end of the previous event:
it is the first text NOT in the previous event.
* The end of the first group marks the end of the delimiter, and the next
character is the beginning of the next event.
* For example, "LINE_BREAKER = ([\r\n]+)" is equivalent to the default rule.
* The contents of the first matching group will not occur in either the previous or next events.
* NOTE: There is a significant speed boost by using the LINE_BREAKER to delimit multiline events
rather than using line merging to reassemble individual lines into events.
LINE_BREAKER_LOOKBEHIND = <integer>
* Change the default lookbehind for the regex-based line breaker.
* When there is leftover data from a previous raw chunk, this is how far before the end of
that raw chunk (with the next chunk concatenated) Splunk begins applying the regex.
* Defaults to 100.
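To make the LINE_BREAKER group semantics concrete, here is a minimal sketch. It assumes a hypothetical sourcetype, my_multiline_log, whose events each begin with a date such as 2008-04-23 and may span several lines. It also sets SHOULD_LINEMERGE, which is described later in this file.

```ini
[my_multiline_log]
# Break only where one or more newlines are followed by a date.
# The newlines form the first group, so they appear in neither event.
# The date lies outside the group and becomes the start of the next event.
LINE_BREAKER = ([\r\n]+)\d{4}-\d{2}-\d{2}
# The breaker already yields whole events, so line merging is turned off.
SHOULD_LINEMERGE = false
```

Continuation lines that do not start with a date stay inside the current event, because the regex does not match at those newlines.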
# Multiline events
SHOULD_LINEMERGE = <true/false>
* When set to true, Splunk combines several input lines into a single event, based on the
following configuration attributes.
* Defaults to true.
# The following are used only when SHOULD_LINEMERGE = True
AUTO_LINEMERGE = <true/false>
* Directs Splunk to use automatic learning methods to determine where to break lines in events.
* Defaults to true.
BREAK_ONLY_BEFORE_DATE = <true/false>
* When set to true, Splunk will create a new event if and only if it encounters
a new line with a date.
* Defaults to false.
BREAK_ONLY_BEFORE = <regular expression>
* When set, Splunk will create a new event if and only if it encounters
a new line that matches the regular expression.
* Defaults to empty.
MUST_BREAK_AFTER = <regular expression>
* When set, and the regular expression matches the current line,
Splunk is guaranteed to create a new event for the next input line.
* Splunk may still break before the current line if another rule matches.
* Defaults to empty.
MUST_NOT_BREAK_AFTER = <regular expression>
* When set and the current line matches the regular expression, Splunk will
not break on any subsequent lines until the MUST_BREAK_AFTER expression
matches.
* Defaults to empty.
MUST_NOT_BREAK_BEFORE = <regular expression>
* When set and the current line matches the regular expression, Splunk will not break the last
event before the current line.
* Defaults to empty.
MAX_EVENTS = <integer>
* Specifies the maximum number of input lines that will be added to any event.
* Splunk will break after the specified number of lines are read.
* Defaults to 256.
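The line-merging attributes above might be combined as follows. This is a sketch under stated assumptions: my_java_log is a hypothetical sourcetype in which each event starts with a date and may be followed by a long stack trace, and the MAX_EVENTS value is arbitrary.

```ini
[my_java_log]
SHOULD_LINEMERGE = true
# Start a new event if and only if a line begins with a date like 2008-04-23.
# Stack-trace lines do not match, so they are merged into the current event.
BREAK_ONLY_BEFORE = ^\d{4}-\d{2}-\d{2}
# Raise the line limit so long stack traces are not split at 256 lines.
MAX_EVENTS = 1000
```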
#******************************************************************************
# Timestamp extraction configuration
#******************************************************************************
DATETIME_CONFIG = <filename relative to $SPLUNK_HOME>
* Specifies the file to configure the timestamp extractor.
* This configuration may also be set to "NONE" to prevent the timestamp extractor from running
or "CURRENT" to assign the current system time to each event.
* Defaults to /etc/datetime.xml (that is, $SPLUNK_HOME/etc/datetime.xml).
MAX_TIMESTAMP_LOOKAHEAD = <integer>
* Specifies how far (in characters) into an event Splunk should look for a timestamp.
* Defaults to 150.
TIME_PREFIX = <regular expression>
* Specifies the necessary condition for a timestamp to be extracted.
* The timestamping algorithm will only look for a timestamp after the first regex match.
* Defaults to empty.
TIME_FORMAT = <strptime-style format>
* Specifies a strptime format string to extract the date.
* This method of date extraction does not support in-event timezones.
* TIME_FORMAT starts reading after the TIME_PREFIX.
* The <strptime-style format> must contain the hour, minute, month, and day.
* Defaults to empty.
TZ = <posix timezone string>
* The algorithm for determining the time zone for a particular event is as follows:
- If the event has a timezone in its raw text (e.g., UTC, -08:00), use that.
- If TZ is set to a valid timezone string, use that.
- Otherwise, use the timezone of the system that is running splunkd.
* Defaults to empty.
MAX_DAYS_AGO = <integer>
* Specifies the maximum number of days in the past, from the current date, for an extracted date to be valid.
* If set to 10, for example, dates that are older than 10 days ago are ignored.
* Defaults to 1000.
* IMPORTANT: If your data is older than 1000 days, you must change this setting.
MAX_DAYS_HENCE = <integer>
* Specifies the maximum number of days in the future, from the current date,
for an extracted date to be valid.
* If set to 3, for example, dates that are more than 3 days in the future will be ignored.
* False positives are less likely with a tighter window.
* The default value allows dates that are tomorrow.
* If your machines have the wrong date set or are in a timezone that is one day ahead,
increase this value to at least 3.
* Defaults to 2.
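The timestamp attributes above could be used together as in the sketch below. It assumes a hypothetical sourcetype, my_bracketed_log, whose events look like "[2008-04-23 09:07:00] message text". The numeric values are illustrative, not recommendations.

```ini
[my_bracketed_log]
# Only look for a timestamp after the opening bracket at the start of the event.
TIME_PREFIX = ^\[
# strptime format for "2008-04-23 09:07:00"; it includes the hour, minute,
# month, and day, as TIME_FORMAT requires.
TIME_FORMAT = %Y-%m-%d %H:%M:%S
# The timestamp sits at the very start of the event, so a short lookahead suffices.
MAX_TIMESTAMP_LOOKAHEAD = 30
# Accept archived data up to roughly ten years old (the default is 1000 days).
MAX_DAYS_AGO = 3650
```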
#******************************************************************************
# Transform configuration
#******************************************************************************
# You can use TRANSFORMS or REPORT to create extracted fields.
# Please note that TRANSFORMS should only be used when speed is of the essence
# and the extraction rule will not change. REPORT allows for more flexibility.
# For more information, see documentation at: http://www.splunk.com/doc/latest/admin/ExtractFields
TRANSFORMS<class> = <"transform name","transform name 2",...> {see transforms.conf.spec}
* Splunk configures classes of regular expressions for each event.
* For each class, Splunk takes the configuration from the highest precedence configuration block
(see precedence rules at the beginning of this file).
* If a particular class is specified for a source, it will override the same class if it is
specified for a sourcetype.
* Similarly, if a particular class is specified in the local bundle for a sourcetype, it will
override that class for the default bundle for that sourcetype.
* The following is an example TRANSFORMS class in the default bundle for all sourcetypes:
TRANSFORMS-annotation = filetype,loglevel,os,browser,language,ip,email,url
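The override rule for classes can be sketched as follows. The sourcetype name and source path are hypothetical; the transform names are borrowed from the example line above.

```ini
# Sourcetype-level setting for the "annotation" class.
[my_app_log]
TRANSFORMS-annotation = loglevel,ip,url

# Source-level setting for the same class. For events from this source it
# overrides the sourcetype-level list above.
[source::.../myapp/debug.log]
TRANSFORMS-annotation = loglevel
```

Under the stated precedence, events from debug.log would get only the loglevel transform for this class, while other events of sourcetype my_app_log would get all three.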
# Report configuration
REPORT<class> = <"transform name","transform name",...> {see transforms.conf.spec}
* Like TRANSFORMS, this configures extractions, but they run at search time.
* TRANSFORMS are not run at search time, only at index time.
KV_MODE = <none/auto/multi>
* Specifies the key/value extraction mode for the data.
* Set KV_MODE to one of the following:
- "none" if you want no key/value extraction to take place.
- "auto" extracts key/value pairs separated by equal signs.
- "multi" invokes multikv to expand a tabular event into multiple events.
* Defaults to auto.
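A short sketch of search-time settings, with every name hypothetical. The hyphenated form REPORT-fields is assumed by analogy with the TRANSFORMS-annotation example, and my_app_fields stands for a transform that would have to be defined in transforms.conf.

```ini
# Search-time field extraction through a REPORT class.
[my_app_log]
REPORT-fields = my_app_fields

# Tabular output: expand each table into multiple events with multikv.
[my_table_output]
KV_MODE = multi
```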
Comments
Here's a trick for making log4j's SyslogAppender work remotely with multi-line events. On the sender side:
log4j.appender.MySyslog=org.apache.log4j.net.SyslogAppender
log4j.appender.MySyslog.SyslogHost=<splunk host>
log4j.appender.MySyslog.layout=org.apache.log4j.PatternLayout
log4j.appender.MySyslog.layout.ConversionPattern=<uniquely identifiable pattern>
For the uniquely identifiable pattern, you could use something blatant like "LOG4J_EVENT_COMING_RIGHT_UP", or a more useful pattern like %p %t - %d{ISO8601} - %m%n
On the Splunk side, create a stanza that sets BREAK_ONLY_BEFORE to break on your <uniquely identifiable pattern>, for example:
[my_log4j]
pulldown_type = true
maxDist = 75
# get rid of syslog headers
TRANSFORMS = syslog-header-stripper-ts-host
# even though it's a default, make sure you set linemerge in case something attempts to override
SHOULD_LINEMERGE = true
BREAK_ONLY_BEFORE = <regex that'll match your uniquely identifiable pattern>
The regex could be .*LOG4J_EVENT_COMING_RIGHT_UP.* in the overt case, or ^.*-\s\d\d\d\d-\d\d-\d\d\s\d\d:\d\d:\d\d,\d\d\d\s-.*$ in the ISO date case.
Posted by xoopit on Apr 23 2008, 9:07am