Documentation: 3.2.3
This page last updated: 04/21/08 04:04pm

Configure event boundaries

Many event logs have a strict one-line-per-event format, but some do not. Usually, Splunk can figure out where event boundaries are automatically. However, if event boundary recognition is not working as desired, set custom rules by configuring props.conf.

Configuration

To configure multi-line events, examine the format of the events. Determine a pattern in the events to set as the start or end of an event. Then, edit $SPLUNK_HOME/etc/bundles/local/props.conf, and set the necessary attributes for your data handling.

There are two ways to handle multi-line events:

1) Break the event stream into real events. This is recommended, as it increases indexing speed significantly. Use LINE_BREAKER (see below).

2) Break the event stream into lines, and reassemble. This is slower, but affords more robust configuration options. Use any line-breaking attribute besides LINE_BREAKER (see below).

The following line-breaking attributes are described in $SPLUNK_HOME/etc/bundles/README/props.conf.spec:

TRUNCATE = <non-negative integer>
    * Change the default maximum line length.
    * Set to 0 if you never want truncation (very long lines are, however, often a sign of
      garbage data).
    * Defaults to 10000.

LINE_BREAKER = <regular expression>
    * If not set, the raw stream will be broken into an event for each line delimited by \r or \n.
    * If set, the given regex will be used to break the raw stream into events.
        * The regex must contain a matching group.
        * Wherever the regex matches, the start of the first matched group is considered the first text NOT in the
          previous event.
        * The end of the first matched group is considered the end of the delimiter and the next
          character is considered the beginning of the next event.
    * For example, "LINE_BREAKER = ([\r\n]+)" is equivalent to the default rule.
    * The contents of the first matching group will not occur in either the previous or next events.
    * NOTE: There is a significant speed boost by using the LINE_BREAKER to delimit multiline events
      rather than using line merging to reassemble individual lines into events.

LINE_BREAKER_LOOKBEHIND = <integer> (100)
    * Change the default lookbehind for the regex-based line breaker.
    * When there is leftover data from a previous raw chunk, this is how far before the end of
      the raw chunk (with the next chunk concatenated) we should begin applying
      the regex.

SHOULD_LINEMERGE = <true/false>
    * When set to true, Splunk combines several input lines into a single event, based on the
      following configuration attributes.
    * Defaults to true.
      
# The following are used only when SHOULD_LINEMERGE = True

AUTO_LINEMERGE = <true/false>
    * Directs Splunk to use automatic learning methods to determine where to break lines in events.
    * Defaults to true.

BREAK_ONLY_BEFORE_DATE = <true/false>
    * When set to true, Splunk will create a new event if and only if it encounters
      a new line with a date.
    * Defaults to false.

BREAK_ONLY_BEFORE = <regular expression>
    * When set, Splunk will create a new event if and only if it encounters
      a new line that matches the regular expression.
    * Defaults to empty.

MUST_BREAK_AFTER = <regular expression>
    * When set, and the regular expression matches the current line,
      Splunk is guaranteed to create a new event for the next input line.
    * Splunk may still break before the current line if another rule matches.
    * Defaults to empty.

MUST_NOT_BREAK_AFTER = <regular expression>
    * When set and the current line matches the regular expression, Splunk will
      not break on any subsequent lines until the MUST_BREAK_AFTER expression
      matches.
    * Defaults to empty.

MUST_NOT_BREAK_BEFORE = <regular expression>
    * When set and the current line matches the regular expression, Splunk will not break the last
      event before the current line.
    * Defaults to empty.

MAX_EVENTS = <integer>
    * Specifies the maximum number of input lines that will be added to any event.
    * Splunk will break after the specified number of lines are read.
    * Defaults to 256.
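
The LINE_BREAKER rule described above can be tried on sample data before editing props.conf. The following is a minimal Python sketch of that rule as the spec text words it, not Splunk's implementation. The helper name break_events is hypothetical, Python's re module is assumed to behave like Splunk's regex engine for simple patterns, and chunking and LINE_BREAKER_LOOKBEHIND are ignored.

```python
import re

def break_events(raw, line_breaker=r"([\r\n]+)"):
    """Split raw text into events following the LINE_BREAKER description."""
    pattern = re.compile(line_breaker)
    if pattern.groups < 1:
        raise ValueError("LINE_BREAKER regex must contain a capturing group")
    events = []
    start = 0  # index where the current event begins
    for match in pattern.finditer(raw):
        if match.start(1) < 0:
            continue  # group 1 did not take part in this match
        # Text up to the start of group 1 closes the previous event.
        events.append(raw[start:match.start(1)])
        # Group 1 itself is dropped; the next event starts right after it.
        start = match.end(1)
    events.append(raw[start:])
    return [event for event in events if event]

# Default rule: one event per line.
default_events = break_events("a\nb\r\n\r\nc")

# Hypothetical rule: break only where a newline is followed by a date,
# so indented continuation lines stay with their event.
raw = "2008-04-21 start\n  detail\n2008-04-22 next\n  more"
dated_events = break_events(raw, r"([\r\n]+)\d{4}-\d{2}-\d{2}")
```

In the second call the date is outside the first group, so it is kept as the beginning of the next event while the newline before it is discarded.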

Examples

[my_custom_sourcetype]
BREAK_ONLY_BEFORE = ^\d+\s*$

This example tells Splunk to treat any line that consists entirely of digits (optionally followed by whitespace) as the start of a new event. It applies to any source whose source type is sourcetype::my_custom_sourcetype, whether that source type was configured explicitly or determined by Splunk.
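
To check which lines a BREAK_ONLY_BEFORE pattern would treat as event starts, the pattern can be run against sample lines. The sketch below is only an approximation in Python: merge_lines is a hypothetical helper, the sample lines are invented, and only the BREAK_ONLY_BEFORE rule is modeled (no timestamps, MAX_EVENTS, or other attributes).

```python
import re

def merge_lines(lines, break_only_before):
    """Group lines into events, starting a new event at each matching line."""
    pattern = re.compile(break_only_before)
    events = []
    for line in lines:
        if not events or pattern.search(line):
            events.append([line])     # matching line opens a new event
        else:
            events[-1].append(line)   # otherwise extend the current event
    return ["\n".join(event) for event in events]

# Invented sample: all-digit lines mark the start of each event.
lines = ["1001", "login ok", "user=alice", "1002   ", "logout", "x 42"]
events = merge_lines(lines, r"^\d+\s*$")
```

The line "x 42" contains digits but does not consist only of digits, so it stays inside the second event.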

Another example:

The following log excerpt contains several lines that belong to the same request. Each new request begins with a line that contains "Path=". The goal is to index all the lines of one request as a single event.

{{"2006-09-21, 02:57:11.58",  122, 11, "Path=/LoginUser Query=CrmId=ClientABC&ContentItemId=TotalAccess&SessionId=3A1785URH117BEA&Ticket=646A1DA4STF896EE&SessionTime=25368&ReturnUrl=http://www.clientabc.com, Method=GET, IP=209.51.249.195, Content=", ""}}
{{"2006-09-21, 02:57:11.60",  122, 15, "UserData:<User CrmId="clientabc" UserId="p12345678"><EntitlementList></EntitlementList></User>", ""}}
{{"2006-09-21, 02:57:11.60",  122, 15, "New Cookie: SessionId=3A1785URH117BEA&Ticket=646A1DA4STF896EE&CrmId=clientabc&UserId=p12345678&AccountId=&AgentHost=man&AgentId=man, MANUser: Version=1&Name=&Debit=&Credit=&AccessTime=&BillDay=&Status=&Language=&Country=&Email=&EmailNotify=&Pin=&PinPayment=&PinAmount=&PinPG=&PinPGRate=&PinMenu=&", ""}}

To index this multi-line event properly, use the Path differentiator in your configuration. Add the following to your $SPLUNK_HOME/etc/bundles/local/props.conf:

[source::source-to-break]
SHOULD_LINEMERGE = True
BREAK_ONLY_BEFORE = Path=

This configuration tells Splunk to merge the lines of the event, and to start a new event only before a line that contains Path=.
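
The effect of this stanza, together with the MAX_EVENTS limit from the attribute list, can be approximated in Python. This is a sketch under stated assumptions rather than Splunk's actual merging logic: merge_lines is a hypothetical helper, the first three sample lines are shortened stand-ins for the log lines above, and the fourth line is an invented second request.

```python
import re

def merge_lines(lines, break_only_before, max_events=256):
    """Merge lines into events; break before matches or at the line cap."""
    pattern = re.compile(break_only_before)
    events = []
    for line in lines:
        starts_new = (
            not events
            or pattern.search(line) is not None
            or len(events[-1]) >= max_events  # MAX_EVENTS lines already read
        )
        if starts_new:
            events.append([line])
        else:
            events[-1].append(line)
    return events

lines = [
    '{{"2006-09-21, 02:57:11.58", 122, 11, "Path=/LoginUser Method=GET", ""}}',
    '{{"2006-09-21, 02:57:11.60", 122, 15, "UserData:<User CrmId=clientabc>", ""}}',
    '{{"2006-09-21, 02:57:11.60", 122, 15, "New Cookie: SessionId=3A1785URH117BEA", ""}}',
    # Invented second request, not part of the original log excerpt.
    '{{"2006-09-21, 02:58:03.10", 122, 11, "Path=/LogoutUser Method=GET", ""}}',
]

events = merge_lines(lines, "Path=")
capped = merge_lines(lines, "Path=", max_events=2)
```

With the default cap the three lines of the first request form one event. With a cap of 2 the third line is forced into an event of its own, which illustrates why a low MAX_EVENTS value can split long events.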
