This documentation applies to the following versions of Splunk: 4.0, 4.0.1, 4.0.2, 4.0.3, 4.0.4, 4.0.5, 4.0.6
Many event logs have a strict one-line-per-event format, but some do not. Usually, Splunk can determine event boundaries automatically. However, if event boundary recognition is not working as desired, set custom rules by configuring props.conf.
To configure multi-line events, examine the format of the events. Determine a pattern in the events to set as the start or end of an event. Then, edit $SPLUNK_HOME/etc/system/local/props.conf, and set the necessary attributes for your data handling.
There are two ways to handle multi-line events:
1) Break the event stream into real events. This is recommended, as it increases indexing speed significantly. Use LINE_BREAKER (see below).
2) Break the event stream into lines, and reassemble. This is slower, but affords more robust configuration options. Use any line-breaking attribute besides LINE_BREAKER (see below).
Here are possible attributes to set for line-breaking rules from $SPLUNK_HOME/etc/system/README/props.conf.spec:
TRUNCATE = <non-negative integer>
LINE_BREAKER = <regular expression>
LINE_BREAKER_LOOKBEHIND = <integer> (default: 100)
SHOULD_LINEMERGE = <true/false>
The following are used only when SHOULD_LINEMERGE = True:
AUTO_LINEMERGE = <true/false>
BREAK_ONLY_BEFORE_DATE = <true/false>
BREAK_ONLY_BEFORE = <regular expression>
MUST_BREAK_AFTER = <regular expression>
MUST_NOT_BREAK_AFTER = <regular expression>
MUST_NOT_BREAK_BEFORE = <regular expression>
MAX_EVENTS = <integer>
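As a rough sketch of the first method described above, the following stanza breaks the stream directly into events with LINE_BREAKER and turns line merging off. The source type name and the assumption that every event starts with a date of the form YYYY-MM-DD are hypothetical; adjust the pattern to your data. The sketch relies on the rule in props.conf.spec that the regular expression must contain a capturing group, and that the text matched by the first group is treated as the delimiter between events.

```ini
# Hypothetical source type; events are assumed to start with a YYYY-MM-DD date.
[my_dated_sourcetype]
# Do not reassemble lines afterwards; LINE_BREAKER alone defines the events.
SHOULD_LINEMERGE = false
# Break at newlines that are followed by a date. The newlines matched by the
# capturing group are the delimiter and are not part of either event.
LINE_BREAKER = ([\r\n]+)(?=\d{4}-\d{2}-\d{2})
```

Lines that do not begin with a date stay attached to the preceding event, because no break is made in front of them.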
[my_custom_sourcetype]
BREAK_ONLY_BEFORE = ^\d+\s*$
This example instructs Splunk to divide events in a file or stream by treating any line that consists only of digits (optionally followed by whitespace) as the start of a new event. It applies to any source whose source type was configured, or determined by Splunk, to be sourcetype::my_custom_sourcetype.
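If events of this source type can run long, the limits in the attribute list may also need adjusting. The stanza below is a minimal sketch that repeats the same break rule and raises MAX_EVENTS (the number of lines that can be merged into one event) and TRUNCATE (the maximum line length in bytes). The numbers are illustrative assumptions, not recommended values.

```ini
# Hypothetical limits; the numbers are illustrative, not recommendations.
[my_custom_sourcetype]
SHOULD_LINEMERGE = true
BREAK_ONLY_BEFORE = ^\d+\s*$
# Allow up to 512 lines to be merged into a single event.
MAX_EVENTS = 512
# Allow lines of up to 20000 bytes before they are truncated.
TRUNCATE = 20000
```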
Another example:
The following log event contains several lines that are part of the same request. The differentiator between requests is "Path". The customer would like all these lines shown as one event entry.
{{"2006-09-21, 02:57:11.58", 122, 11, "Path=/LoginUser Query=CrmId=ClientABC&ContentItemId=TotalAccess&SessionId=3A1785URH117BEA&Ticket=646A1DA4STF896EE&SessionTime=25368&ReturnUrl=http://www.clientabc.com, Method=GET, IP=209.51.249.195, Content=", ""}}
{{"2006-09-21, 02:57:11.60", 122, 15, "UserData:<User CrmId="clientabc" UserId="p12345678"><EntitlementList></EntitlementList></User>", ""}}
{{"2006-09-21, 02:57:11.60", 122, 15, "New Cookie: SessionId=3A1785URH117BEA&Ticket=646A1DA4STF896EE&CrmId=clientabc&UserId=p12345678&AccountId=&AgentHost=man&AgentId=man, MANUser: Version=1&Name=&Debit=&Credit=&AccessTime=&BillDay=&Status=&Language=&Country=&Email=&EmailNotify=&Pin=&PinPayment=&PinAmount=&PinPG=&PinPGRate=&PinMenu=&", ""}}
To index this multi-line event properly, use the Path differentiator in your configuration. Add the following to your $SPLUNK_HOME/etc/system/local/props.conf:
[source::source-to-break]
SHOULD_LINEMERGE = True
BREAK_ONLY_BEFORE = Path=
This configuration tells Splunk to merge the lines of the event, and to break only before a line that contains the term Path=.
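The same result can probably be reached with the faster LINE_BREAKER method. The stanza below is an untested sketch of that alternative, not part of the original configuration. It assumes that every request starts on a line containing Path= and that the events are separated by ordinary newlines.

```ini
# Untested sketch: the same source stanza, using LINE_BREAKER instead of
# line merging.
[source::source-to-break]
SHOULD_LINEMERGE = false
# Break at newlines only when the line that follows contains "Path=".
# [^\r\n]* keeps the lookahead from reaching past the end of that line.
LINE_BREAKER = ([\r\n]+)(?=[^\r\n]*Path=)
```

Use one approach or the other for a given source. With SHOULD_LINEMERGE set to false, BREAK_ONLY_BEFORE is not consulted.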