
Not when it comes to events. Directing Splunk on how and where to chunk your data into events can save you heartache and make Splunk more efficient. As line merging is often the slowest part of the parsing queue, it may be worth spending the extra minutes to tune the configuration for identifying event boundaries.
Consider this: Splunk is capable of recognizing multiple lines in a file as a single event. For example, Splunk will intelligently read a verbose Java stack trace as one event containing many lines. It operates on the premise that events may contain more than one line. Well, that’s great, but what if your particular data source only writes events as a single line? In this case, turning off the default line merging behavior will save Splunk from doing extra work.
To do this, set the SHOULD_LINEMERGE parameter to false in props.conf:
[mydatasource]
SHOULD_LINEMERGE = false
Ok. Very well. What if you already know Splunk can handle multi-line events, but it’s just not behaving like it can on your data? In this case, add rules to tell it exactly where to start and end an event.
To do this, many options are available in props.conf:
LINE_BREAKER = <regular expression>
BREAK_ONLY_BEFORE_DATE = true | false
BREAK_ONLY_BEFORE = <regular expression>
MUST_BREAK_AFTER = <regular expression>
MUST_NOT_BREAK_AFTER = <regular expression>
MUST_NOT_BREAK_BEFORE = <regular expression>
MAX_EVENTS = <integer>
These options are all documented, with examples, in $SPLUNK_HOME/etc/system/README/props.conf.spec and in the online Splunk documentation.
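As a rough illustration, here is a minimal sketch of how a few of these settings might be combined for a multi-line source such as a Java application log. The stanza names (my_java_log, my_java_log_fast) and the assumption that every new event starts with a YYYY-MM-DD timestamp are hypothetical; adjust the regular expressions to match your own data.

```
# Hypothetical stanza: keep line merging on, but only start a new
# event on a line that begins with a date such as 2012-03-14.
[my_java_log]
SHOULD_LINEMERGE = true
BREAK_ONLY_BEFORE = ^\d{4}-\d{2}-\d{2}
# Allow long stack traces; the value 500 is an arbitrary example.
MAX_EVENTS = 500

# Hypothetical alternative: turn line merging off and let LINE_BREAKER
# do all the work. The first capture group is the text between events
# and is discarded; the lookahead keeps the date in the next event.
[my_java_log_fast]
SHOULD_LINEMERGE = false
LINE_BREAKER = ([\r\n]+)(?=\d{4}-\d{2}-\d{2})
```

The second form skips the line merging step entirely, so it may be the cheaper option when a single regular expression can reliably identify where each event begins.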
It’s important to get the event boundaries correct so you can have a happier Splunk with more accurate event counts/statistics and smarter search capabilities.
----------------------------------------------------
Thanks!
Vi Ly