Configure auto-discovery

This documentation does not apply to the most recent version of Splunk.

This documentation applies to the following versions of Splunk: 3.0 , 3.0.1 , 3.0.2 , 3.1 , 3.1.1 , 3.1.2 , 3.1.3 , 3.1.4

You can define your own event types or have Splunk discover and assign event types. Splunk's event type discovery method uses a combination of punctuation characters, source type, and keywords.


Splunk classifies events in the following way:


  • As events are processed in the pipeline, they are classified into event buckets based on the first few punctuation characters and source type.
  • When the event type bucket has a few hundred entries, the entries are scanned to find a common keyword.
  • The bucket is split into two buckets - those that have the keyword and those that do not.
  • Every so often, the most popular buckets are written out as event types.
  • Event type definitions are placed in $SPLUNK_HOME/etc/bundles/learned/eventtypes.conf.
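The bucketing and splitting steps above can be sketched in a few lines of Python. This is an illustration only, not Splunk's implementation: the function names, the way punctuation is extracted, and the rule for choosing the splitting keyword are all assumptions made for the example.

```python
import re
from collections import Counter, defaultdict

def punct_prefix(text, max_format_len=10):
    """Assumed punctuation pattern: drop letters, digits and whitespace,
    then keep only the first max_format_len characters."""
    return re.sub(r"[A-Za-z0-9\s]", "", text)[:max_format_len]

def bucket_events(events, max_format_len=10):
    """Group (sourcetype, text) pairs by source type and punctuation prefix."""
    buckets = defaultdict(list)
    for sourcetype, text in events:
        key = (sourcetype, punct_prefix(text, max_format_len))
        buckets[key].append(text)
    return buckets

def words(text):
    """Lowercased alphabetic terms in an event (hypothetical tokenizer)."""
    return set(re.findall(r"[a-z]+", text.lower()))

def split_on_common_keyword(texts):
    """Split a bucket on the most common keyword that is not in every event."""
    counts = Counter()
    for text in texts:
        counts.update(words(text))
    candidates = [(n, kw) for kw, n in counts.items() if n < len(texts)]
    if not candidates:
        return None, list(texts), []
    _, keyword = max(candidates)
    with_kw = [t for t in texts if keyword in words(t)]
    without_kw = [t for t in texts if keyword not in words(t)]
    return keyword, with_kw, without_kw

events = [
    ("syslog", "sshd[101]: Failed password for root"),
    ("syslog", "sshd[102]: Failed password for admin"),
    ("syslog", "sshd[103]: Accepted password for bob"),
]
buckets = bucket_events(events)
keyword, with_kw, without_kw = split_on_common_keyword(buckets[("syslog", "[]:")])
print(keyword, len(with_kw), len(without_kw))  # failed 2 1
```

All three sample events share the source type and the punctuation prefix `[]:`, so they land in one bucket, which is then split on the keyword `failed`.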

By default, Splunk's event type auto-discovery is tuned low. To discover more event types, create your own event type discovery rules by editing eventdiscoverer.conf. This file contains event classification parameters such as the number of events to process for event discovery, the maximum size of the punctuation pattern to write to event types, and which keywords to process or ignore. To modify an event discovery configuration, edit $SPLUNK_HOME/etc/bundles/local/eventdiscoverer.conf or place a modified eventdiscoverer.conf in a custom bundle.


IMPORTANT: Many of these values will affect search and indexing performance. Try out your configuration in a test environment to make sure you have the best balance of event discovery versus performance.


Configuration

Edit $SPLUNK_HOME/etc/bundles/local/eventdiscoverer.conf. You can override any values in $SPLUNK_HOME/etc/bundles/default/eventdiscoverer.conf.


Here is a list of the attribute/value pairs you can set in eventdiscoverer.conf.


The main values you can change to tune event type discovery:

  • process_every_n_events = <integer 1-inf.>
    • Consider one out of every N events for event type discovery.
    • The larger the value, the faster indexing is, but the more slowly event types are discovered.
    • Defaults to 10000000.
  • learn_every_n_events = <integer 1-inf.>
    • Pause to discover new event types for every N events that are processed.
    • Defaults to 5000.

These two values, process_every_n_events and learn_every_n_events, are the major settings for tuning auto-discovery. For example, if you set process_every_n_events to 1000 and learn_every_n_events to 5, the event typer processes one event out of every 1000 and tries to learn only once for every 5 of those processed events, that is, once per 5000 events overall. To turn event discovery up, set these values to lower numbers. To effectively disable auto-discovery, set them to very large numbers.
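The interaction of the two settings can be worked through with some simple arithmetic. The sketch below assumes the reading given above (one event in every process_every_n_events is processed, and one learning pass runs per learn_every_n_events processed events); the function name is hypothetical.

```python
def discovery_rates(total_events, process_every_n_events, learn_every_n_events):
    """Return (events_processed, learning_passes) for a stream of total_events.

    Assumption: one event in every process_every_n_events is processed, and
    one learning pass runs per learn_every_n_events processed events.
    """
    processed = total_events // process_every_n_events
    learning_passes = processed // learn_every_n_events
    return processed, learning_passes

# The example values from the text, applied to one million indexed events.
tuned = discovery_rates(1_000_000, 1000, 5)
# The default values, applied to the same one million events.
defaults = discovery_rates(1_000_000, 10_000_000, 5000)

print(tuned)     # (1000, 200)
print(defaults)  # (0, 0)
```

Under these assumptions, the default values consider no events at all in the first million, which is consistent with auto-discovery being tuned low by default.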


  • max_format_len = <integer 1-300>
    • Sets the maximum length of the punct attribute to consider when adding event types.
    • The larger the value, the more attention is paid to the structure of events versus keywords.
    • Defaults to 10.

Keyword configuration:

  • use_any_keyword = true/false
    • If set to false, only keywords in the known_keywords list are used for generating event types.
    • Otherwise, all keywords are considered for event type discovery.
    • Defaults to true.
  • ignored_keywords = <comma-separated list of terms>
    • Enter a list of keywords for Splunk to ignore while creating event types.
    • Defaults to the list in $SPLUNK_HOME/etc/bundles/default/eventdiscoverer.conf (see below).


  • known_keywords = <comma-separated list of terms>
    • If use_any_keyword is set to false, Splunk will look at the list of known_keywords for creating event types.
    • No default value.
  • min_events_to_add_keyword = <integer 1-inf.>
    • Only consider a keyword for defining an event type if it occurs in more than N of the events that match that event type.
    • Defaults to 100.
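The way the keyword lists combine can be illustrated as follows. This is a sketch under assumptions, not Splunk's code: the function name is hypothetical, and it assumes that ignored_keywords always takes precedence, which the descriptions above do not state explicitly.

```python
def candidate_keywords(event_keywords, use_any_keyword,
                       known_keywords=(), ignored_keywords=()):
    """Filter an event's keywords down to those eligible for event typing."""
    ignored = set(ignored_keywords)
    known = set(known_keywords)
    eligible = []
    for keyword in event_keywords:
        if keyword in ignored:
            continue  # assumed: ignored keywords are never used
        if not use_any_keyword and keyword not in known:
            continue  # only known_keywords count when use_any_keyword is false
        eligible.append(keyword)
    return eligible

terms = ["failed", "password", "for", "root"]
print(candidate_keywords(terms, True, ignored_keywords=["for"]))
# ['failed', 'password', 'root']
print(candidate_keywords(terms, False, known_keywords=["failed"],
                         ignored_keywords=["for"]))
# ['failed']
```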

Other values for tuning:

Please note: the following settings are for more advanced configurations. In most cases, you should not need to modify these settings. If you would like help modifying these values, please contact Splunk support.



  • learned_eventtype_priority = <integer 1-10>
    • The priority value for learned event types.
    • A lower value means lower priority.
    • Defaults to 1.


  • learning_delay_sec = <integer 0-inf.>
    • Write out newly discovered event types no sooner than every N seconds.
    • If this value is too small, the search user may experience slowness as new event types must be reloaded every time new event types are discovered.
    • Defaults to 120.
  • min_percent_for_keyword = <integer 1-100>
    • Only consider a keyword for defining an event type if it occurs in more than N percent, and less than 100-N percent, of the events in an event type.
    • Defaults to 40.
  • min_percent_for_tag = <integer 1-100>
    • Only consider a keyword for an event type's tag if that keyword occurs in more than N percent of the events that match that event type.
    • Defaults to 99.
  • min_format_count_to_make_event = <integer 1-inf.>
    • Only consider making an event type if more than N events have that punct format, where only the first max_format_len characters are considered.
    • Defaults to 100.
  • min_format_count_before_split = <integer 1-inf.>
    • If a given punct format matches more than N events, consider splitting it up into smaller, more specific, event types with keywords.
    • Defaults to 400.
  • max_memory = <integer 1-inf.>
    • The maximum number of events to keep in memory for the discovery process.
    • If this value is too small, more myopic patterns will be discovered.
    • Defaults to 5000.
  • max_keywords_from_event = <integer 0-inf.>
    • The maximum number of keywords per event that are considered by the learning algorithm.
    • Larger values slow learning down, but give greater consideration to terms that are not at the start of events.
    • Defaults to 10.
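The min_percent_for_keyword rule in the list above defines a window on both sides: a keyword must be neither too rare nor too common within an event type. A minimal sketch of that check, using a hypothetical function name:

```python
def keyword_in_window(matching_events, total_events, min_percent_for_keyword=40):
    """True if the keyword occurs in more than N percent and less than
    100-N percent of the events in an event type."""
    percent = 100.0 * matching_events / total_events
    return min_percent_for_keyword < percent < 100 - min_percent_for_keyword

print(keyword_in_window(50, 100))  # True: 50 is between 40 and 60
print(keyword_in_window(40, 100))  # False: not more than 40 percent
print(keyword_in_window(70, 100))  # False: not less than 60 percent
```

With the default of 40, only keywords found in more than 40 and less than 60 percent of a bucket qualify. Read literally, the rule leaves no window at all once the value reaches 50.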

Example

This is the default configuration for eventdiscoverer.conf.


_actions = new,edit,delete
process_every_n_events = 10000000
learn_every_n_events = 5000
learning_delay_sec = 120
use_any_keyword = false
max_format_len = 10
min_events_to_add_keyword = 100
min_percent_for_keyword = 40
min_percent_for_tag = 99
min_format_count_to_make_event = 100
min_format_count_before_split = 400
max_memory = 5000
max_keywords_from_event = 10
learned_eventtype_priority = 1
ignored_keywords = sun, mon, tue, tues, wed, thu, thurs, fri, sat, sunday, monday, tuesday, wednesday, thursday, friday, saturday, jan, feb, mar, apr, may, jun, jul, aug, sep, oct, nov, dec, january, february, march, april, may, june, july, august, september, october, november, december, 2003, 2004, 2005, 2006, am, pm, ut, utc, gmt, cet, cest, cetdst, met, mest, metdst, mez, mesz, eet, eest, eetdst, wet, west, wetdst, msk, msd, ist, jst, kst, hkt, ast, adt, est, edt, cst, cdt, mst, mdt, pst, pdt, cast, cadt, east, eadt, wast, wadt, about, after, again, against, all, almost, already, also, although, always, among, an, and, any, anyone, are, as, at, away, be, became, because, become, becomes, been, before, being, between, both, but, by, came, could, does, during, each, either, else, ever, every, following, for, from, further, gave, gets, give, given, giving, gone, got, had, has, have, having, here, how, however, if, in, into, is, it, itself, just, keep, kept, like, made, make, many, might, more, most, much, must, neither, none, nor, noted, now, of, often, on, only, or, other, our, out, owing, perhaps, please, quite, rather, really, regarding, said, same, seem, seen, several, shall, should, show, showed, shown, shows, similar, since, so, some, sometime, somewhat, soon, such, than, that, the, their, theirs, them, then, there, therefore, these, they, this, those, though, through, throughout, to, too, toward, under, unless, until, upon, use, used, usefulness, using, various, very, was, we, were, what, when, where, whether, which, while, who, whose, why, will, with, within, without, would, yet, net, org, com, edu, co
Revision: 207 | Community content licensed under Creative Commons