Configure segmentation

This documentation does not apply to the most recent version of Splunk.

This documentation applies to the following versions of Splunk: 3.0.2, 3.1, 3.1.1, 3.1.2, 3.1.3, 3.1.4

Segmentation rules can be adjusted to improve index compression or to make a particular data source easier to search. To change Splunk's default segmentation behavior, edit segmenters.conf. There are many ways to configure segmenters.conf, so experiment to find what works best for your data. You can also apply specific segmentation rules to particular hosts, sources or sourcetypes through props.conf. Here are a few general examples of configuration changes you can make:


Inner segmentation

You can configure segmenters.conf to increase indexing efficiency by changing minor breakers into major breakers. This causes Splunk to index a larger number of smaller tokens. For example, user.id=foo is indexed as the separate tokens "user", "id" and "foo".
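The effect of inner segmentation can be sketched in a few lines of Python. This is only a simplified model for illustration, not Splunk's indexer: the function name `inner_segment` is hypothetical, `\s` is assumed to mean a space, and the breaker list is a subset of the [inner] MAJOR list that leaves out the URL-encoded sequences such as %20 and %3D.

```python
import re

# Illustrative subset of the [inner] MAJOR breakers (assumption: "\s" is a
# space; URL-encoded breakers such as %20 are omitted for brevity).
INNER_MAJOR_BREAKERS = "[]<>(){}|!;,'\"*\n\r \t/:=@.?-&$#+%\\"


def inner_segment(raw_event):
    """Split an event on every breaker and keep the non-empty pieces."""
    pattern = "[" + re.escape(INNER_MAJOR_BREAKERS) + "]+"
    return [token for token in re.split(pattern, raw_event) if token]


print(inner_segment("user.id=foo"))  # ['user', 'id', 'foo']
print(inner_segment("10.1.2.5"))     # ['10', '1', '2', '5']
```

Because every breaker is a major breaker, no token ever contains a period or an equals sign, which is why a term such as 10.1.2.5 is stored only as its individual pieces.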


To enable inner segmentation, add the following lines to $SPLUNK_HOME/etc/bundles/local/segmenters.conf:


[inner]
MAJOR = [ ] < > ( ) { } | ! ; , ' " * \n \r \s \t / : = @ . ? - & $ # + %  \\ %21 %26 %2526 %3B %7C %20 %2B %3D -- %2520
MINOR =  

Once you have made this change, however, you can no longer search for minor segments, or for major segments that contain breakers. For example, you can no longer search for "10.1" or "10.1.2.5". You can still find these terms with a phrase search, which means enclosing the search in quotes. For example, to find the IP address 10.1.2.5, search for "10 1 2 5".


Note: If your search terms contain breakers, you must remove the breakers (as in "10 1 2 5") before running a phrase search.
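Rewriting a term for phrase search is mechanical, so it can be scripted. The helper below is a hypothetical sketch (`to_phrase_query` is not a Splunk function) that assumes the same simplified breaker list as the [inner] MAJOR setting, without the URL-encoded sequences, and that breakers should be replaced by single spaces as in the "10 1 2 5" example.

```python
import re

# Illustrative subset of the breakers (assumption: "\s" is a space;
# URL-encoded breakers such as %20 are omitted).
BREAKERS = "[]<>(){}|!;,'\"*\n\r \t/:=@.?-&$#+%\\"


def to_phrase_query(term):
    """Replace every run of breakers with one space and quote the result."""
    pattern = "[" + re.escape(BREAKERS) + "]+"
    pieces = [piece for piece in re.split(pattern, term) if piece]
    return '"' + " ".join(pieces) + '"'


print(to_phrase_query("10.1.2.5"))     # "10 1 2 5"
print(to_phrase_query("user.id=foo"))  # "user id foo"
```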


To enable phrase search, add the following lines to $SPLUNK_HOME/etc/bundles/local/segmenters.conf:


[search]
MAJOR = [ ] < > ( ) { } | ! ; , ' " * \n \r \s \t / : = @ . ? - & $ # + %  \\ %21 %26 %2526 %3B %7C %20 %2B %3D -- %2520
MINOR =

[full]
MAJOR = [ ] < > ( ) { } | ! ; , ' " * \n \r \s \t & ? + %21 %26 %2526 %3B %7C %20 %2B %3D -- %2520
MINOR = / : = @ . - $ # % \\ _

Outer segmentation

Outer segmentation is the opposite of inner segmentation. Instead of indexing small tokens individually, outer segmentation indexes entire terms, yielding fewer, larger tokens. For example, "10.1.2.5" is indexed only as the single token "10.1.2.5", so you cannot search on individual pieces of it such as "10" or "1". You can still use wildcards, however, to match part of a term. For example, searching for "10.1*" returns any events containing IP addresses that start with "10.1".
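The following Python sketch models this behavior under stated assumptions: it is not how Splunk matches terms internally, the function names and the sample event are hypothetical, `\s` is assumed to mean a space, and the breaker list is the [outer] MAJOR list without its URL-encoded sequences.

```python
import re

# Illustrative subset of the [outer] MAJOR breakers (assumption: "\s" is a
# space; URL-encoded breakers such as %20 are omitted).
OUTER_MAJOR_BREAKERS = "[]<>(){}|!;,'\"*\n\r \t&?+"


def outer_segment(raw_event):
    """Split only on major breakers, so terms like 10.1.2.5 stay whole."""
    pattern = "[" + re.escape(OUTER_MAJOR_BREAKERS) + "]+"
    return [token for token in re.split(pattern, raw_event) if token]


def wildcard_match(term, tokens):
    """Return the tokens matching term, where * stands for any characters."""
    regex = re.compile(".*".join(re.escape(part) for part in term.split("*")))
    return [token for token in tokens if regex.fullmatch(token)]


# Hypothetical sample event.
tokens = outer_segment("connect from 10.1.2.5 to 10.2.0.9")
print(tokens)                           # ['connect', 'from', '10.1.2.5', 'to', '10.2.0.9']
print(wildcard_match("10", tokens))     # []
print(wildcard_match("10.1*", tokens))  # ['10.1.2.5']
```

The bare term 10 matches nothing because no token is exactly "10", while the wildcard term matches the whole-address token that begins with "10.1".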


To enable outer segmentation, add the following lines to $SPLUNK_HOME/etc/bundles/local/segmenters.conf:


[outer]
MAJOR = [ ] < > ( ) { } | ! ; , ' " * \n \r \s \t & ? + %21 %26 %2526 %3B %7C %20 %2B %3D -- %2520
MINOR =

No segmentation

You can configure Splunk to index with no segmentation, in which case you will be able to search only on time, source, host and sourcetype. To enable this configuration, add the following lines to $SPLUNK_HOME/etc/bundles/local/segmenters.conf:


[none]
MAJOR =
MINOR =
MAJOR_COUNT = 0
LOOKAHEAD = 0
MINOR_COUNT = 0

This example removes all major and minor breakers.


No segmentation is the most space-efficient configuration, but it makes searching very difficult. You will need to pipe your searches through the regex or where commands to further restrict results. This configuration may suit an environment where storage efficiency is valued over search performance.
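The two-stage search this implies (select events by indexed metadata, then filter the raw text) can be illustrated with a small Python model. Everything here is hypothetical and for illustration only: the sample events, the `search` function and its parameters are assumptions, and the regular expression stands in for piping results through the regex command.

```python
import re

# Hypothetical events: only the metadata fields are "indexed"; the raw text
# is stored but not broken into searchable tokens.
EVENTS = [
    {"host": "web01", "sourcetype": "access", "raw": "10.1.2.5 GET /index.html"},
    {"host": "web01", "sourcetype": "access", "raw": "10.2.0.9 GET /login"},
    {"host": "db01", "sourcetype": "syslog", "raw": "connection from 10.1.2.5"},
]


def search(events, host, raw_pattern):
    """Select events by host, then keep those whose raw text matches."""
    candidates = [event for event in events if event["host"] == host]
    regex = re.compile(raw_pattern)
    return [event["raw"] for event in candidates if regex.search(event["raw"])]


print(search(EVENTS, "web01", r"^10\.1\."))  # ['10.1.2.5 GET /index.html']
```

In this model every event selected by metadata must be scanned by the regular expression, which reflects the trade-off described above: less index storage in exchange for more work at search time.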
