Documentation: 3.2.3
This page last updated: 03/03/08 03:03pm

Configure segmentation

Segmentation rules can be tuned to improve index compression or to improve usability for a particular data source. To change Splunk's default segmentation behavior, edit segmenters.conf. Once you have set up rules in segmenters.conf, tie them to a specific source, host, or sourcetype via props.conf. NOTE: You can enable any number of segmentation rules in this manner, each applied to different hosts, sources, and/or sourcetypes.

There are many different ways to configure segmenters.conf, and you should figure out what works best for your data. Use props.conf to specify which segmentation rules apply to specific hosts, sources, or sourcetypes. Here are a few general examples of configuration changes you can make:

Inner segmentation

Inner segmentation is the most efficient segmentation setting, for both search and indexing, while still retaining the most search functionality. It does, however, make type-ahead less comprehensive. Also, inner segmentation disables the ability to click on different segments of search results, such as the 48.15 segment of the IP address 48.15.16.23.

To configure inner segmentation, change minor breakers to be major breakers in segmenters.conf. Under these settings, Splunk indexes smaller chunks of data. For example, user.id=foo is indexed as user id foo.
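The effect of treating every breaker as a major breaker can be sketched with a toy model in Python. This is only an illustration of the splitting described above, not Splunk's implementation. The name inner_segments is hypothetical, and the breaker set holds just the single-character entries of the [inner] MAJOR list.

```python
# Simplified model of inner segmentation: every breaker is a major breaker.
# The single-character breakers come from the [inner] MAJOR list; the
# multi-character entries (%21, %26, --, ...) are omitted to keep this short.
INNER_MAJOR = set("[]<>(){}|!;,'\"*\n\r \t/:=@.?-&$#+%\\")


def inner_segments(text):
    """Split text at every breaker and return the non-empty pieces."""
    tokens, current = [], []
    for ch in text:
        if ch in INNER_MAJOR:
            if current:
                tokens.append("".join(current))
                current = []
        else:
            current.append(ch)
    if current:
        tokens.append("".join(current))
    return tokens


print(inner_segments("user.id=foo"))  # ['user', 'id', 'foo']
print(inner_segments("10.1.2.5"))     # ['10', '1', '2', '5']
```

Because no token keeps a breaker inside it, terms such as 10.1 never appear in this model's output as a single unit.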

To enable inner segmentation, add the following code to $SPLUNK_HOME/etc/bundles/local/segmenters.conf:

[inner]
MAJOR = [ ] < > ( ) { } | ! ; , ' " * \n \r \s \t / : = @ . ? - & $ # + %  \\ %21 %26 %2526 %3B %7C %20 %2B %3D -- %2520
MINOR =  

Once you have made this change, however, you can no longer search for minor segments or for major segments that contain breakers. For example, you can no longer search for 10.1 or 10.1.2.5. You can still find these terms by enabling phrase search, which means surrounding your search in quotes. For example, you can search for the IP address 10.1.2.5 by searching "10 1 2 5".

Note: If your search terms contain breakers, you must remove the breakers before running a phrase search.
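Removing breakers from a term by hand is error-prone, so a small helper can do it. The Python sketch below is a hypothetical convenience function (to_phrase is not part of Splunk). It assumes the breakers of interest are the single-character entries of the MAJOR list shown for phrase search, and it assumes the phrase should be wrapped in double quotes.

```python
# Breakers to strip, taken from the single-character entries of the
# [search] MAJOR list (an illustrative subset, not Splunk's own code).
BREAKERS = "[]<>(){}|!;,'\"*/:=@.?-&$#+%\\"


def to_phrase(term):
    """Replace each breaker with a space and wrap the result in quotes."""
    spaced = term.translate(str.maketrans({ch: " " for ch in BREAKERS}))
    # split() + join collapses runs of whitespace left by adjacent breakers.
    return '"' + " ".join(spaced.split()) + '"'


print(to_phrase("10.1.2.5"))     # "10 1 2 5"
print(to_phrase("user.id=foo"))  # "user id foo"
```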

To enable phrase search, add the following lines to $SPLUNK_HOME/etc/bundles/local/segmenters.conf:

[search]
MAJOR = [ ] < > ( ) { } | ! ; , ' " * \n \r \s \t / : = @ . ? - & $ # + %  \\ %21 %26 %2526 %3B %7C %20 %2B %3D -- %2520
MINOR =

[full]
MAJOR = [ ] < > ( ) { } | ! ; , ' " * \n \r \s \t & ? + %21 %26 %2526 %3B %7C %20 %2B %3D -- %2520
MINOR = / : = @ . - $ # % \\ _

Outer segmentation

Outer segmentation is the opposite of inner segmentation. Instead of indexing only the small tokens individually, outer segmentation indexes entire terms, yielding fewer, larger tokens. For example, "10.1.2.5" is indexed as the single token "10.1.2.5", meaning you cannot search on individual pieces of the term. You can still use wildcards, however, to search for pieces of a term. For example, you can search for "10.1*" to get any events that have IP addresses that start with "10.1".
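The difference between an exact search and a wildcard search under outer segmentation can be illustrated with a toy model in Python. It is a sketch only: outer_segments and matches are hypothetical names, the breaker set holds just the single-character entries of the [outer] MAJOR list, and shell-style matching from the standard fnmatch module stands in for Splunk's wildcard handling.

```python
from fnmatch import fnmatchcase

# Single-character breakers from the [outer] MAJOR list; note that
# ".", "=", "/" and ":" are absent, so they stay inside a token.
OUTER_MAJOR = set("[]<>(){}|!;,'\"*\n\r \t&?+")


def outer_segments(text):
    """Split text at major breakers only, keeping larger tokens whole."""
    tokens, current = [], []
    for ch in text:
        if ch in OUTER_MAJOR:
            if current:
                tokens.append("".join(current))
                current = []
        else:
            current.append(ch)
    if current:
        tokens.append("".join(current))
    return tokens


def matches(event, pattern):
    """Return True if any indexed token matches the search pattern."""
    return any(fnmatchcase(tok, pattern) for tok in outer_segments(event))


event = "connection from 10.1.2.5 accepted"
print(outer_segments(event))    # ['connection', 'from', '10.1.2.5', 'accepted']
print(matches(event, "10.1"))   # False: no token is exactly "10.1"
print(matches(event, "10.1*"))  # True: the wildcard matches "10.1.2.5"
```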

To enable outer segmentation, add the following lines to $SPLUNK_HOME/etc/bundles/local/segmenters.conf:

[outer]
MAJOR = [ ] < > ( ) { } | ! ; , ' " * \n \r \s \t & ? + %21 %26 %2526 %3B %7C %20 %2B %3D -- %2520
MINOR =

No segmentation

The most expedient setting is to disable segmentation completely. This has significant implications for search, however: with no segmentation, your searches are restricted to time, source, host, and sourcetype. Only use this setting if you do not need any advanced search capabilities. To enable this configuration, add the following lines to $SPLUNK_HOME/etc/bundles/local/segmenters.conf:

[none]
MAJOR =
MINOR =
MAJOR_COUNT = 0
LOOKAHEAD = 0
MINOR_COUNT = 0

This example removes all major and minor breakers.

No segmentation is the most space-efficient configuration, but it makes searching very difficult. You must pipe your searches through the search command to further restrict results. This type of configuration may be chosen in an environment where storage efficiency is valued over search performance.

Splunk Web segmentation

Splunk Web also has settings for segmentation. These have nothing to do with indexing segmentation. Splunk Web segmentation affects browser interaction and may speed up search results.

To configure segmentation, click on the Preferences tab in the upper right-hand corner of Splunk Web.

http://www.splunk.com/assets/doc-images/3_2ConfigSegmentation/splunkwebseg.jpg

You can set segmentation to:

  • Raw
    • Turns off segmentation entirely. This is the fastest setting, but also disables clicking on events.
  • Inner
    • Only minor breakers. Described above.
  • Outer
    • Only major breakers. Described above.
  • Full
    • Default with inner and outer segmentation enabled.
  • Pyramid
    • A useful setting for visualizing an event's segmentation.