You can improve Splunk's indexing performance by adjusting settings in Splunk's configuration files. Here are some basic changes you can make:
Splunk has several internal processors. If you notice that Splunk isn't indexing your data as quickly as you'd like, you can track down exactly which processor is responsible for the delay by running the following search:
This search shows you a chart of Splunk's internal processors. If one processor in particular is taking up more CPU time than the others, you can adjust the settings below to reduce its load.
Below are some tuning parameters in Splunk's configuration files that affect indexing performance.
indexes.conf controls how Splunk's indexes are configured. You can change the following entries to improve indexing performance.
| Setting (default) | Description |
|---|---|
| indexThreads = <non-negative number> (0) | The number of extra threads to use for a specific index. Increasing the number of index threads can improve indexing throughput, but the gain depends on the capability of your hardware. Do not set indexThreads higher than the number of processors in the server that this instance runs on. For example, on a single-core system, never set it higher than 1. |
| maxMemMB = <non-negative number> (50) | The amount of memory, in MB, to allocate for indexing. This amount is allocated per index thread. For example, if indexThreads is set to 2 and maxMemMB is set to 300, indexing uses 600 MB of memory. |
| maxDataSize = <non-negative number> (750) | The maximum size, in MB, that the hot database can grow to. Values larger than the default are not recommended unless you have a 64-bit system. |
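The thread and memory guidance above comes down to simple arithmetic. The following is a minimal Python sketch of it, not part of Splunk; the function names `index_threads_ok` and `indexing_memory_mb` are hypothetical helpers introduced only for illustration.

```python
import os


def index_threads_ok(index_threads, cpu_count):
    """Return True if indexThreads does not exceed the number of processors."""
    if index_threads < 0 or cpu_count < 1:
        raise ValueError("index_threads must be >= 0 and cpu_count must be >= 1")
    return index_threads <= cpu_count


def indexing_memory_mb(index_threads, max_mem_mb):
    """Memory used for indexing, following the rule that maxMemMB is allocated per index thread."""
    if index_threads < 0 or max_mem_mb < 0:
        raise ValueError("values must be non-negative")
    return index_threads * max_mem_mb


# The example from the table: indexThreads = 2, maxMemMB = 300.
print(indexing_memory_mb(2, 300), "MB")

# Check a proposed indexThreads value against this machine's processor count.
cpus = os.cpu_count() or 1
print("indexThreads = 2 is within the processor count:", index_threads_ok(2, cpus))
```

Running a check like this before editing indexes.conf makes it easier to confirm that the memory you plan to allocate actually fits on the server.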
props.conf controls what parameters apply to events during indexing based on settings tied to each event's source, host, or sourcetype.
| Setting (default) | Description |
|---|---|
| DATETIME_CONFIG = <filename relative to SPLUNK_HOME> (/etc/datetime.xml) | Specifies the file that configures the timestamp extractor. You can also set this to "NONE" to prevent the timestamp extractor from running, or to "CURRENT" to assign the current system time to each event. |
| TIME_FORMAT = <strptime-style format> (empty) | Specifies a strptime format string used to extract the date. Specifying an explicit format for date extraction speeds up event indexing. |
| MAX_TIMESTAMP_LOOKAHEAD = <integer> (150) | Specifies how many characters into an event Splunk should look for a timestamp. If you know your timestamp is in the first n characters of the event, set this to n. This increases indexing speed. |
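To illustrate what an explicit format and a limited lookahead buy you, here is a small Python sketch. It uses Python's own `datetime.strptime`, whose directives resemble but are not identical to Splunk's, and it simplifies by assuming the timestamp fills the lookahead window exactly. The function `parse_leading_timestamp` and the sample event are hypothetical.

```python
from datetime import datetime


def parse_leading_timestamp(event, time_format, lookahead):
    """Parse a timestamp using only the first `lookahead` characters of the event.

    Simplifying assumption: the timestamp occupies the whole window, so the
    window can be handed directly to strptime with a known format.
    """
    window = event[:lookahead]  # nothing beyond this point is examined
    return datetime.strptime(window, time_format)


# Hypothetical event whose timestamp is known to be the first 19 characters.
event = "2008-03-14 09:26:53 INFO indexer started"
stamp = parse_leading_timestamp(event, "%Y-%m-%d %H:%M:%S", 19)
print(stamp.isoformat())
```

With a known format and a small window, the parser never has to guess among candidate formats or scan the rest of the event, which is the reason these two settings speed up indexing.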
segmenters.conf defines schemes for how events will be tokenized in Splunk's index.
| Setting | Description |
|---|---|
| MAJOR = <space separated list of strings> | Move MINOR breakers into the MAJOR breaker list, or remove breakers from the MAJOR breaker list, to change the size and number of raw data events. |
| MINOR = <space separated list of strings> | The MINOR list contains the strings that define additional tokens to index by, beyond those produced by the MAJOR breaker list. Reduce or remove this list to increase indexing performance. |
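The effect of the breaker lists on index size can be seen with a toy model. The Python sketch below is a deliberately simplified illustration and not Splunk's actual segmentation algorithm: it assumes single-character breakers, keeps each major segment as a token, and adds the pieces produced by splitting that segment on minor breakers. The names `split_on` and `tokenize` and the sample event are hypothetical.

```python
def split_on(text, breakers):
    """Split text on any character in `breakers`, dropping empty pieces."""
    pieces, current = [], []
    for ch in text:
        if ch in breakers:
            if current:
                pieces.append("".join(current))
                current = []
        else:
            current.append(ch)
    if current:
        pieces.append("".join(current))
    return pieces


def tokenize(event, major, minor):
    """Toy model: index every major segment plus its minor sub-pieces."""
    tokens = set()
    for segment in split_on(event, major):
        tokens.add(segment)
        tokens.update(split_on(segment, minor))
    return tokens


event = "src=10.1.2.3 user=alice"
with_minor = tokenize(event, major={" ", "="}, minor={"."})
without_minor = tokenize(event, major={" ", "="}, minor=set())
print(len(with_minor), "tokens with '.' as a minor breaker")
print(len(without_minor), "tokens with an empty minor list")
```

In this model, emptying the minor list halves the number of tokens stored for the sample event, which is the kind of reduction that makes indexing cheaper, at the cost of no longer being able to search on the sub-pieces.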
Read more about how to configure custom segmentation.