Now that you have chosen a deployment model, consider how you want Splunk to index your data. You have many options for increasing throughput and changing how data is handled. Pick the approaches that work best for you from the following list.
Create additional indexed fields
Splunk automatically adds fields such as host::, source::, sourcetype:: and timestamps. Splunk also indexes all of the terms in the entire event, so you don't need to define any fields in order to run free-form searches. However, if you want to be able to search for the value of a particular field by name, you can create and index custom fields. For example, you may define a pattern to differentiate sourceip and destip values.
Note: Consider whether you want to use indexed fields or extracted fields for your purpose. Extracted fields reduce indexing time and storage, and they give you more flexibility to adapt without re-indexing your data. Indexed fields are useful for implementing granular access controls. If you decide to go the extracted field route, you don't need to worry about fields in your initial setup, because you can add the extraction rules at any time.
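As an illustration of the kind of pattern mentioned above, the following Python sketch separates a source address from a destination address. It is not Splunk configuration; the src=/dst= log layout and the function name extract_fields are assumptions made for the example.

```python
import re

# Hypothetical firewall-style log line; the src=/dst= layout is an assumption.
LINE = "2009-03-01 12:00:01 action=allow src=10.1.2.3 dst=192.168.0.7"

# Named groups label each address by role, so the two values stay distinct
# even though both look like plain IPv4 addresses.
IP = r"\d{1,3}(?:\.\d{1,3}){3}"
PATTERN = re.compile(rf"src=(?P<sourceip>{IP})\s+dst=(?P<destip>{IP})")


def extract_fields(line):
    """Return a dict of the named fields found in the line, or an empty dict."""
    match = PATTERN.search(line)
    return match.groupdict() if match else {}


fields = extract_fields(LINE)
print(fields["sourceip"], fields["destip"])
```

The same regular expression could be applied when the data is indexed or later when it is searched; the note above describes the trade-off between the two.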
Tweak default processing
When Splunk indexes a data source, it automatically breaks the input into distinct events and extracts a host and timestamp for the event. The event boundaries, host, and timestamps are important for analysis. If Splunk is not setting the event boundaries or extracting timestamps and hosts correctly, you can easily modify these settings. See timestamp recognition, how host is assigned, and how events are recognized for more information.
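To make the idea of event boundaries concrete, here is a minimal Python sketch that breaks raw input into events and reads a timestamp from each one. It assumes every event starts with a "YYYY-MM-DD HH:MM:SS" stamp; that rule, and the names break_events and event_time, are illustrative and do not describe how Splunk itself decides.

```python
import re
from datetime import datetime

# Assumption: each event begins with a "YYYY-MM-DD HH:MM:SS" timestamp.
STAMP = re.compile(r"^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}")


def break_events(raw):
    """Split raw text into events; unstamped lines continue the previous event."""
    events = []
    for line in raw.splitlines():
        if STAMP.match(line) or not events:
            events.append(line)
        else:
            events[-1] += "\n" + line
    return events


def event_time(event):
    """Return the leading timestamp as a datetime, or None if there is none."""
    match = STAMP.match(event)
    if not match:
        return None
    return datetime.strptime(match.group(0), "%Y-%m-%d %H:%M:%S")


raw = (
    "2009-03-01 12:00:01 ERROR request failed\n"
    "  Traceback (most recent call last):\n"
    "  ValueError: bad input\n"
    "2009-03-01 12:00:05 INFO retry succeeded"
)
events = break_events(raw)
print(len(events), event_time(events[1]))
```

If the boundary rule were wrong for this input, the traceback lines would become separate events with no timestamp of their own, which is the kind of problem the settings above are meant to correct.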
Mask sensitive data
Your logs may contain sensitive personal data. For example, there may be social security numbers or passwords in your data that you may wish to cover up. You can create event configuration that masks sensitive data as it is being processed on input.
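The masking step can be pictured as a substitution applied to each event before it is stored. The Python sketch below is only an illustration of that idea, not Splunk configuration; the SSN layout (123-45-6789), the "password=value" convention, and the name mask_event are assumptions.

```python
import re

# Assumed formats: US-style SSNs (123-45-6789) and "password=<value>" pairs.
SSN = re.compile(r"\b\d{3}-\d{2}-(\d{4})\b")
PASSWORD = re.compile(r"(password=)\S+", re.IGNORECASE)


def mask_event(event):
    """Mask SSNs (keeping the last four digits) and password values."""
    event = SSN.sub(r"XXX-XX-\1", event)
    return PASSWORD.sub(r"\1********", event)


raw = "user=alice ssn=123-45-6789 password=hunter2 status=ok"
print(mask_event(raw))
```

Keeping the last four digits is a design choice for the sketch: it leaves enough of the value to correlate events while hiding the full number.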
Change indexing density
When Splunk indexes data, it segments events using major and minor breakers. To save storage space on the indexer, you can change Splunk's default segmentation settings. For example, web proxy logs may contain lengthy URLs that Splunk breaks into many different minor segments. You may wish to change this setting to eliminate unnecessary overhead.
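To show why long URLs inflate the index, the following Python sketch segments one proxy-style event with and without minor breakers. The breaker sets are simplified assumptions (Splunk's actual lists are longer), and the function segment is hypothetical.

```python
import re

# Simplified, assumed breaker sets; a real indexer's lists are longer.
MAJOR = r'[\s,;|\[\]()"]+'
MINOR = r"[/:=.\-_&?]+"


def segment(event, use_minor=True):
    """Return the set of index terms produced for one event."""
    terms = set()
    for major in re.split(MAJOR, event):
        if not major:
            continue
        terms.add(major)
        if use_minor:
            # Minor breakers split each major segment into smaller terms.
            terms.update(t for t in re.split(MINOR, major) if t)
    return terms


event = "GET http://example.com/a/b/c/index.html?id=42&ref=home 200"
full = segment(event)
outer_only = segment(event, use_minor=False)
print(len(outer_only), len(full))
```

In this sketch the single URL accounts for most of the terms once minor breaking is applied, which is the overhead the paragraph above describes. The cost of turning it off is that you can no longer search for a URL fragment such as index on its own.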
Eliminate processing steps
Certain processing steps can be eliminated to provide faster indexing and better throughput. For example, if you don't need Splunk to search for timestamps within events, you can turn off timestamp extraction. You can also tune down or eliminate event type auto-discovery.
Filter and route data
You may find that you wish to handle various data sources differently prior to indexing. For example, you may want to eliminate any data that is not useful, or you may wish to route certain data to specific indexes. You can also route a subset of your data to third-party systems. You will want to decide exactly how to treat your data before setting up your deployment.
Please note: Any data that is removed prior to indexing does not count against your daily licensing limit.
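Filtering and routing can be thought of as an ordered list of rules applied to each event before it is indexed. The Python sketch below illustrates that idea only; the rule patterns, the index names ("security", "errors", "main"), and the functions route and process are all hypothetical and are not Splunk settings.

```python
import re

# Hypothetical routing rules, checked in order: (pattern, destination).
DROP = "drop"
RULES = [
    (re.compile(r"\bDEBUG\b"), DROP),          # discard noisy events
    (re.compile(r"sshd|sudo"), "security"),    # route authentication events
    (re.compile(r"\bERROR\b"), "errors"),      # route application errors
]
DEFAULT_INDEX = "main"


def route(event):
    """Return the destination for an event: an index name or DROP."""
    for pattern, dest in RULES:
        if pattern.search(event):
            return dest
    return DEFAULT_INDEX


def process(events):
    """Group events by destination index; dropped events are only counted."""
    indexes, dropped = {}, 0
    for event in events:
        dest = route(event)
        if dest == DROP:
            dropped += 1
        else:
            indexes.setdefault(dest, []).append(event)
    return indexes, dropped


sample = [
    "DEBUG cache warmed",
    "sshd[411]: Accepted password for alice",
    "ERROR payment service timeout",
    "INFO request served",
]
indexes, dropped = process(sample)
print(sorted(indexes), dropped)
```

Because the first matching rule wins, rule order matters: placing the drop rule first means a DEBUG line from sshd is discarded rather than routed.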