Specify data inputs through Splunk's CLI or Splunk Web, or edit inputs.conf directly. Changes made through Splunk Web or the Splunk CLI are written to $SPLUNK_HOME/etc/system/local/inputs.conf. Windows inputs are configured in inputs.conf as well.
Read on for a description of Splunk's data input types, including their purpose and behavior.
Files and directories
Data inputs can come from files and directories. Use monitor for continuous, non-destructive inputs from files and directories. Use batch input for one-time, destructive file loading.
Monitor
Splunk's monitor behaves like the UNIX tail command. Specify a path to a file or directory and Splunk's monitor processor consumes any new input. If subdirectories exist within the specified directory, Splunk recursively examines them for log files. Splunk automatically adds any new files into the index.
In addition, when monitoring a file:
Note: If you are monitoring large files or archives, removing the input does not stop those files from being indexed. Removing the input stops Splunk from checking the files again, but all of their initial content is still indexed. To stop all in-process data, you must restart the Splunk server.
When monitoring a directory:
Note: If the specified file or directory does not exist, the Splunk server will not check to see if it is created later. Splunk only checks for files and directories each time the Splunk server starts (or is restarted). So be sure to explicitly add new files as inputs when they become available if you don't want to restart the server. When monitoring a file, the entire path dir/filename must not exceed 1024 characters.
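The monitor input described above might be declared in inputs.conf roughly as follows. This is a minimal sketch rather than a definitive configuration: the directory /var/log/myapp and the sourcetype name myapp_log are hypothetical, and the stanza assumes the monitor:// syntax of inputs.conf.

```ini
# Hypothetical example: continuously monitor a directory and its subdirectories.
[monitor:///var/log/myapp]
# Assumed sourcetype name, for illustration only.
sourcetype = myapp_log
disabled = false
```

Because a path that does not exist at startup is not picked up later, it may be worth creating the directory before adding a stanza like this one.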
Batch upload
Upload files directly through Splunk Web. If necessary, Splunk unpacks and uncompresses files before indexing.
Use the batch processor at the CLI or in inputs.conf to load files once and destructively. By default, Splunk's batch processor reads from $SPLUNK_HOME/var/spool/splunk. For continuous, non-destructive loading of files, use monitor.
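A one-time, destructive load could be sketched in inputs.conf like this. The drop directory /opt/dropbox/incoming and the sourcetype are hypothetical; the stanza assumes the batch:// syntax and the move_policy attribute of inputs.conf.

```ini
# Hypothetical example: load files once from a drop directory, destructively.
[batch:///opt/dropbox/incoming]
# sinkhole means each file is deleted after it has been indexed.
move_policy = sinkhole
# Assumed sourcetype name, for illustration only.
sourcetype = myapp_log
```

Only point a batch input at copies of files, since the originals are removed once indexed.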
FIFO queues
A FIFO (also known as a named pipe) is a queue of data maintained in memory. File systems can write log messages directly to a FIFO. Splunk then accesses the FIFO as though it were a file. When choosing the FIFO data input method, consider the following:
Note: FIFOs are not recommended for application servers forwarding data to Splunk in a distributed setting. Monitor is a more reliable, stable method.
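For completeness, a FIFO input might look like the following in inputs.conf. This is only a sketch: the pipe path /var/run/myapp.fifo and the sourcetype are hypothetical, and it assumes that the fifo:// stanza type is available in your Splunk version.

```ini
# Hypothetical example: read from an existing named pipe.
# Assumes the fifo:// input type is supported by your Splunk version.
[fifo:///var/run/myapp.fifo]
# Assumed sourcetype name, for illustration only.
sourcetype = myapp_log
```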
Network ports
UDP and TCP ports can feed data into the Splunk Server. UDP and TCP behave differently, and these behaviors affect how data arrives for processing. When configuring network ports, keep in mind that you cannot use ports lower than 1024 if you have not installed Splunk as root.
UDP
UDP is a best-effort protocol. This means that you might not get messages if the network is clogged or has a hiccup. You also can't be absolutely sure the messages aren't spoofed or altered in transit. UDP should be reserved for logging implementations focused on day-to-day troubleshooting rather than compliance or security.
Splunk with an Enterprise license can read directly from the network on any UDP port. Use this configuration to make Splunk act directly as a syslog server by reading remote syslog events on UDP port 514. You can also send any other UDP source of logging data, including SNMP.
Like all network streaming approaches, direct UDP input performs better than reading files from disk.
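The syslog-server setup described above could be sketched in inputs.conf as follows, assuming the udp:// stanza syntax. Port 514 comes from the text; the choice of the syslog sourcetype is an assumption for illustration.

```ini
# Sketch: accept remote syslog events on UDP port 514.
# Ports below 1024 require Splunk to have been installed as root.
[udp://514]
# Assumed sourcetype for incoming syslog data.
sourcetype = syslog
```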
TCP
TCP is a reliable, high-performance choice for many situations, as this protocol includes checks to ensure that data has arrived safely and intact. Splunk with an Enterprise license can receive data on any TCP port, allowing Splunk to receive remote data from syslog-ng and other syslog implementations that use TCP for security or reliability. TCP is the foundation of Splunk's distributed data access.
Note: If the sending process buffers data such that events are broken into multiple pieces, Splunk may interpret the parts as multiple events. This is more likely if events are being generated intermittently, as there may be long pauses (several seconds or longer) between blocks of buffered data. If you notice truncated events, try forcing the process to send events atomically.
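A TCP input for the kind of remote syslog traffic described in this section might be declared like this. The port number 9999 is a hypothetical choice (above 1024, so root is not required), and the stanza assumes the tcp:// syntax of inputs.conf.

```ini
# Hypothetical example: receive syslog-ng or other TCP syslog traffic.
[tcp://9999]
# Assumed sourcetype for incoming syslog data.
sourcetype = syslog
```

The sending side would then be pointed at this port. Whether events arrive whole still depends on how the sender buffers its writes.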
Scripted inputs
Configure Splunk to run shell commands on a schedule, and then index the output. For example:
See Configure scripted inputs for details on how to set this up.
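As a rough illustration of the idea, a scripted input in inputs.conf could look like the sketch below. The script name disk_usage.sh, its location, the interval, and the sourcetype are all hypothetical; the stanza assumes the script:// syntax and the interval attribute of inputs.conf.

```ini
# Hypothetical example: run a shell script on a schedule and index its output.
[script://$SPLUNK_HOME/bin/scripts/disk_usage.sh]
# Run every 300 seconds (assumed interval, for illustration only).
interval = 300
# Assumed sourcetype name, for illustration only.
sourcetype = disk_usage
```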
Indexing properties
Splunk can process any data, regardless of format, and it automatically learns event boundaries, classifies events and sources, and finds timestamps. However, sometimes you may want to customize Splunk's default processing. Change processing settings and indexing properties in props.conf.
Some attributes within props.conf can be customized by defining new stanzas in other configuration files. For example, transforms.conf defines regex-based rules for extracting fields, correlating events and performing other transformations. The segmenters.conf and outputs.conf files can also define attribute values referenced by props.conf.
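The relationship between props.conf and transforms.conf can be sketched as below. Everything named here is hypothetical (the myapp_log sourcetype, the myapp_extract_user stanza, the timestamp format, and the user= pattern), and the attribute names are assumed from commonly documented props.conf and transforms.conf settings, so check them against the spec files for your version.

```ini
# props.conf (hypothetical sourcetype)
[myapp_log]
# Treat each line as its own event instead of merging lines.
SHOULD_LINEMERGE = false
# Assumed timestamp layout, for illustration only.
TIME_FORMAT = %Y-%m-%d %H:%M:%S
# Reference a stanza defined in transforms.conf.
REPORT-myapp_fields = myapp_extract_user

# transforms.conf (hypothetical stanza referenced above)
[myapp_extract_user]
# Capture the value that follows "user=" in each event.
REGEX = user=(\w+)
# Store the captured value in a field named "user".
FORMAT = user::$1
```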
Common use cases for custom indexing properties include: