When architecting a Splunk deployment, there is almost always a requirement to support syslog event streams from many devices. While Splunk can easily accept syslog data directly from these external devices, you may be wondering whether there are best practices for doing so. For long-time Splunk users, this should be old news and possibly a useful refresher. For new Splunk users, read on…
So what do the experts do with Network Inputs?
First, I’ll refer you to another post about deciding when to use a forwarder to route data into Splunk: http://blogs.splunk.com/2011/10/24/choosing-a-forwarder-or-not/. The concepts in that post will help you decide on your best setup with respect to forwarders. So let’s think about the layers outside of the forwarder. In particular, let’s consider the most basic setup, where a network stream (syslog over UDP or TCP) sends data directly to a single Splunk indexer. If a network problem occurs, there is not much we can do to handle or recover the missing events; they are simply dropped at the network level. If Splunk requires maintenance such as a restart, the stream of network events is dropped at the destination, although we can handle that scenario with one of the following setups:
- Install a Splunk intermediate forwarder between the indexer and network device
- Install an outside piece of software that will also sit between the indexer and network device
Both of the above solutions will queue the data, and ultimately Splunk will pick up where it left off. However, you may want to be a bit more thorough and persist the network events to disk first. Instead of having Splunk queue the data with another forwarder, you can use a lighter piece of software like syslog-ng or rsyslog to simply write the events to the local filesystem. Once the data is in a file, Splunk can leverage its file input technology to read it in. This is the preferred method for handling network-based events, whether you install the syslog server closer to the source (network devices) or to the destination (indexers).
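To make the persist-to-disk idea concrete, here is a minimal Python sketch of the pattern. It is not a replacement for syslog-ng or rsyslog, and it is not how Splunk's file input is implemented; it only illustrates why writing to a file first lets a reader resume after a restart. The function names, the per-host `.log` file naming, and port 5514 are all assumptions made for illustration.

```python
import os
import socket


def persist_event(spool_dir, source_host, raw_message):
    """Append one raw syslog message to a per-host file and return its path."""
    os.makedirs(spool_dir, exist_ok=True)
    path = os.path.join(spool_dir, source_host + ".log")
    with open(path, "ab") as spool:
        # Normalize to exactly one trailing newline so each event is one line.
        spool.write(raw_message.rstrip(b"\r\n") + b"\n")
    return path


def read_new_events(path, offset=0):
    """Return (events, new_offset) for everything written after `offset`."""
    with open(path, "rb") as spool:
        spool.seek(offset)
        data = spool.read()
    # Only consume complete lines so a half-written event is left for next time.
    end = data.rfind(b"\n") + 1
    return data[:end].splitlines(), offset + end


def serve_udp(spool_dir, bind_addr=("0.0.0.0", 5514)):
    """Receive syslog datagrams forever and persist each one to disk.

    Port 5514 is a hypothetical unprivileged port chosen for this sketch.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind(bind_addr)
    while True:
        message, (host, _port) = sock.recvfrom(65535)
        persist_event(spool_dir, host, message)
```

The reader only needs to remember the last offset it consumed. If it is stopped for maintenance, the writer keeps appending to the file, and the next call to `read_new_events(path, saved_offset)` returns everything that arrived in the meantime.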
It is important to note that a lot of customers already have a syslog server that writes their network-based events to disk. If you are one of these customers, you can leverage some of the available documentation on syslog-ng (Balabit) to make your life easier when configuring this type of setup:
Actual configuration examples as well as topologies can be found in the aforementioned document.
To summarize:
- Persist network events to disk when possible
- Leverage intermediate forwarders to distribute the events across a cluster of indexers
- Do not tune any forwarder settings unless you know what you are doing
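As a rough illustration of what distributing events across a cluster of indexers means, the Python sketch below rotates events across a list of targets. Splunk forwarders have their own built-in load balancing, so this is not how you would configure one; the `distribute` function and the indexer names are hypothetical and only show the idea.

```python
from itertools import cycle


def distribute(events, indexers):
    """Assign each event to the next indexer in turn; return {indexer: [events]}."""
    assignments = {indexer: [] for indexer in indexers}
    targets = cycle(indexers)  # endless round-robin over the indexer list
    for event in events:
        assignments[next(targets)].append(event)
    return assignments


# Hypothetical indexer addresses, used only for this example.
batches = distribute(["e1", "e2", "e3", "e4", "e5"], ["idx1:9997", "idx2:9997"])
```

Spreading events this way keeps any single indexer from becoming the bottleneck for ingestion.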