If you have ever wondered how much data is being transferred between your forwarders and indexers, we may have some help for you. Splunk now publishes these metrics to metrics.log, which is tailed and indexed in “_internal” by default.
Splunk uses a component called TcpOutputProcessor, which is configured using outputs.conf, to forward data to another Splunk or non-Splunk entity. This is what a lot of people refer to as a forwarder. Each TcpOutputProcessor instance publishes metrics events every 30 seconds – all the fields of these events are described below:
- group=tcpout_connections – this field discriminates this event as being a TcpOutput metric.
- tcpout_group_name:destIp:destPort – the load-balanced group that this metric belongs to. If you have multiple groups defined, a separate event is published for each of those groups.
- host metadata – always available in an event; refers to the host on which the forwarder is running.
- sourcePort – the local port that is used to connect to the remote entity.
- destIp – the IP address of the remote server to which events are being forwarded.
- destPort – the destination port on which events are being forwarded.
- tcp_bps – bytes per second averaged over the last 30 seconds.
- tcp_kbprocessed – total KBytes processed since this connection went live.
- tcp_eps – events per second averaged over the last 30 seconds.
- tcp_dropped_events – number of events dropped on this connection.
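If you ever export these events and want to work with them outside Splunk, the fields are easy to pull apart. Here is a minimal Python sketch, assuming the event body is a comma-separated list of key=value pairs. The sample line is made up from the field names listed above and is not copied from a real metrics.log, so the actual layout may differ; `parse_metrics_fields` is a hypothetical helper name.

```python
def parse_metrics_fields(text):
    """Split a 'key=value, key=value' string into a dict of strings."""
    fields = {}
    for part in text.split(","):
        part = part.strip()
        if "=" not in part:
            # Skip fragments that are not key=value pairs.
            continue
        key, value = part.split("=", 1)
        fields[key.strip()] = value.strip()
    return fields


# Hypothetical event body assembled from the field names above (assumption,
# not real log output).
sample = (
    "group=tcpout_connections, sourcePort=9089, destIp=10.1.1.5, "
    "destPort=9997, tcp_bps=2048.0, tcp_eps=12.5, tcp_dropped_events=0"
)

event = parse_metrics_fields(sample)
print(event["destIp"], float(event["tcp_bps"]))
```

All values come back as strings, so convert the numeric ones (tcp_bps, tcp_eps and so on) before doing any math on them.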
Similarly on the indexing side, if you have configured inputs.conf to receive data from one or more forwarders, a metrics event is published every 30 seconds for each connection into your indexer. All the fields of a metrics event on the input side are described below:
- group=tcpin_connections – this field discriminates this event as being an input metric.
- sourceHost – The hostname of the entity that is forwarding data to this indexer. If the hostname is not available, its IP address is used.
- sourcePort – The remote port of the forwarding entity.
- destPort – The local port on the input side for which this metric is being collected. Typically this port is defined in inputs.conf.
- tcp_bps – bytes per second averaged over the last 30 seconds.
- tcp_kprocessed – KBytes processed since the connection was established.
- tcp_eps – Events per second averaged over the last 30 seconds.
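Because tcp_bps is an average over a 30-second window, multiplying it by 30 gives a rough estimate of the bytes received on a connection during that window. A small Python sketch of that arithmetic follows; the host names and rates are invented for illustration, and `estimated_bytes` is a hypothetical helper.

```python
INTERVAL_SECONDS = 30  # metrics events are published every 30 seconds


def estimated_bytes(tcp_bps, interval=INTERVAL_SECONDS):
    """Approximate bytes moved in one interval from an average rate."""
    return tcp_bps * interval


# Made-up tcp_bps readings for two forwarders, one metrics event each.
connections = {"fwd-a": 2048.0, "fwd-b": 512.0}

per_host = {host: estimated_bytes(bps) for host, bps in connections.items()}
total = sum(per_host.values())
print(per_host, total)
```

Summing the per-connection estimates gives an approximate inbound volume for the indexer over the same window.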
These metrics give you a much closer look at the operation of your forwarders and indexers. Here’s a sample query that you can run on each indexer instance to get a report on throughput by forwarding entity:
index=_internal metrics "group=tcpin_connections" | timechart span=30s avg(tcp_bps) by sourceHost
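To make it clearer what that query computes, here is a rough Python equivalent of the bucketing and averaging step. It is only a sketch of the idea, not how Splunk implements timechart: it assumes each event has already been reduced to a dict with a numeric `time` in seconds (an assumed key), a `sourceHost` and a `tcp_bps`, and the sample events are invented.

```python
from collections import defaultdict


def average_bps_by_host(events, span=30):
    """Average tcp_bps per (time bucket, sourceHost) pair."""
    sums = defaultdict(lambda: [0.0, 0])  # key -> [running total, count]
    for ev in events:
        # Snap the timestamp down to the start of its span-second bucket.
        bucket = int(ev["time"]) // span * span
        key = (bucket, ev["sourceHost"])
        sums[key][0] += float(ev["tcp_bps"])
        sums[key][1] += 1
    return {key: total / count for key, (total, count) in sums.items()}


# Invented events for illustration only.
events = [
    {"time": 0, "sourceHost": "fwd-a", "tcp_bps": 1000.0},
    {"time": 10, "sourceHost": "fwd-a", "tcp_bps": 3000.0},
    {"time": 35, "sourceHost": "fwd-a", "tcp_bps": 500.0},
    {"time": 5, "sourceHost": "fwd-b", "tcp_bps": 200.0},
]

result = average_bps_by_host(events)
for (bucket, host), avg in sorted(result.items()):
    print(bucket, host, avg)
```

Each forwarder ends up with one averaged value per 30-second bucket, which is the series the chart plots per sourceHost.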
Also, I created a saved search, and used Splunk’s reporting features to always show me the current status on a dashboard.
Now that you have all of this nice data, I am sure you would like it all aggregated in one location.
Good luck playing with these metrics, and if you have any suggestions on what more you would like to see, drop me a line at email@example.com.