I’ve had this script kicking around for a while now, but never got around to publishing it… in the interest of getting it done, this post will be brief.
You may be aware that in Splunk 4.1, we introduced a completely rewritten Tailing Processor (the component that handles file monitor inputs). The rewrite included a prototype REST endpoint that provides realtime status of the Tailing Processor’s activities. It can be seen at https://localhost:8089/services/admin/inputstatus/TailingProcessor:FileStatus (on a stock installation), but quickly becomes unreadable with a large number of files being monitored.
The script (linked below) summarizes all of the entries at the endpoint, like so:
Some quick details about the output:
- Updated: when the status was last fetched, as well as how long the fetch/parse took.
- Dirs seen: number of directories the Tailing Processor knows about, whether ignored or monitored.
- Finished files: number of files that were fully read, and whose file descriptors have since been closed.
- Reading/open files: currently open files that are not at 100% completion yet. Also includes 100% completed files whose descriptors are still open temporarily (consider this to mean “files we’re just about done with”).
- Ignored items: files or directories that have been scanned, but not read. As listed in the screenshot, this can mean files that the splunkd process doesn’t have permissions to read, files that are considered binary, etc.
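To make those categories concrete, here is a rough sketch of how such a tally could be computed. This is not the code from fileStatus.py: the record layout (`is_dir`, `open`, `percent`, `ignored_reason`) is a hypothetical simplification, and the real endpoint's field names may well differ.

```python
from collections import Counter

# Hypothetical, simplified records. The real endpoint's fields may differ.
SAMPLE_ENTRIES = [
    {"path": "/var/log", "is_dir": True},
    {"path": "/var/log/private", "is_dir": True, "ignored_reason": "permission denied"},
    {"path": "/var/log/messages", "is_dir": False, "percent": 100.0, "open": False},
    {"path": "/var/log/big.log", "is_dir": False, "percent": 10.0, "open": True},
    {"path": "/var/log/done.log", "is_dir": False, "percent": 100.0, "open": True},
    {"path": "/var/log/wtmp", "is_dir": False, "ignored_reason": "binary"},
]


def summarize(entries):
    """Tally entries into the categories described in the list above."""
    summary = {
        "dirs_seen": 0,
        "finished_files": 0,
        "open_files": 0,
        "ignored": Counter(),  # reason -> count
    }
    for entry in entries:
        # Directories count as "seen" whether they are ignored or monitored.
        if entry.get("is_dir"):
            summary["dirs_seen"] += 1
        reason = entry.get("ignored_reason")
        if reason:
            # Scanned but not read (no permissions, binary file, and so on).
            summary["ignored"][reason] += 1
        elif not entry.get("is_dir"):
            # Any file whose descriptor is still open counts as reading/open,
            # even if it is already at 100%.
            if entry.get("open"):
                summary["open_files"] += 1
            else:
                summary["finished_files"] += 1
    return summary


result = summarize(SAMPLE_ENTRIES)
print(result["dirs_seen"], result["finished_files"], result["open_files"])
print(dict(result["ignored"]))
```

Note that the file at 100% with an open descriptor lands in the reading/open bucket, matching the "files we’re just about done with" description.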
Things can look a bit more interesting if you catch a large file in progress. Here, we have a ~1GB file at 10% completion – as the tool refreshes, this percentage will adjust accordingly:
- Simply run the script through the Python interpreter included with Splunk (you cannot use your system Python).
In the commands below, replace the Splunk and script paths with the appropriate ones for your installation:
Unix: /opt/splunk/bin/splunk cmd python /path/to/fileStatus.py
Windows: c:\program files\splunk\bin\splunk cmd python c:\temp\fileStatus.py
- Accepts “-interval #” where # is an integer. This sets how many seconds the script will wait before refreshing the endpoint. Defaults to 1.
- Accepts “-clear true|false”. If false is passed in, the terminal will not be cleared before refreshing the endpoint. This can be used to track long-term behavior of the Tailing Processor. Defaults to true.
- Accepts “-uri <uri>” to allow for pointing the script at another Splunk instance (see my other posts).
- Remember that the endpoint is a prototype, and thus has minor bugs – but you can basically trust it. For example, sometimes you’ll see a file completion percentage larger than 100% – this just means the file keeps growing. Eventually it will be labelled 100% again. As Deep has been known to say, “that <stuff> happens, be cool about it.”
- The script can work with a very large number of files being monitored, but the current implementation will chew through RAM. The largest I tried was 450,000 files, which took up a couple of gigs of memory.
- If you look at the main() implementation in the file, you’ll see a hacky way to create your own Python Splunk CLI command, taking advantage of the CLI’s auth features and whatnot. Not that you would need to, but I think it’s cool.
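For reference, the three options described above could be parsed along these lines. This is a minimal sketch using the standard `argparse` module on a current Python, not the script's actual main(). The helper names (`parse_bool`, `build_parser`, `endpoint_url`) are made up for illustration, and the `-uri` default is an assumption based on the stock-installation address mentioned earlier.

```python
import argparse

# Endpoint path quoted earlier in this post.
STATUS_PATH = "/services/admin/inputstatus/TailingProcessor:FileStatus"


def parse_bool(text):
    """Turn the literal strings 'true' / 'false' into a bool."""
    value = text.strip().lower()
    if value in ("true", "false"):
        return value == "true"
    raise argparse.ArgumentTypeError("expected true or false, got %r" % text)


def build_parser():
    parser = argparse.ArgumentParser(description="Summarize Tailing Processor file status")
    # Seconds to wait between refreshes of the endpoint.
    parser.add_argument("-interval", type=int, default=1)
    # Whether to clear the terminal before each refresh.
    parser.add_argument("-clear", type=parse_bool, default=True)
    # Assumed default: the management port of a stock local installation.
    parser.add_argument("-uri", default="https://localhost:8089")
    return parser


def endpoint_url(base_uri):
    """Join a base management URI with the status endpoint path."""
    return base_uri.rstrip("/") + STATUS_PATH


# Parse an explicit argument list so the example does not depend on sys.argv.
args = build_parser().parse_args(["-interval", "5", "-clear", "false"])
print(args.interval, args.clear, endpoint_url(args.uri))
```

Passing `-clear false` together with a longer `-interval` gives a scrolling history of summaries rather than a single refreshing screen, which is handy for the long-term tracking mentioned above.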
Download: here (md5sum: 217418d8c1a88632a6d28685ee28e7c9)
Better late than never, yes?