I've searched high and low for someone who has attempted this or confirmed a solution, but I'm stuck.
I'm forwarding Apache access logs from a Linux server with a Splunk lightweight forwarder. The logs are rolled every day with cronolog. They forward to a single Splunk index/search server.
The lightweight forwarder is set up to monitor access_log_* in the logs directory (our access log names take the form access_log_[date].txt). This part works great; I'm providing it for context.
Our webservers are behind a layer 5 switch, which requests a simple keepalive.html file to determine whether a webserver is available or should be failed over, and which uses round robin for load balancing. This means that our access logs are full of GET requests for keepalive.html. We need those entries in the logs to verify content switch functionality when doing routine maintenance, but they clog up Splunk.
I know I can set up a search with | delete to purge those entries from the Splunk index, or just add a NOT to the search to exclude them, but they take up space, consume forwarding bandwidth, and throw off our total event counts, so I'd rather not have them forwarded at all.
Is this possible? Is there a way to filter the contents of a log file before forwarding to the Splunk indexing/searching server?
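To make the goal concrete, here is a minimal Python sketch of the filtering behavior I'm after. It is not a Splunk feature or configuration, only an illustration of which lines should be dropped. The regular expression, the function names, and the assumption that keepalive.html sits at the web root and is fetched with GET or HEAD are all hypothetical.

```python
import re

# Hypothetical pattern: the quoted request field of an Apache common/combined
# log entry, for a GET or HEAD of /keepalive.html (assumed to be at the web root).
KEEPALIVE_RE = re.compile(r'"(?:GET|HEAD) /keepalive\.html(?:\?\S*)? HTTP/[\d.]+"')


def is_keepalive(line: str) -> bool:
    """Return True if the log line is a health-check request for keepalive.html."""
    return KEEPALIVE_RE.search(line) is not None


def drop_keepalives(lines):
    """Return only the log lines that should still be forwarded."""
    return [line for line in lines if not is_keepalive(line)]
```

In other words, the keepalive entries would stay in the log files on the webserver for maintenance checks, and only the remaining lines would reach the indexer.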