TIPS & TRICKS

More frequent alerts with CLI dispatch

The saved search scheduler that the UI uses runs into trouble when you start running a bunch of searches at the same time. It kicks off one, waits for it to return or time out, and then moves on to the next. If the searches take more than a few seconds to run, or there are dozens of them all running at high frequency, it gets overloaded. One way to address this is to take advantage of the new dispatch (asynchronous search). Dispatch is what is behind the REST API search functions, and you can also get to it from the CLI with the “dispatch” command instead of the old “search.”

Old CLI search:

./splunk search "sourcetype=access_combined googlebot | stats count" -maxresults 500
count
-----
213

New CLI search:

./splunk dispatch "sourcetype=access_combined googlebot | stats count"
count
-----
213

While the results look the same for this simple search, a lot is different behind the scenes. The search command needs to load all the events it touches into memory, so there is only so much of the index it can search at one time. The data generation part, before the pipe, will only return maxresults events, which may not be all of the matching events. If you then filter with additional search commands, you won’t get everything you expect. You can increase maxresults (the default for the CLI is 100), but you can only push it so far before you run into memory problems.

The dispatch search kicks off a job that runs until completion, no matter how long it takes. But one thing to keep in mind is that CLI dispatch is designed for reporting: the actual results are all in memory so you can’t get back thousands of results from a single search. Use reporting commands like stats or narrow your searches so they won’t have more than a couple hundred results. (If you need more, write something that uses the REST API where you have access to job control.)

So here is how this applies to alerting:

In the UI, when a scheduled search runs, it uses a search command to actually generate the alert. There are a couple different ones, but as most people want an email I’ll focus on sendemail. (Docs here: http://www.splunk.com/doc/3.3/user/UnsupportedCommands#sendemail.)

Any search can use the sendemail search command; it’s not limited to the UI. So I can do this:

./splunk dispatch "error | sendemail to=sysadmins@example.com from=splunk@example.com"

This runs the search and then looks for a mail server (by default on the local machine) to send the message. Since it’s using dispatch, you can kick off a bunch of these and they will all run independently of each other. You can look at the jobs from the REST endpoint:

https://localhost:8089/services/search/jobs

Splunk Atom Feed: jobs
Updated: 2008-07-14T10:39:16-0700 Splunk build: 38343
dispatch
cursorTime 1969-12-31T16:00:00.000-08:00
error
eventCount 316
isDone 1
isFinalized 0
isPaused 0
isStreaming 0
keywords sudo
resultCount 100
sid 1216057125.31
ttl 3570.9 seconds
events – results – timeline – summary –
control:

2008-07-14T10:38:47.000-07:00 | admin
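
If you would rather check on jobs from a script than from a browser, something like the following might work. This is only a sketch: the list_jobs function name is made up for illustration, and it assumes the management port accepts HTTP basic auth and uses a self-signed certificate (hence -k). Neither assumption is confirmed above, so adjust for your install.

```shell
# Sketch: fetch the jobs feed from the management port with curl.
# Assumptions (not confirmed by the article): HTTP basic auth is accepted,
# and the certificate is self-signed, so verification is skipped with -k.
# list_jobs is a hypothetical helper name.
list_jobs() {
    user=$1                      # Splunk user name
    pass=$2                      # Splunk password
    host=${3:-localhost:8089}    # management host:port, defaults to the local instance
    curl -s -k -u "$user:$pass" "https://$host/services/search/jobs"
}
```

Calling list_jobs admin changeme would then print the raw feed for the local instance.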

Here’s an example I set up on my local machine, an OS X 10.5 box which uses postfix. I’ve already made sure postfix is running and I can receive mail to my local account.

I wrote a script that runs 50 searches, each set to alert an email address. Note the -auth option in each command: if you aren’t already authenticated, you need to pass it as part of the CLI search. In a production environment, you would want a more sophisticated means of handling login credentials than sticking plaintext into a script. (You could also use a restricted user created only for CLI searches.)
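
One small step up from plaintext in the script is to keep the credentials in a separate file that only its owner can read. Here is a minimal sketch of that idea; the read_splunk_auth function and the file location are hypothetical, not anything Splunk provides.

```shell
# Sketch: read "user:password" from a file that only its owner can access.
# read_splunk_auth and the file path are hypothetical names for illustration.
read_splunk_auth() {
    file=$1
    [ -r "$file" ] || { echo "cannot read $file" >&2; return 1; }
    # Characters 5-10 of the ls -l mode string are the group and other bits.
    perms=$(ls -l "$file" | cut -c5-10)
    [ "$perms" = "------" ] || { echo "$file is group/world accessible" >&2; return 1; }
    # Print the first line only, which should hold user:password.
    head -n 1 "$file"
}
```

You could then write -auth "$(read_splunk_auth "$HOME/.splunk_auth")" in place of the literal credentials. Keep in mind that the password still shows up in the process list while the command runs, so this only keeps it out of the script itself.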

[root]:/opt/splunk3.3/bin$ more alert_overload.sh
./splunk dispatch "sudo | sendemail to=feorlen from=foo01" -auth admin:changeme &
./splunk dispatch "sudo | sendemail to=feorlen from=foo02" -auth admin:changeme &
./splunk dispatch "sudo | sendemail to=feorlen from=foo03" -auth admin:changeme &
./splunk dispatch "sudo | sendemail to=feorlen from=foo04" -auth admin:changeme &
./splunk dispatch "sudo | sendemail to=feorlen from=foo05" -auth admin:changeme &
./splunk dispatch "sudo | sendemail to=feorlen from=foo06" -auth admin:changeme &
./splunk dispatch "sudo | sendemail to=feorlen from=foo07" -auth admin:changeme &
[…]
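
Rather than typing out 50 nearly identical lines, a short loop can generate them. The sketch below prints commands in the same shape as the script above; the gen_dispatch_cmds function and its arguments are hypothetical, chosen only for this illustration.

```shell
# Sketch: print N dispatch commands instead of writing them out by hand.
# gen_dispatch_cmds and its argument order are hypothetical.
gen_dispatch_cmds() {
    count=$1      # how many searches to generate
    search=$2     # search term, e.g. sudo
    to=$3         # recipient address
    auth=$4       # user:password passed to -auth
    i=1
    while [ "$i" -le "$count" ]; do
        # %02d gives the zero-padded sender names foo01, foo02, ...
        printf './splunk dispatch "%s | sendemail to=%s from=foo%02d" -auth %s &\n' \
            "$search" "$to" "$i" "$auth"
        i=$((i + 1))
    done
}
```

For example, gen_dispatch_cmds 50 sudo feorlen admin:changeme > alert_overload.sh writes out a 50-line test script.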

When I run this script, it starts up all these searches. (Note that each one starts up another Python process! Keep that in mind.) When they complete, they send an email alert.

N 16 foo13@AndreasSplunkP Mon Jul 14 10:59 393/280230 "Splunk Results"
N 17 foo17@AndreasSplunkP Mon Jul 14 10:59 393/280230 "Splunk Results"
N 18 foo11@AndreasSplunkP Mon Jul 14 10:59 393/280230 "Splunk Results"
N 19 foo32@AndreasSplunkP Mon Jul 14 10:59 393/280230 "Splunk Results"
N 20 foo09@AndreasSplunkP Mon Jul 14 10:59 393/280230 "Splunk Results"
? s* dispatch_test.mbox
"dispatch_test.mbox" [New file]
? x
AndreasSplunkPowerbook-2[feorlen]:~$ grep ^From: dispatch_test.mbox | wc -l
50

The messages don’t arrive in the same order, but they do arrive. For these 50 test searches, it took about 20 seconds for all of them to complete. More complicated searches will take longer. One thing to know is that if you start searches faster than they can complete, for example starting a search every minute that takes two minutes to run, they will back up and take a while to finish. There is no hard guideline, as it depends on the individual searches and the overall load on the instance.
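
One way to keep a slow search from piling up on itself is to skip a run when the previous run of the same alert hasn't finished. The sketch below uses a lock directory for that. It assumes the CLI dispatch command doesn't return until its job is done, and the run_unless_running name and lock path are hypothetical.

```shell
# Sketch: skip this run if the previous run of the same alert is still going.
# mkdir either creates the directory or fails, atomically, so it works as a lock.
# run_unless_running and the lock path are hypothetical names.
run_unless_running() {
    lock=$1
    shift
    if ! mkdir "$lock" 2>/dev/null; then
        echo "previous run still active, skipping" >&2
        return 75
    fi
    "$@"             # run the wrapped command, e.g. ./splunk dispatch ...
    status=$?
    rmdir "$lock"    # release the lock once the command has returned
    return "$status"
}
```

A call might look like run_unless_running /tmp/sudo_alert.lock ./splunk dispatch "sudo | sendemail to=feorlen from=foo01" -auth admin:changeme. If the wrapper is killed mid-run, the lock directory is left behind and has to be removed by hand.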

Posted by Splunk
