
I wrote a WordPress plugin (tested with WordPress 2.5.1) that displays, in my sidebar, the most recent Google search terms that brought visitors to my site. It was an experiment in using the Splunk REST API and the PHP SDK.
You can configure the widget from the Widgets page, and it supports multiple instances with different configurations. Right now the actual search string is hardcoded because I’m doing some extra mangling to get the search terms the way I want anyway, but I’ll be adding it to the configuration options too. Eventually there will be a way to cache results so the search doesn’t run every time the page is loaded.
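Caching isn’t in the widget yet, but here is a minimal sketch of one way it could work: keep the last results in a small file and reuse them until they are older than a chosen number of seconds. The function name cached_results, the cache file location and the stand-in fetch callback are all made up for illustration, and the closure syntax assumes PHP 5.3 or later.

```php
<?php
// Hypothetical helper, not part of the plugin: return cached results if the
// cache file is younger than $ttlSeconds, otherwise call $fetch and store
// whatever it returns.
function cached_results($cacheFile, $ttlSeconds, $fetch)
{
    // PHP caches stat() data, so clear it before checking the file's age.
    clearstatcache();

    if (is_file($cacheFile) && (time() - filemtime($cacheFile)) < $ttlSeconds) {
        $cached = json_decode(file_get_contents($cacheFile), true);
        if (is_array($cached)) {
            return $cached;
        }
    }

    // Cache is missing, stale or unreadable: run the real lookup.
    $results = call_user_func($fetch);
    file_put_contents($cacheFile, json_encode($results), LOCK_EX);

    return $results;
}

// Example use. The callback stands in for the real Splunk search.
$cacheFile = sys_get_temp_dir() . '/splunk_widget_cache_example.json';
$terms = cached_results($cacheFile, 300, function () {
    return array('drum carder', 'yarn');
});
print_r($terms);
```

A file is about the simplest thing that works on any host. With several widget instances, each one would need its own cache file.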
Since there is still work to do to make it more generic, I haven’t uploaded it to the WordPress site. But here is the basic PHP code to play around with. In fine programming tradition, I learned quite a lot by picking apart existing WordPress widgets, in this case Random Image and Twitter Tools. This widget requires the Splunk PHP SDK. By default my code expects it to be in the same directory as the widget (which is probably going to be something like wp/wp-content/plugins/widgetname). The SDK has a few dependencies of its own; you can find the details at the Google Code page.
You can find the widget here:
splunk_statsphp1
Note: updated version posted 31 July 08.
Here’s a sample of the kinds of events I’m looking at. I have some extra field extractions because it’s a custom format and not exactly access_combined, but I get the referer in there. What I want to display is the actual search string, in this case “drum+carder”. I have to strip out the ‘+’ between words because otherwise it doesn’t wrap nicely in my narrow sidebar. (I’m sure I could fix this in my theme somehow but Eric Meyer I’m not.)
xxx.xxx.xxx.xxx [15/Jul/2008:12:08:07 -0700] "GET /tag/drum-carder/ HTTP/1.1" 200 "http://www.google.com/search?hl=en&pwst=1&q=drum+carder&start=10&sa=N"
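To show what getting from that referer to a displayable search string involves, here is a small standalone PHP sketch. It is not the code from the widget, and the function name extract_search_terms is made up for the example. It relies on PHP’s parse_str(), which URL-decodes values and so turns the ‘+’ into a space.

```php
<?php
// Hypothetical helper, not part of the plugin: pull the search terms out of
// a referer URL like the one in the sample event.
function extract_search_terms($referer)
{
    // Keep only the query string part of the URL.
    $query = parse_url($referer, PHP_URL_QUERY);
    if (!is_string($query) || $query === '') {
        return null;
    }

    // parse_str() URL-decodes each value, so "drum+carder" becomes "drum carder".
    parse_str($query, $params);
    if (!isset($params['q']) || !is_string($params['q'])) {
        return null;
    }

    $terms = trim($params['q']);
    return $terms === '' ? null : $terms;
}

$referer = 'http://www.google.com/search?hl=en&pwst=1&q=drum+carder&start=10&sa=N';
echo extract_search_terms($referer), "\n"; // prints "drum carder"
```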
You can go look at the code if you really want to know, but here are a few comments on what it’s doing:
I only want a couple of results, so to make the search as fast as possible I’m limiting what I get back.
// how many results to get?
$dispatchProps['max_count'] = 3;
There’s also no need to keep the default time to live, so I set the timeout to something reasonable. It could be even smaller.
// don’t leave the search hanging around
$dispatchProps['timeout'] = 300;
It’s a pretty simple search: the automatic key/value extraction already pulls the q= value out of the referer field.
// using head to get only what I want makes the search way faster
$job_id = $searchMgr->syncSearch('search sourcetype="spinnyspinny_access_log" google search | head 3', $dispatchProps);
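Since the search string is hardcoded for now, here is a rough sketch of how it might be assembled from widget settings instead. The helper build_search_string and its parameters are hypothetical; it only builds the string and doesn’t touch the SDK.

```php
<?php
// Hypothetical helper, not part of the plugin: build the search string from
// configurable pieces instead of hardcoding it.
function build_search_string($sourcetype, $keywords, $limit)
{
    // Drop double quotes so the sourcetype cannot break out of its quoted value.
    $safeSourcetype = str_replace('"', '', $sourcetype);

    // Always ask for at least one result.
    $limit = max(1, (int) $limit);

    return sprintf(
        'search sourcetype="%s" %s | head %d',
        $safeSourcetype,
        trim($keywords),
        $limit
    );
}

echo build_search_string('spinnyspinny_access_log', 'google search', 3), "\n";
// prints: search sourcetype="spinnyspinny_access_log" google search | head 3
```

The result would then be passed to syncSearch in place of the literal string, with the same limit reused for max_count.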
Here’s what it looks like in my sidebar:
If you want to see it in action, I have it installed in my personal blog at http://www.feorlen.org. It is pulling statistics about my other site at http://www.spinnyspinny.com, which gets a lot of search engine hits from Google. If you want to test it, search for “spinnyspinny” and some other relevant keywords like “yarn” and you will find my site. Don’t go abusing it now, because you know that Splunk will be telling me your IP!