Diagraming Splunk’s data-flow (part 2 – performance overlays)

In my previous post, “Diagraming Splunk’s data-flow”, I wrote a small Python script that parsed Splunk’s runtime environment ($SPLUNK_HOME/var/run/splunk/composite.xml) and generated a file which, when fed to Graphviz, produces an architectural diagram of how pipelines and processors are wired together.

In this installment, I took it to the next level by using Splunk’s search capability to overlay performance metrics on the diagram. The combination of Splunk logging metrics information for each processor within each pipeline (thanks Brad) and the ability to have Splunk execute a search processor written in Python made this possible. Here is how you use it:

First, download Graphviz. I particularly like the OS X application because it shows the graph on screen and, as the file changes, reflects those changes in the graph you are viewing. If you don’t have a Mac, use the command-line version to generate output in file formats such as .jpeg.
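
For the command-line route, here is a minimal sketch of driving Graphviz from Python. It assumes the `dot` binary is on your PATH; the function names and the file names in the usage note are made up for illustration and are not part of my script.

```python
import shutil
import subprocess


def build_dot_command(dot_file, output_file, fmt="png"):
    """Return the argv list that renders dot_file to output_file."""
    # -T selects the output format, -o names the output file.
    return ["dot", "-T" + fmt, dot_file, "-o", output_file]


def render(dot_file, output_file, fmt="png"):
    """Run Graphviz if the dot binary is on PATH; return True if it ran."""
    if shutil.which("dot") is None:
        return False  # Graphviz is not installed or not on PATH
    subprocess.run(build_dot_command(dot_file, output_file, fmt), check=True)
    return True
```

Calling `render("splunk.dot", "splunk.jpg", fmt="jpg")` would write a JPEG next to the dot file, provided your Graphviz build includes JPEG support.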

Go to SplunkBase to download my Python script. Copy the .py file into $SPLUNK_HOME/etc/searchscripts.

Start Splunk.

Type the following into the search box:

index::_internal metrics pipeline processor NOT get | perfgraph

This will search for the appropriate metrics information and pipe the results through the script.
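
To give a feel for what the script has to work with, here is a rough sketch of turning one metrics event into a Python dict. The sample line and its field names (cpu_seconds, executes, cumulative_hits) are assumptions modeled on how processor metrics events are typically laid out; check your own _internal index for the exact fields.

```python
import re

# Matches key=value pairs separated by commas, e.g. "processor=utf8".
_PAIR = re.compile(r"(\w+)=([^,\s]+)")


def parse_metrics_event(raw):
    """Turn one metrics event into a dict, converting numeric values."""
    fields = {}
    for key, value in _PAIR.findall(raw):
        try:
            fields[key] = float(value) if "." in value else int(value)
        except ValueError:
            fields[key] = value  # keep non-numeric values as strings
    return fields


# Hypothetical sample event, for illustration only.
sample = ("group=pipeline, name=parsing, processor=utf8, "
          "cpu_seconds=0.250000, executes=12, cumulative_hits=340")
event = parse_metrics_event(sample)
```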

There are 2 options to perfgraph:

perfgraph [output filename] [cpu, execs, cumhits]

Unfortunately (because I’m lazy), you can’t specify cpu, execs or cumhits without also specifying an output file. The first parameter is the full path and file name of the ‘dot’ file you wish to create. It defaults to /tmp/
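
The restriction falls out of reading the arguments strictly by position. The sketch below shows one way such parsing might look; it is not the code from my script, and the default output path is passed in by the caller rather than hard-coded.

```python
VALID_MODES = ("cpu", "execs", "cumhits")


def parse_perfgraph_args(args, default_output):
    """Interpret [output filename] [mode] strictly by position."""
    # The first argument is always taken as the output file name.
    output = args[0] if len(args) >= 1 else default_output
    # The second argument, if present, selects the highlight mode.
    mode = args[1] if len(args) >= 2 else "none"
    if mode != "none" and mode not in VALID_MODES:
        raise ValueError("unknown highlight mode: %s" % mode)
    return output, mode
```

With this scheme, a lone `cpu` argument is read as the output file name and the mode stays at ‘none’, which is why the output file has to come first.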

The second parameter, if specified, tells the script to highlight in red the slowest processor (cpu), the processor with the most hits (execs) or the processor with the most cumulative hits (cumhits). This parameter defaults to ‘none’, or no highlighting.
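
The highlighting itself boils down to finding the processor with the largest value for the chosen metric and giving its node a red fill in the dot output. Here is a minimal sketch of that idea; the metric field names, the label layout and the helper names are assumptions for illustration, not the script's actual internals.

```python
# Maps each perfgraph mode to the metric it ranks by (field names assumed).
MODE_TO_FIELD = {"cpu": "cpu_seconds", "execs": "executes",
                 "cumhits": "cumulative_hits"}


def pick_highlight(stats, mode):
    """Return the processor with the largest value for mode, or None."""
    field = MODE_TO_FIELD.get(mode)
    if field is None or not stats:
        return None  # mode 'none' (or unknown) means no highlighting
    return max(stats, key=lambda name: stats[name].get(field, 0))


def dot_nodes(stats, mode="none"):
    """Emit one DOT node statement per processor, metrics in the label."""
    hot = pick_highlight(stats, mode)
    lines = []
    for name in sorted(stats):
        m = stats[name]
        # "\\n" is a literal backslash-n, which DOT renders as a line break.
        label = "%s\\ncpu=%.2f execs=%d cumhits=%d" % (
            name, m.get("cpu_seconds", 0.0), m.get("executes", 0),
            m.get("cumulative_hits", 0))
        style = ', style=filled, fillcolor="red"' if hot == name else ""
        lines.append('  "%s" [label="%s"%s];' % (name, label, style))
    return lines


# Hypothetical per-processor totals, for illustration only.
stats = {
    "utf8": {"cpu_seconds": 0.25, "executes": 12, "cumulative_hits": 340},
    "linebreaker": {"cpu_seconds": 1.5, "executes": 30, "cumulative_hits": 90},
}
nodes = dot_nodes(stats, "cpu")
```

Wrapping the returned lines in `digraph { ... }` gives a file Graphviz can render, with the busiest processor filled in red.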

The above search string results in the following graph (portion). Notice the performance information overlaid onto the processors:

If you specify the output file and ‘cpu’, the processor with the most CPU time will be highlighted. Here is the search:

index::_internal metrics pipeline processor NOT get | perfgraph cpu

It results in the following graph (portion). Notice the red processor:

Next steps:

  • Overlay queue metrics into the queue nodes
  • Overlay indexer throughputs into the indexer nodes

You see. Splunk provides endless fun. Insane! Enjoy.
