
Splunk ate my homework…

Back when I first joined Splunk, I recall our CEO mentioning how Splunk could do everything – including his son’s homework.  Using Splunk to replace Excel for graphing/reporting is a cute trick, but I never thought it might actually be useful for real homework.   Well, fast forward about 3 years and many use-cases later…

Last week, I was brainstorming with a Master’s degree student about how to gather metrics on their group project for an Advanced Computer Architecture class at Santa Clara University.  After a few minutes of discussion, the student decided that a script to ingest, parse, and output csv data would be the right solution.  From there, they could plot things in Excel using the csv file.  I commended the student for their programming skills and strong work ethic.  However, I then loudly cried, “Splunk does all of that!”  They replied with doubt, but I reassured them I could solve their hours of busy work in about 15 minutes.

I was wrong.   It took me about 30 minutes, but that was because downloading took 5 minutes and the searches required more thinking than I initially thought.  So in this blog, I’ll walk through the steps I took to accomplish this task…

Let us start with the data sample.  There are approximately 860 text files containing the results of various test runs.  Each data file is called results.txt, and the files are spread across a tree of directories (which also contain other files) whose names describe the parameters of each test run.  Splunk can index any textual data, and it can be told precisely which files to index.

Here is a sample of the data file:

sim: ** simulation statistics **
sim_num_insn               12863937 # total number of instructions committed
sim_num_refs                3981186 # total number of loads and stores committed
sim_num_loads               2707356 # total number of loads committed
sim_num_stores         1273830.0000 # total number of stores committed
sim_num_branches            1849006 # total number of branches committed
....
....
ruu_occupancy                2.4577 # avg RUU occupancy (insn's)
ruu_rate                     0.6956 # avg RUU dispatch rate (insn/cycle)
ruu_latency                  3.5329 # avg RUU occupant latency (cycle's)
ruu_full                     0.0000 # fraction of time (cycle's) RUU was full
LSQ_count                  18201329 # cumulative LSQ occupancy
LSQ_fcount                        0 # cumulative LSQ full count
lsq_occupancy                0.7169 # avg LSQ occupancy (insn's)
lsq_rate                     0.6956 # avg LSQ dispatch rate (insn/cycle)
lsq_latency                  1.0306 # avg LSQ occupant latency (cycle's)

Within each results.txt file, there are approximately 70 lines that contain parameters we want to report on.  Each directory also contains other files that I probably don’t want to index.  The file paths (800+ of them) look as follows:

/results/whetstone_80/in_order_true/pipeline_integer_8/pipeline_fpu_8/mem_port_64/branch_predictor_2lev/results.txt
/results/whetstone_80/in_order_true/pipeline_integer_8/pipeline_fpu_8/mem_port_64/branch_predictor_bimod/results.txt
/results/whetstone_80/in_order_true/pipeline_integer_8/pipeline_fpu_8/mem_port_64/branch_predictor_nottaken/results.txt
/results/whetstone_80/in_order_true/pipeline_integer_8/pipeline_fpu_8/mem_port_64/branch_predictor_perfect/results.txt
/results/whetstone_80/in_order_true/pipeline_integer_8/pipeline_fpu_8/mem_port_64/branch_predictor_taken/results.txt
/results/whetstone_80/in_order_true/pipeline_integer_8/pipeline_fpu_8/mem_port_8/branch_predictor_2lev/results.txt
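
Each directory name in these paths appears to pair a parameter name with its value, separated by the last underscore.  As a rough illustration outside of Splunk, a small Python helper could split a path into those pairs.  The helper name `parse_run_parameters` is hypothetical, the naming convention is an assumption inferred only from the sample paths above, and what a component such as whetstone_80 really means is a guess:

```python
from pathlib import PurePosixPath


def parse_run_parameters(source):
    """Split each directory name at its last underscore into name -> value.

    Assumption: every parameter directory is named <name>_<value>.
    """
    params = {}
    # .parent drops the file name, leaving only the directory components.
    for part in PurePosixPath(source).parent.parts:
        name, sep, value = part.rpartition("_")
        if sep:  # skip components without an underscore ("/" and "results")
            params[name] = value
    return params


path = (
    "/results/whetstone_80/in_order_true/pipeline_integer_8/"
    "pipeline_fpu_8/mem_port_64/branch_predictor_2lev/results.txt"
)
params = parse_run_parameters(path)
```

For the first sample path this yields six pairs, for example branch_predictor mapped to 2lev and mem_port mapped to 64.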

Note that each sub-path identifies a different test run, with varying input parameters. At this point, I came to the following conclusions about how to tackle the problem:

  1. Use a whitelist so that only the results.txt files are indexed
  2. Turn off line-merging, so each line is treated as an individual event
  3. Use a CRC salt, since the file headers could be identical or the tests could produce duplicate results
  4. Create a simple regex to extract a field for reporting on the specified parameter

In Splunk, I could do most of this through the GUI, including setting the appropriate root path (/results) and the whitelist (results.txt).  Additional settings, such as the CRC salt, have to be added manually.  Note that I also set the sourcetype of this input to “results”.  By doing this, I am classifying the data source and making it easier to identify in my searches.  The $SPLUNK_HOME/etc/apps/search/local/inputs.conf file looks as follows:

[monitor:///Users/syep/bd/results]
disabled = false
followTail = 0
whitelist = results\.txt$
crcSalt = <SOURCE>
sourcetype = results
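
Why the CRC salt matters: Splunk recognizes a file it has already seen by a checksum over the beginning of the file (the first 256 bytes, as I understand the default), so two result files that begin identically can look like the same file.  Setting crcSalt = <SOURCE> mixes the file’s full path into that checksum.  The Python sketch below only imitates the idea with zlib.crc32; it is not Splunk’s actual implementation, and the file contents, the padding, and the run_a / run_b paths are made up for illustration:

```python
import zlib


def initial_crc(content, salt=""):
    """CRC-32 of an optional salt followed by the first 256 bytes of content."""
    return zlib.crc32(salt.encode("utf-8") + content[:256])


# Two hypothetical result files sharing a header longer than 256 bytes
# and differing only further down.
header = b"sim: ** simulation statistics **\n" + b"x" * 300
file_a = header + b"lsq_latency 1.0306\n"
file_b = header + b"lsq_latency 1.3190\n"

# Without a salt the two files are indistinguishable by their initial CRC.
unsalted_equal = initial_crc(file_a) == initial_crc(file_b)

# Salting with the (hypothetical) source path separates them again.
salted_equal = (
    initial_crc(file_a, "/results/run_a/results.txt")
    == initial_crc(file_b, "/results/run_b/results.txt")
)
```

Here unsalted_equal is True and salted_equal is False, which is the situation the salt is meant to resolve.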

The next step is to force Splunk to treat each line as an individual event. This is because we have individual lines with unique parameters we want to report on. I used the following settings in my $SPLUNK_HOME/etc/apps/search/local/props.conf file:

[results]
SHOULD_LINEMERGE = false

With these changes in place, I restarted Splunk to pick up the inputs and props changes.  It’s important to note that I did not add the input through the GUI, because doing so would immediately launch indexing of all the files.  A simple search for “sourcetype=results” yielded the raw data in the correct form within the Splunk UI.

The next challenge was creating a way for the student to report on individual result parameters.  A simple keyword search for the parameter, in combination with a field extraction (regex), would do the trick.  My first example reports on the parameter “lsq_latency”.  The raw event looks as follows:

lsq_latency                  1.3190 # avg LSQ occupant latency (cycle's)

To extract the 1.3190 as a field value, I used the rex command (you could also use the Interactive Field Extractor through the GUI):

rex "\S+\s+(?<result_value>\S+)\s#"
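
The extraction works by capturing the second whitespace-delimited token, the one just before the “ #” comment marker, into a named group called result_value (the field name that the table command in the final search refers to).  If you want to sanity-check the pattern outside of Splunk, here is a minimal Python sketch.  Note that Python spells named groups (?P<name>...) where rex uses (?<name>...); the variable names are my own:

```python
import re

# Same pattern as the rex command, with Python's named-group syntax.
RESULT_PATTERN = re.compile(r"\S+\s+(?P<result_value>\S+)\s#")

event = "lsq_latency                  1.3190 # avg LSQ occupant latency (cycle's)"

match = RESULT_PATTERN.search(event)
# None if the line does not have the "name value # comment" shape.
result_value = match.group("result_value") if match else None
```

For the raw event above, result_value ends up as the string 1.3190.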

The final output needed to be a csv file showing the result value and the path name.  The path name is important because it shows the test parameters used in each test run.  My final search to create a table output for lsq_latency was as follows:

sourcetype=results lsq_latency | rex "\S+\s+(?<result_value>\S+)\s#" | table source result_value
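
For comparison, this is roughly what the hand-written script the student had in mind might look like.  It is only a sketch under assumptions: the function names collect_results and write_csv are hypothetical, the keyword match is a plain case-sensitive substring test rather than Splunk’s keyword search, and it re-reads every file on each run instead of querying an index:

```python
import csv
import os
import re

RESULT_PATTERN = re.compile(r"\S+\s+(?P<result_value>\S+)\s#")


def collect_results(root, keyword):
    """Yield (source, result_value) for each results.txt line containing keyword."""
    for dirpath, _dirnames, filenames in os.walk(root):
        if "results.txt" not in filenames:  # ignore every other file
            continue
        source = os.path.join(dirpath, "results.txt")
        with open(source, encoding="utf-8") as handle:
            for line in handle:  # one line plays the role of one event
                if keyword not in line:
                    continue
                match = RESULT_PATTERN.search(line)
                if match:
                    yield source, match.group("result_value")


def write_csv(rows, out_path):
    """Write the rows with the same two columns as the table command."""
    with open(out_path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        writer.writerow(["source", "result_value"])
        writer.writerows(rows)
```

A call such as write_csv(collect_results("/results", "lsq_latency"), "lsq_latency.csv") would produce the csv in one pass, where the output file name is just an example.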

I then saved this search as an example, so that the keyword parameter could be modified for easy reproduction.  All the student would need to do is change the keyword every time they want a different report.  The final piece was to create a csv file, which is as simple as using the pull-down menu to export the results.  The final output looks as follows:

With all of the fields created, we can now perform any type of reporting or analytics against these test runs.  For now, we have just created a table view to report and sort on specific parameters.  Now you can say it is true, Splunk ate your homework.

----------------------------------------------------
Thanks!
Simeon Yep
