This page last updated: 06/30/08 04:06pm

Summary indexing

Summary indexing makes reports on large datasets over long time spans run more efficiently. It saves the results of a scheduled search into a special summary index that you designate. You can then search and run reports on this smaller, specially generated summary index instead of working with the much larger original data set.

You can use summary indexing to:

  • index aggregate results
  • index running statistics (such as a running total)
  • index rare original events into a smaller index for more efficient reporting

For example, you may want to run a report at the end of every month that tells you how many page views and visitors each of your Web sites had, broken out by site. If you run this report only at the end of the month, it could take a very long time because Splunk has to look through a great deal of data to extract the information you want. With summary indexing, you instead schedule a saved search that runs periodically over smaller slices of time. Each run saves its results (covering the time since the previous run) into a special summary index. You can then run an "end of the month" report on the data in this much smaller index.
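The monthly page-view example can be sketched in Python to show why reporting on the summary is cheaper than rescanning the raw data. This is only an illustration of the idea, not how Splunk implements summary indexing: the function names (summarize_slice, monthly_report), the site field, and the sample events are all hypothetical, and the sketch counts page views only to stay short.

```python
from collections import Counter

def summarize_slice(events):
    """Reduce one time slice of raw events to page-view counts per site."""
    return Counter(event["site"] for event in events)

def monthly_report(summary_rows):
    """Add up the small per-slice summaries instead of rescanning raw events."""
    total = Counter()
    for row in summary_rows:
        total.update(row)  # Counter.update adds counts together
    return dict(total)

# Two hypothetical time slices of raw page-view events.
slice_1 = [
    {"site": "a.example.com"},
    {"site": "b.example.com"},
    {"site": "a.example.com"},
]
slice_2 = [
    {"site": "a.example.com"},
    {"site": "b.example.com"},
]

# The scheduled search runs once per slice and stores only the summary.
summary_index = [summarize_slice(slice_1), summarize_slice(slice_2)]

# The end-of-month report reads the summaries, not the raw events.
report = monthly_report(summary_index)
print(report)
```

The end-of-month step touches one small row per site per slice, so its cost depends on the number of scheduled runs rather than on the volume of raw events.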

Or, you may want to run a report that shows a running count of a statistic over a long period of time. For example, you may want a running count of downloads of a file from a Web site you manage. Schedule a saved search to return the total number of downloads over a specified slice of time. Use summary indexing to have Splunk save the results into a summary index. You can then run a report any time you want on the data in the summary index to obtain the latest count of the total number of downloads.
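The running-count case works the same way. As a rough sketch, assume each scheduled run has stored one download count for its slice of time; the numbers and variable names below are made up for illustration and do not come from Splunk.

```python
from itertools import accumulate

# Hypothetical download counts, one per scheduled run (for example, one per hour).
downloads_per_slice = [12, 7, 0, 15]

# Running total after each run, computed from the stored summaries only.
running_totals = list(accumulate(downloads_per_slice))

# The latest total is the last entry in the running series.
latest_total = running_totals[-1]
print(running_totals, latest_total)
```

Because every run contributes a single number, the latest total is available at any time without returning to the original download events.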

How summary indexing works

Summary indexing is an alert option for scheduled saved searches. When you run a saved search with summary indexing turned on, Splunk temporarily stores the search results in a file ($SPLUNK_HOME/var/spool/splunk/<savedsearch_name>_<random-number>.stash). Splunk then adds general information about the current search (using the addinfo command), along with the fields you specify during configuration, to every result in the file, and indexes the results as events in a summary index (index=summary by default).

Note: Use the addinfo command to add fields containing general information about the current search to the search results going into a summary index. General information added about the search helps you run reports on results you place in a summary index.

After results are indexed in the summary index, you can search and report on them by specifying the name of the summary index in your search.

Example:
This example searches all events in the summary index and returns the most common referers.

* index=summary | top referer

Search commands useful to summary indexing

Summary indexing uses some new search commands behind the scenes to perform its actions.

  • addinfo: Summary indexing uses addinfo to add fields containing general information about the current search to the search results going into a summary index. Add | addinfo to any search to see what results will look like if they are indexed into a summary index.
  • collect: Summary indexing uses collect to index search results into the summary index. Use | collect to index any search results into another index (using collect command options).

Another useful command is overlap. You can use overlap to find gaps in events or overlapping events in a summary index.

  • overlap: Use overlap to identify gaps and overlaps in a summary index. overlap finds events of the same query_id in a summary index with overlapping timestamp values or identifies periods of time where there are missing events.
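To make the idea of gaps and overlaps concrete, here is a minimal Python sketch. It is not the overlap command's algorithm. It assumes that every run of one saved search (one query_id) covers a half-open time range given as a (start, end) pair in seconds, and the function name find_gaps_and_overlaps and the sample ranges are hypothetical.

```python
def find_gaps_and_overlaps(ranges):
    """Return (gaps, overlaps) for a list of (start, end) time ranges.

    Ranges are half-open: start is included, end is excluded.
    """
    gaps, overlaps = [], []
    ordered = sorted(ranges)
    if not ordered:
        return gaps, overlaps

    covered_until = ordered[0][1]
    for start, end in ordered[1:]:
        if start > covered_until:
            # No run covered the time between covered_until and start.
            gaps.append((covered_until, start))
        elif start < covered_until:
            # This run re-covers time that an earlier run already summarized.
            overlaps.append((start, min(end, covered_until)))
        covered_until = max(covered_until, end)
    return gaps, overlaps

# Hypothetical time ranges covered by four runs of the same saved search.
runs = [(0, 60), (60, 120), (110, 180), (240, 300)]

gaps, overlaps = find_gaps_and_overlaps(runs)
print("gaps:", gaps)
print("overlaps:", overlaps)
```

In this sample, the third run starts before the second one ends, so the time from 110 to 120 is summarized twice, and no run covers the time from 180 to 240. Either condition would skew a report built on the summary index.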

How to configure summary indexing

Configure summary indexing in Splunk Web before you customize it in savedsearches.conf. For instructions, see the "Configure summary indexing" topic.
