
Working with Splunk Indexes using Windows PowerShell

In my last post, I talked about a way to use PowerShell to ease the installation of our Splunk App for VMware. This time, we’ll be using PowerShell in a much different way. As you might already know, the Splunk dev team has built a very robust set of REST API hooks for the product. What you may not know is that this has enabled some other talented folks to build a PowerShell module that you can use not only to get data into and out of Splunk, but also to manage your Splunk infrastructure.

Now in my case, I have a goal in mind. I want to answer this question:

How much disk space is being consumed by indexes?

In my lab environment, I have one search head and two indexers. It’s quite easy to find out the index size if you are running a single-server setup, because details about the indexes are right there in the Splunk manager web UI. But in a distributed environment like mine, the indexers don’t have splunkweb turned on. You do have other options, particularly if you have server access. You can either go to the filesystem and look at space consumption that way, or you can execute a splunk CLI command to get the index settings.

But frankly, most of these methods are very Unixy (not that there’s anything wrong with that), even when running Splunk on Windows! I know my way around Unix fairly well, but at heart, I’m a Windows guy. I want to be able to solve my problems using Windows PowerShell, because that’s the tool that I’m most comfortable with.

In case you haven’t already done so, go grab the latest version of the Splunk PowerShell Resource Kit. Once you’ve got that installed, you’ll be able to follow along with my examples below.

Step one: Retrieve index objects

PS> $cred = Get-Credential
PS> $idx = 'bd-idx-01.bd.splunk.com', 'bd-idx-02.bd.splunk.com'
PS> $idx | Foreach-Object { Get-SplunkIndex -ComputerName $_ -Cred $cred }

And the (trimmed) output from the last line looks like this:

ComputerName            Name
------------            ----
bd-idx-01.bd.splunk.com _audit
bd-idx-01.bd.splunk.com _blocksignature
bd-idx-02.bd.splunk.com _audit
bd-idx-02.bd.splunk.com _blocksignature

Step two: Examine the output

Now that’s great, but where is the size? Remember: everything in PowerShell is an object, and the default table view shows only a few of its properties. Let’s use the Get-Member cmdlet to examine the output from the Get-SplunkIndex cmdlet:

PS> $indexes = $idx | % { Get-SplunkIndex -ComputerName $_ -Cred $cred }
PS> $indexes | Get-Member -Name *size*

   TypeName: Splunk.SDK.Index

Name               MemberType   Definition
----               ----------   ----------
blockSignSize      NoteProperty System.Int32 blockSignSize=0
currentDBSizeMB    NoteProperty System.Int32 currentDBSizeMB=79
maxDataSize        NoteProperty System.String maxDataSize=auto
maxTotalDataSizeMB NoteProperty System.Int32 maxTotalDataSizeMB=500000
rawChunkSizeBytes  NoteProperty System.String rawChunkSizeBytes=131072

First, I’m grabbing all the indexes and assigning them to the variable “$indexes”, because I’m going to be manipulating them a bit more as we continue. Next, that variable gets piped to Get-Member, which spits out tons of stuff. Because my goal is to look at index size, I decided to filter what Get-Member would return to show only those members which have the word “size” in the name.
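
If you don’t have an indexer handy, the same Get-Member filtering can be tried on a stand-in object. This is only a sketch: the [pscustomobject] below is hand-built, not a real Splunk.SDK.Index, the variable names are made up for illustration, and the property values are simply copied from the output above.

```powershell
# Stand-in for a single index object (NOT a real Splunk.SDK.Index).
# Property names and values are copied from the Get-Member output in this post.
$fakeIndex = [pscustomobject]@{
    Name               = '_audit'
    currentDBSizeMB    = 79
    maxDataSize        = 'auto'
    maxTotalDataSizeMB = 500000
}

# -Name accepts wildcards, so only members with "size" in the name are returned.
# The match is case-insensitive, which is why "currentDBSizeMB" is included.
$sizeMembers = $fakeIndex | Get-Member -MemberType NoteProperty -Name *size*

$sizeMembers | Format-Table Name, MemberType -AutoSize
```

The Name property is filtered out because it doesn’t match the wildcard, which is exactly the trimming effect I wanted on the real objects.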

Step three: Output

It looks like “currentDBSizeMB” is what I need, so let’s put that into a nice table!

PS> $indexes | Select-Object -First 2 | Format-Table Name, currentDBSizeMB -AutoSize

Name            currentDBSizeMB
----            ---------------
_audit                       79
_blocksignature               1
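
Since my original question was about total disk consumption, one way to get there is to add up currentDBSizeMB per indexer. The following is a rough sketch rather than anything from the Resource Kit itself: it uses only built-in cmdlets, and the $sampleIndexes array is a hand-built stand-in for $indexes, with a few sizes borrowed from output elsewhere in this post.

```powershell
# Hand-built stand-in for $indexes so the example runs without a Splunk server.
$sampleIndexes = @(
    [pscustomobject]@{ ComputerName = 'bd-idx-01.bd.splunk.com'; Name = 'main';            currentDBSizeMB = 19895 }
    [pscustomobject]@{ ComputerName = 'bd-idx-01.bd.splunk.com'; Name = 'perfmon';         currentDBSizeMB = 10120 }
    [pscustomobject]@{ ComputerName = 'bd-idx-02.bd.splunk.com'; Name = 'cisco_ucs_perf';  currentDBSizeMB = 5775 }
    [pscustomobject]@{ ComputerName = 'bd-idx-02.bd.splunk.com'; Name = 'servervirt_perf'; currentDBSizeMB = 4413 }
)

# Group by indexer, then sum the size property within each group.
$totals = $sampleIndexes | Group-Object -Property ComputerName | ForEach-Object {
    [pscustomobject]@{
        ComputerName = $_.Name     # the group key, i.e. the indexer name
        IndexCount   = $_.Count    # number of indexes in the group
        TotalMB      = ($_.Group | Measure-Object -Property currentDBSizeMB -Sum).Sum
    }
}

$totals | Format-Table -AutoSize
```

Against a live environment, swapping $sampleIndexes for $indexes should give the per-indexer totals, assuming every index object carries a numeric currentDBSizeMB.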

Step four: Working with an index object

Before I leave you, let’s do something a bit more useful. Here are my top 10 indexes by size, grouped by indexer.

PS> $indexes | Sort-Object -Property currentDBSizeMB -Descending | Select-Object -First 10 | Sort-Object -Property ComputerName | Format-Table -GroupBy ComputerName Name, currentDBSizeMB -AutoSize

   ComputerName: bd-idx-01.bd.splunk.com

Name            currentDBSizeMB
----            ---------------
servervirt_perf            5913
xenapp_perfmon             3865
_internal                  2573
cisco_ucs_perf             5921
main                      19895
perfmon                   10120
hyperv_perfmon             7255

   ComputerName: bd-idx-02.bd.splunk.com

Name            currentDBSizeMB
----            ---------------
xenapp_perfmon             2531
cisco_ucs_perf             5775
servervirt_perf            4413
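
You may notice that the rows inside each group above are not in size order. That is because the second Sort-Object only orders by ComputerName. One possible fix, sketched here against hand-built sample rows rather than my real $indexes (the $rows and $top names are just for illustration), is to give Sort-Object two keys and mark only the size key as descending.

```powershell
# Hand-built sample rows; sizes are borrowed from the table above.
$rows = @(
    [pscustomobject]@{ ComputerName = 'bd-idx-01.bd.splunk.com'; Name = '_internal';      currentDBSizeMB = 2573 }
    [pscustomobject]@{ ComputerName = 'bd-idx-02.bd.splunk.com'; Name = 'cisco_ucs_perf'; currentDBSizeMB = 5775 }
    [pscustomobject]@{ ComputerName = 'bd-idx-01.bd.splunk.com'; Name = 'main';           currentDBSizeMB = 19895 }
    [pscustomobject]@{ ComputerName = 'bd-idx-02.bd.splunk.com'; Name = 'xenapp_perfmon'; currentDBSizeMB = 2531 }
    [pscustomobject]@{ ComputerName = 'bd-idx-01.bd.splunk.com'; Name = 'perfmon';        currentDBSizeMB = 10120 }
)

$top = $rows |
    Sort-Object -Property currentDBSizeMB -Descending |   # biggest first
    Select-Object -First 4 |                               # keep the top N (4 here, 10 in the post)
    Sort-Object -Property ComputerName,
        @{ Expression = 'currentDBSizeMB'; Descending = $true }  # indexer A-Z, then size high-to-low

$top | Format-Table -GroupBy ComputerName Name, currentDBSizeMB -AutoSize
```

The hashtable form lets each sort key carry its own direction, so the indexer names stay in ascending order while the sizes within each indexer run from largest to smallest.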

One of my favorite things about PowerShell is the pipeline. Funny how a line of code in PowerShell looks pretty similar to a Splunk search command!

If you’d like to learn more about the PowerShell Resource Kit, be sure to read the README, which has links to tons of resources. Also, I interviewed Brandon Shell about the project a while back on the PowerScripting Podcast, episode 165.

----------------------------------------------------
Thanks!
Hal Rottenberg

Splunk
Posted by

Splunk
