TIPS & TRICKS

What’s Your ulimit?

If you don’t know the answer to that question, you should go into the corner for a 5-minute time-out. 😉 No need to beat yourself up for not knowing. It’s not something most people would think to check when deploying Splunk. Since it usually rears its slightly-monstrous-yet-interesting head when system load creeps higher, let’s just set it and forget it. Or, for a little added drama, address it when Splunk crashes or hangs.

If *nix is your operating system, then you need to worry about this. On Windows you probably have other things to worry about, so I don’t think this is a concern. From the handy man pages, ulimit sets user limits on system resources. For Splunk, the limit that matters most is the number of open file descriptors: how many files (and network sockets) a process can have open simultaneously.

Splunk will allocate file descriptors for:

  • files being actively monitored
  • forwarder connections
  • deployment clients
  • users running searches

The default ulimit value is often 1024. Even if Splunk allocated just a single file descriptor for each of the activities above, it’s easy to see how a few hundred monitored files, a few hundred forwarders sending data, and a handful of very active users, on top of reading and writing the datastore, can exhaust this measly default setting.

Well, Splunk doesn’t just allocate a single file descriptor for everything it touches. Here are additional details on how to size ulimit with consideration for what might go wrong with forwarders and deployment clients. This will help you adjust the ulimit to something sensible for your current system and projected growth.

Setting ulimit on Indexers

When all is humming along, each forwarder will require 2 file descriptors: 1 for the data connection and 1 for the health-check connection. So the minimum ulimit setting is 2 x # forwarders.

When things hit a snag and forwarders are unable to connect to an indexer (e.g. an indexer is offline for updates or fails), a forwarder can trigger allocation of up to 5 file descriptors on retries: 4 for data and 1 for the health check. This means the open file descriptor count can potentially reach 5 x # forwarders. This is the theoretical max.

Therefore, a super-safe ulimit is 8 x # forwarders, which leaves headroom for the additional file descriptors Splunk needs for reading and writing during indexing and searching. This setting is especially important for indexers, since we expect constant concurrent connections from forwarders.
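The arithmetic above can be sketched as a small shell calculation. The forwarder count here is an assumed value for illustration; plug in your own fleet size (plus projected growth).

```shell
#!/bin/sh
# Sketch: size an indexer's ulimit from the 8 x forwarders rule.
forwarders=600                                # assumed fleet size for illustration
need=$(( forwarders * 8 ))                    # super-safe rule of thumb
# Round up to the next multiple of 1024:
target=$(( (need + 1023) / 1024 * 1024 ))
echo "$target"                                # prints 5120 for 600 forwarders
```

The rounding line is just integer ceiling division: add 1023 before dividing so any remainder bumps the result to the next 1024 boundary.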

Setting ulimit on the Deployment Server

The importance of increasing the ulimit for deployment servers is lower than for indexers because deployment clients are much more bursty and quick in their communication with the deployment server. This means the likelihood that all deployment clients will check in at once and exhaust file descriptors is much lower.

The low water mark is 2 x # deployment clients and a safe ulimit is 4 x # deployment clients.
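The same rounding trick applies to the deployment server formula. The client count below is an assumed value; substitute your own.

```shell
#!/bin/sh
# Sketch: size a deployment server's ulimit from the 4 x clients rule.
clients=1500                                  # assumed deployment-client count
echo $(( (clients * 4 + 1023) / 1024 * 1024 ))  # prints 6144 for 1500 clients
```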

Why Not Set ulimit to Unlimited?

It doesn’t hurt to remove the hard limit and set ulimit to unlimited… unless there is some kind of file descriptor leak in Splunk. Such a leak can go undetected for a long time and consume more and more resources. We don’t expect this to happen, since we monitor specifically for these types of problems in our longevity tests, conducted with 1000 forwarders across 10 indexers over many days with ulimit set to 2048.

Additional Hints

As studly Splunk Admins you probably already know how to do this, but in case tryptophan has an early/lasting grip on you:

  • ulimit values are conventionally set in multiples of 1024, so round your calculations up to the next multiple of 1024.
  • Use “ulimit -n” to find the current maximum number of open file descriptors for your session.
  • To change the max open files, here is a good guide: http://www.cyberciti.biz/faq/linux-increase-the-maximum-number-of-open-files.
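Putting those hints together, here is a minimal check-and-raise sketch. The “splunk” user name and the 8192 value are assumptions for illustration; use the account that actually runs Splunk and the value you calculated above.

```shell
#!/bin/sh
# Check the current soft and hard open-file limits for this session:
ulimit -Sn
ulimit -Hn

# To raise them persistently on most Linux distributions, add lines like
# these to /etc/security/limits.conf (user name and value are assumptions),
# then start a fresh login session for the change to take effect:
#   splunk  soft  nofile  8192
#   splunk  hard  nofile  8192
```

Note that `ulimit -n` changes made interactively last only for the current shell; the limits.conf entries are what survive a restart.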

Many thanks to Jag Kerai, Splunk Superstar Developer, for this incredibly useful insight and guidance! And Happy Thanksgiving!

Posted by Vi Ly