This documentation applies to the following versions of Splunk: 4.0, 4.0.1, 4.0.2, 4.0.3, 4.0.4, 4.0.5, 4.0.6
Splunk is a very flexible product that can be deployed to meet almost any scale and redundancy requirement. However, that doesn't remove the need for care and planning. This article discusses high-level considerations for Splunk deployments, including sizing and availability.
After you've worked through the general layout of your Splunk search topology, the other sections in this document, along with the formal Admin guide for Splunk, explain more thoroughly how to implement it.
Let's consider a common, commodity hardware server as our standard:
For the purposes of this discussion, this will be our single server unit. Note that the only exceptional item here is the disk array. Splunk is often constrained by disk I/O first, so always weigh disk performance first when selecting your hardware.
The first step in deciding on a reference architecture is sizing: can your Splunk deployment handle the load? For the purposes of this guide, we assume that managing forwarder connections and configurations (but not their data!) is free. Therefore, we need to look at index volume and search load.
If the answer to both questions is 'NO', then your Splunk instance can safely share one of the above servers with other services, with the caveat that Splunk must be allowed sufficient disk I/O on the shared box. If you answered 'YES' to either question, continue.
If the answer to both questions is 'NO', then a single dedicated Splunk server of our reference architecture should be able to handle your workload.
At a high level, total storage is calculated as follows:
daily average rate x retention policy x 1/2
You can generally safely use this simple calculation method. If you want to base your calculation on the specific type(s) of data that you'll be feeding into Splunk, you can use the method described in "Estimating your storage requirements" in this manual.
Thanks to compression, Splunk can generally store raw data, including indexes, at approximately half the original size. Given allowances for the operating system and disk partitioning, the reference server offers about 500GB of usable space. In practical terms, that's ~6 months of fast storage at 5GB/day, or 10 days at 100GB/day.
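The simple calculation above can be sketched in a few lines of Python. This is only an illustration of the arithmetic: the function names are made up for this example, and the 0.5 factor is the rough compression estimate from this section, not a measured value for your data.

```python
def stored_size_gb(daily_gb, retention_days, compression=0.5):
    """Total storage: daily average rate x retention policy x 1/2."""
    return daily_gb * retention_days * compression


def days_of_retention(usable_gb, daily_gb, compression=0.5):
    """How many days of data fit into usable_gb at a given daily rate."""
    return usable_gb / (daily_gb * compression)


if __name__ == "__main__":
    # 100GB/day kept for 10 days needs about 500GB on disk.
    print(stored_size_gb(100, 10))        # 500.0
    # 500GB of usable space holds about 200 days (~6 months) at 5GB/day...
    print(days_of_retention(500, 5))      # 200.0
    # ...or 10 days at 100GB/day.
    print(days_of_retention(500, 100))    # 10.0
```

If your data compresses better or worse than the rough one-half figure, pass a different `compression` value.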
If you need more storage, you can either opt for more local disks for fast access (required for frequent searching) or consider attached or network storage (acceptable for occasional searching). Low-latency connections over NFS or CIFS are acceptable for searches over long time periods where instant search returns can be compromised to lower cost per GB. Shares mounted over WAN connections and standby storage such as tape are never acceptable.
If you have requirements greater than 100GB/day or 4 concurrent users, you'll want to leverage Splunk's scale-out capabilities. That involves using distributed search to run searches in parallel across multiple indexers at once, and possibly load balancing the incoming data with auto load balanced Splunk forwarders.
Also, at this scale it is very likely that you'll have high availability or redundancy requirements, covered in greater detail below.
If you do not have such requirements and your volume is between 100GB/day and 300GB/day, you should be able to deploy multiple dual-purpose Splunk servers that search across each other.

Example of a search user searching on one Splunk instance, with the search distributed to other instances.
For deployments of 300GB/day or larger, consider a three-tier Splunk deployment. In this model, search is separated from indexing by creating Splunk search heads, which are instances of Splunk that only do searching. That allows for more efficient use of hardware, and lets you scale search usage (mostly) independently of index volume.

Example Splunk distributed topology. This example could handle up to 400GB/day and 8 concurrent search users for common use cases.
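The sizing thresholds discussed so far can be summarized as a small decision function. This is a rough sketch, not an official sizing tool: the function name and return strings are invented for illustration, the handling of values that fall exactly on a boundary is an assumption, and high availability requirements are deliberately ignored.

```python
def deployment_tier(daily_gb, concurrent_users):
    """Map daily index volume and concurrent search users to a topology."""
    # Very small workloads can share a server with other services.
    if daily_gb < 2 and concurrent_users < 2:
        return "shared server"
    # Up to 100GB/day and 4 concurrent users fits one dedicated server.
    if daily_gb <= 100 and concurrent_users <= 4:
        return "single dedicated server"
    # Beyond that, scale out; below 300GB/day dual-purpose boxes suffice.
    if daily_gb < 300:
        return "multiple dual-purpose servers with distributed search"
    # At 300GB/day or more, separate search heads from indexers.
    return "three-tier deployment with dedicated search heads"


if __name__ == "__main__":
    for volume, users in [(1, 1), (50, 3), (200, 6), (400, 8)]:
        print(volume, users, "->", deployment_tier(volume, users))
```

Treat the result as a starting point for the discussion in this article rather than a final answer.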
At daily volumes above 300GB/day, it makes sense to slightly modify our reference hardware to reflect the differing needs of indexers and search heads. Search heads do not need high disk I/O, nor much local storage. However, they are far more CPU bound than indexers. Therefore, we can change our recommendations to:
Search Head
Because a search head is CPU bound, adding more and faster CPU cores is the best option if you want fewer, more powerful servers.
Note: The guideline of 1 core per active user still applies. Don't forget to account for scheduled searches in your CPU allowance as well.
Indexer
The indexers will be busy both writing new data and servicing the remote requests of search heads. Therefore disk I/O is the primary bottleneck.
At these daily volumes, local disk is unlikely to provide cost-effective storage for the time frames over which speedy search is desired, which suggests fast attached storage or networked storage. While there are too many types of storage to be prescriptive, here are guidelines to consider:
Therefore...
Technically, there is no practical Splunk limitation on the number of search heads an indexer can support, or the number of indexers a search head can search against. However, system limitations suggest a ratio of approximately 4 to 1 for most use cases. That is only a rough guideline; for example, if you have many searchers compared to your total data volume, more search heads make sense.
| Daily Volume | Number of Search Users | Recommended Indexers | Recommended Search Heads |
|---|---|---|---|
| < 2GB/day | < 2 | 1, shared | N/A |
| 2GB/day to 100GB/day | up to 4 | 1 | N/A |
| 200GB/day | up to 8 | 2 | 1 |
| 300GB/day | up to 12 | 3 | 1 |
| 400GB/day | up to 8 | 4 | 1 |
| 500GB/day | up to 16 | 5 | 2 |
| 1TB/day | up to 24 | 10 | 3 |
Note that these are approximate guidelines only. You should feel free to modify based on the discussion here for your specific use case, and to contact Splunk for more guidance if needed.
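For quick estimates, the table can be encoded as a simple lookup. This is a sketch under stated assumptions: the `Row` type and `recommend` function are hypothetical names, volumes that fall between two rows are rounded up to the next row, 1TB is treated as 1000GB, a search head count of 0 stands for "N/A", and the shared-server case below 2GB/day is folded into the single-indexer row.

```python
from typing import NamedTuple, Optional


class Row(NamedTuple):
    max_daily_gb: int   # upper bound of daily volume covered by this row
    max_users: int      # search users listed for this row in the table
    indexers: int
    search_heads: int   # 0 stands for "N/A" in the table


# Values copied from the table above (assumption: 1TB = 1000GB).
TABLE = [
    Row(100, 4, 1, 0),
    Row(200, 8, 2, 1),
    Row(300, 12, 3, 1),
    Row(400, 8, 4, 1),
    Row(500, 16, 5, 2),
    Row(1000, 24, 10, 3),
]


def recommend(daily_gb: float) -> Optional[Row]:
    """Return the first table row that covers daily_gb, or None if off the table."""
    for row in TABLE:
        if daily_gb <= row.max_daily_gb:
            return row
    return None


if __name__ == "__main__":
    row = recommend(250)
    print(row.indexers, row.search_heads)   # 3 1
    print(recommend(2000))                  # None
```

A result of `None` means the volume is beyond the table, in which case the advice above to contact Splunk for guidance applies.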
Many Splunk deployments require some form of redundancy, either to protect the data from loss or the search service from outage, and sometimes both. In general, Splunk's solution to this problem is a straightforward matter of data duplication; however, we will look at three specific deployment possibilities.
The easiest method of ensuring data will not be lost is to have two original artifacts made by cloning data coming from Splunk forwarders.
In this approach, the data is duplicated and available instantly, should you need to cut over to the stand-by Splunk instance. Note that while you can simply have one Splunk instance forward to the next (as shown here for the offsite location) to save on network usage, a hard shutdown risks the last few events not being forwarded. If that risk is acceptable, the topology can be even simpler.
The goal of a high availability deployment is both data survivability and service uptime. To accommodate this kind of deployment, you need to duplicate both the data and the physical hardware providing service, not unlike other web-based applications. Also, redundancy needs to be considered for all three tiers of service: splunkweb searching, splunkd indexing, and forwarding.
In this topology, there are two data-complete functional groups. In the picture, both groups are servicing search requests to optimize hardware costs; alternatively, the second group could be kept idle to ensure neither disruption nor degradation of search services.
Splunk has three primary roles: indexer, searcher and forwarder. In many cases, a single Splunk instance may perform two or all three roles at once. Each role has its own performance requirements and bottlenecks.
As you can see, disk I/O is frequently the limiting factor in Splunk performance, and deserves extra consideration in your planning. That also makes Splunk a poor virtualization candidate unless dedicated disk access can be arranged.
Based on these estimates, this machine will be disk I/O bound if there are too many active users or too many searches per user. That is the most likely limitation for this hardware, possibly followed by CPU if the searches are highly computational in nature, such as many uses of stats or eval commands in a single search.
With the information above, it is possible to estimate required hardware for most Splunk use cases by considering the following:
Note that not all search users consume the same amount of resources. While there are in-depth guides for search cost analysis available, consider these very rough guidelines.
What does that mean in real life?