Some time last year, I posted recommendations for running Splunk on Amazon Web Services (AWS). While the base recommendations for sizing and architecting Splunk have not changed, we now have more clarity into what works best. Rather than editing that post, I decided it would be best to revisit the thought process and give more color on what most people are actually deploying. Before going down the road of sizing on EC2, I highly recommend reviewing our standard documentation.
For general sizing purposes, there are two key factors:
- Daily Indexed Volume (how many GB indexed per day?)
- Searching and Reporting needs (how many searches or alerts will be run?)
Most people will already know their expected daily volume, but the question of how much searching they will do is often somewhat vague. Since adding an instance in AWS is simple, we can use indexing volume as a solid starting point. Remember, for increased performance or capacity (indexing or searching), you can simply add another Splunk instance. Below are some base guidelines for deploying.
EC2 Instance Guidelines
- 0.5 – 20 GB/day: 1 Standard Large Instance (4 EC2 Compute Units, 7 GB RAM)
- 21 – 100 GB/day: 2+ Standard Large Instances, or 1 Standard Extra Large Instance, or 1 High CPU Extra Large Instance
- 400 – 500 GB/day: 6 Standard or High CPU Extra Large Instances
- up to 1 TB/day: 11 Standard or High CPU Extra Large Instances
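As a rough sketch, the guideline table above can be turned into a starting-point estimate. This is a hypothetical helper, not an official sizing tool: the guidelines leave 101–399 GB/day unspecified (we conservatively fall through to the next tier), and search load is not accounted for.

```python
# Hypothetical helper translating the instance guidelines above into a
# rough starting estimate. NOTE: the guidelines skip 101-399 GB/day, so
# that range conservatively falls into the next tier, and search/report
# load is not considered at all.
def estimate_instances(gb_per_day):
    """Return (instance count, instance type) for a daily indexing volume."""
    if gb_per_day <= 20:
        return 1, "Standard Large"
    if gb_per_day <= 100:
        return 1, "Extra Large (Standard or High CPU)"
    if gb_per_day <= 500:
        return 6, "Extra Large (Standard or High CPU)"
    if gb_per_day <= 1000:
        return 11, "Extra Large (Standard or High CPU)"
    raise ValueError("above 1 TB/day, work through sizing with Splunk")

print(estimate_instances(50))   # (1, 'Extra Large (Standard or High CPU)')
print(estimate_instances(450))  # (6, 'Extra Large (Standard or High CPU)')
```

Treat the result as a floor: heavy searching or alerting is a reason to add instances beyond it.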
Running smaller, less powerful instances for evaluation purposes is common. If you are evaluating Splunk and just need to get something going, the Medium High CPU instances are a good starting point, since they have at least 2 cores (the minimum requirement). For real-world deployments, the High CPU and Cluster Compute instances can provide better performance, although at increased cost. This raises the question of which instance provides the best performance per dollar. You'll notice that in the above guidelines, 2 Standard Large instances (4 EC2 Compute Units each) are treated as comparable to 1 Extra Large instance. From the perspective of a single search, however, running 2 instances will give better performance, because resources can be leveraged in parallel from both an I/O and a CPU perspective. Indexing performance will also improve, as Splunk would have double the available I/Os per second (depending on disk setup). This leads us into storage and disk selection…
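The mechanism that lets multiple instances work a search in parallel is Splunk's distributed search: a search head fans the search out to the indexers and merges the results. As a minimal sketch, the peers are listed in `distsearch.conf` on the search head (the hostnames below are hypothetical placeholders; 8089 is the Splunk management port):

```ini
# distsearch.conf on the search head -- hostnames are hypothetical
[distributedSearch]
servers = splunk-idx1.example.com:8089,splunk-idx2.example.com:8089
```

Each indexer listed here scans only its own share of the data, which is where the parallel I/O and CPU benefit comes from.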
EC2 Storage selection
For those evaluating Splunk, the local instance storage is sufficient; you don't need EBS unless you want its performance and durability benefits. If you are looking for recommended deployment practices, read on. As with our standard hardware recommendations, we suggest fast disks in a RAID 1+0 configuration, since RAID 1+0 provides the best I/O performance for both reads and writes. EBS volumes can deliver very acceptable performance when configured this way; to reach the suggested I/Os per second, at least 4 EBS volumes will be required to set up the RAID configuration. S3 is ideal for backups, although you can also deploy an additional EBS volume for that purpose. S3 is preferred since it spans all Availability Zones.
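For illustration, assembling 4 attached EBS volumes into RAID 1+0 can be done with `mdadm`. This is a sketch only: the device names, filesystem, and mount point are hypothetical and will vary with your instance and attachment settings.

```
# Sketch: build RAID 1+0 from 4 attached EBS volumes with mdadm.
# Device names (/dev/sdf../dev/sdi) and mount point are hypothetical.
mdadm --create /dev/md0 --level=10 --raid-devices=4 \
      /dev/sdf /dev/sdg /dev/sdh /dev/sdi
mkfs.ext3 /dev/md0                 # or your preferred filesystem
mkdir -p /opt/splunk-data
mount /dev/md0 /opt/splunk-data    # point Splunk's index storage here
```

The striping across 4 volumes is what raises the available I/Os per second; the mirroring protects against a single-volume failure.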
One more thing
Spot vs. Reserved vs. On-Demand… There are many pricing models for the above instances, and choosing the right one can be confusing. The general practice is to deploy Reserved Instances for Splunk: the instance needs to persist along with its data, and the Reserved type will be cheaper over the long run.
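The "cheaper over the long run" claim is simple arithmetic. All dollar figures below are hypothetical placeholders, not real EC2 rates; substitute current pricing for your region and instance type.

```python
# Back-of-the-envelope comparison of On-Demand vs. Reserved pricing for
# an always-on Splunk indexer. All rates are HYPOTHETICAL placeholders.
HOURS_PER_YEAR = 24 * 365

on_demand_rate = 0.68      # $/hour, hypothetical
reserved_upfront = 1820.0  # one-time reservation fee, hypothetical
reserved_rate = 0.24       # $/hour, hypothetical

on_demand_cost = on_demand_rate * HOURS_PER_YEAR
reserved_cost = reserved_upfront + reserved_rate * HOURS_PER_YEAR

print(f"on-demand: ${on_demand_cost:,.0f}/yr")
print(f"reserved:  ${reserved_cost:,.0f}/yr")
```

Because Splunk indexers run around the clock, the hourly discount recovers the upfront fee well within the reservation term; Spot instances, by contrast, can be terminated out from under you, which is a poor fit for a system holding indexed data.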
For those planning to deploy on AWS and EC2, follow our standard guidelines and apply them to the available virtual hardware. Running at least 8 CPU cores (dual quad-core), 8 GB of RAM, and fast disks is the standard recommendation. There are comparable setups in EC2, such as the High CPU Extra Large instance with attached EBS volumes. If you are simply evaluating Splunk, you can do basic testing on a multi-core instance without any additional storage.
Finally, if you need a more official version of this content, along with more in-depth detail, you can find it in our data sheet.