CloudWatch Logs offers a great way of collecting all of your performance and operational logs from your AWS environment into one location. Because it is a flexible platform, logs from many sources can be collected into multiple log groups, each potentially with a different source and therefore a different log format. For example, VPC Flow Logs, CloudTrail and RDS logs all have different log structures. This post explores how any logs from CloudWatch can be ingested into Splunk regardless of their format, and where the example given can be extended or varied for other use cases.
Splunk offers many ways of ingesting AWS data: via the AWS Add-On, serverless with Lambda or Kinesis Firehose, or even automated and serverless with Project Trumpet. However, Kinesis Firehose is the preferred option for CloudWatch Logs, as it allows log collection at scale and offers the flexibility of collecting from multiple AWS accounts.
One of the Firehose capabilities is the option of calling out to a Lambda function to transform or process the log content. This allows you to open up the message package picked up from CloudWatch Logs and re-format it into a Splunk event. It also gives you the flexibility to set event information such as source, sourcetype, host and index based on the log content.
We’ve already seen one version of this transformation in action when Splunk and AWS released the Firehose integration. The Lambda function used for that example extracts VPC Flow Logs that can then be sent to Splunk. This function is available as an AWS Lambda blueprint - kinesis-firehose-cloudwatch-logs-processor or kinesis-firehose-cloudwatch-logs-processor-python. This blog goes a step further, providing a basis for a common log collection method into Splunk that can be used for ANY of your CloudWatch logs.
As a recap, the architecture of how to ingest logs with Firehose is shown below:
Most of what is needed to set up Firehose and Splunk can be followed from this earlier blog. You will also need to refer to the setup process described here, noting the steps that differ from those listed in that blog, and adding a new Lambda function.
What Does The New Lambda Function Do?
In the previous blog, the Lambda function template extracts individual log events from the log stream and sends them unchanged to Firehose. As these are simple VPC Flow logs (not in JSON format), the content is easily sent as Raw events to Splunk. Some key things to note with the standard template:
- The event details of the sourcetype, source, host and index can only be set in the HEC settings
- Some of the source information for the log is also “lost”: for example, the Log Group name or region isn’t passed into Splunk. This matters where the log itself doesn’t contain a reference to the AWS account number, the log group or the origin of the log. For these logs, there is no way of tracing back the source of the log unless individual Firehoses and HEC tokens/inputs are created for each individual log group
With the new Lambda function, you can take the log from CloudWatch and wrap it up as a Splunk HEC Event in JSON format. This adds the benefits of:
- Setting event details of the sourcetype, source, host and index, overriding the values set for the HEC token
- Retaining the log source information, such as account, region and log group/stream names, and passing it into Splunk
The example Function template does the following:
- Takes the message bundle sent from CloudWatch to Firehose, and unzips the data
- Each “message” payload is sent to the “transformLogEvent” function, which formats the payload message into a Splunk JSON Event format:
- The Host value of the event is set as the ARN of the Firehose stream
- The Source is set as the value of both the subscription filter name and CloudWatch Log Group name
- The Sourcetype is set as "aws:cloudtrail" if the Log Group name contains “CloudTrail”, "aws:cloudwatchlogs:vpcflow" if the Log Group name contains “VPC”, or for all other cases set to the value of the environment variable in the Lambda function settings
- The Index is not set in the function, but could easily be set based on the contents of the Log Group name or Subscription Filter name.
- Returns the results back into Firehose
You can find further details of how to format the HEC Event here.
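To make the steps above more concrete, here is a minimal Python sketch of a transform along these lines. It is an illustration rather than the published function: the handler name, the SPLUNK_SOURCETYPE environment variable name, the fallback sourcetype and the "filter:loggroup" format used for the source are all assumptions. It also leaves out error handling and any handling of Firehose response size limits.

```python
import base64
import gzip
import json
import os

# Assumed environment variable name; the fallback value is illustrative only.
DEFAULT_SOURCETYPE = os.environ.get("SPLUNK_SOURCETYPE", "aws:cloudwatchlogs")


def pick_sourcetype(log_group):
    """Choose a sourcetype from the Log Group name, as described above."""
    if "CloudTrail" in log_group:
        return "aws:cloudtrail"
    if "VPC" in log_group:
        return "aws:cloudwatchlogs:vpcflow"
    return DEFAULT_SOURCETYPE


def transform_log_event(log_event, payload, firehose_arn):
    """Wrap one CloudWatch log event as a Splunk HEC JSON event string."""
    # Assumption: source is "<subscription filter>:<log group>".
    source = "{}:{}".format(payload["subscriptionFilters"][0], payload["logGroup"])
    hec_event = {
        "time": log_event["timestamp"] / 1000.0,  # CloudWatch uses milliseconds
        "host": firehose_arn,
        "source": source,
        "sourcetype": pick_sourcetype(payload["logGroup"]),
        "event": log_event["message"],
    }
    return json.dumps(hec_event)


def handler(event, context):
    """Firehose data-transformation entry point."""
    output = []
    for record in event["records"]:
        # Each record is base64-encoded, gzip-compressed JSON from CloudWatch Logs.
        payload = json.loads(gzip.decompress(base64.b64decode(record["data"])))
        if payload.get("messageType") != "DATA_MESSAGE":
            # Control messages carry no log events, so drop them.
            output.append({"recordId": record["recordId"], "result": "Dropped"})
            continue
        lines = [
            transform_log_event(e, payload, event["deliveryStreamArn"])
            for e in payload["logEvents"]
        ]
        data = base64.b64encode("\n".join(lines).encode("utf-8")).decode("utf-8")
        output.append({"recordId": record["recordId"], "result": "Ok", "data": data})
    return {"records": output}
```

Because the index key is simply left out of the HEC event here, Splunk falls back to the index configured on the HEC token.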
The example transform function shows how to format the events sent to Firehose into a Splunk HEC JSON format, setting some of the event details based on the log information. Further changes to the function are possible to make it more flexible or to fit your requirements. Simple changes allow you to be creative with how you set the Splunk index, host, source and sourcetype. Here are two examples of this:
1) If you had RDS instances sending their logs into CloudWatch, you could use the Log Group name so that one Firehose can be used for multiple RDS instances and log types – for example if there were 2 RDS instances with their logs going into Log Groups of
As the audit and error logs need different sourcetypes, it would be easy to set the sourcetype value based on whether it is an audit or error log. In this case, you could also add Log Groups from other database logs by simply adding subscription filters to the same Firehose, without having to change anything on the Splunk side.
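One way this could look in Python is sketched below. It assumes the log type is the last path segment of the Log Group name (for instance a name ending in /audit or /error). The sourcetype values in the mapping are hypothetical placeholders, so substitute whatever your Splunk add-ons expect.

```python
# Hypothetical sourcetype names, keyed by the last segment of the Log Group name.
RDS_SOURCETYPES = {
    "audit": "aws:rds:audit",
    "error": "aws:rds:error",
}


def rds_sourcetype(log_group, default="aws:cloudwatchlogs"):
    """Pick a sourcetype from the trailing segment of an RDS Log Group name."""
    log_type = log_group.rstrip("/").rsplit("/", 1)[-1]
    return RDS_SOURCETYPES.get(log_type, default)
```

Supporting another log type is then a matter of adding one entry to the mapping and one subscription filter on the new Log Group.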
2) Another example could be where you are collecting logs from multiple AWS accounts, but for security reasons you wish to store these logs in separate Splunk indexes. As you can set the index value in the transform function, the function could set a different index name for each account.
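A minimal Python sketch of this idea follows. It relies on the "owner" field of the decoded CloudWatch Logs payload, which carries the originating AWS account ID. The account IDs and index names in the mapping are made up for illustration, and the HEC token would need to be allowed to write to each of those indexes.

```python
# Made-up account IDs and index names, for illustration only.
INDEX_BY_ACCOUNT = {
    "111111111111": "aws_prod",
    "222222222222": "aws_dev",
}


def add_index(hec_event, payload, index_by_account=INDEX_BY_ACCOUNT):
    """Return a copy of the HEC event with an index chosen by AWS account.

    payload is the decoded CloudWatch Logs message. If the account is not
    in the mapping, no index is set and the HEC token default applies.
    """
    result = dict(hec_event)
    index = index_by_account.get(payload.get("owner"))
    if index:
        result["index"] = index
    return result
```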