Get Complete Hybrid Visibility in Splunk by Querying AWS CloudWatch Logs Insights

It’s the week after Thanksgiving, which can only mean one thing: Splunkers arriving en masse in Las Vegas to attend AWS re:Invent!

Over the years, Splunk and AWS have partnered on many integrations and product innovations to empower our joint customers to optimize their cloud journeys and make their machine data actionable. This year is no different. Now you can query Amazon CloudWatch Logs Insights—a new feature launched today—to combine CloudWatch Logs with the data across your hybrid environment for maximum visibility and compute resource efficiency.

There are many ways to get your CloudWatch Logs into Splunk, including the AWS Add-On, Amazon Kinesis Data Firehose, and Lambda + Splunk HEC. Many of our joint customers already combine their CloudWatch Logs with logs and metrics across their entire IT stack to monitor, investigate, and drive the availability and performance of their hybrid environment in Splunk, ultimately achieving observability across their IT infrastructure. When customers want access to the raw log files from CloudWatch, it makes a ton of sense to ingest all of the logs using a four-step process.



Streaming CloudWatch Logs to Splunk is relatively simple to set up, but some customers may want to avoid duplicating data or require a simplified process for real-time analysis. Also, high-volume data sources like CloudWatch Logs may require compute-intensive, long-running aggregation queries to populate summary indexes, metric indexes, and data models. As a result, not all customers need the raw log files; some would prefer to query summary status from CloudWatch Logs Insights.

The following code is provided as an example. With this sample integration with CloudWatch Logs Insights, you can aggregate your CloudWatch logs in AWS and query summary results in Splunk without moving large amounts of data between storage and analytics systems, which frees up resources and potentially reduces storage and admin costs. It also gives customers faster access to logs by removing the associated data transfer latencies, and it eliminates the operational complexity of configuring and maintaining certain data transfers. This is by no means an exhaustive set of capabilities for integrating Splunk with CloudWatch Logs Insights and should not be considered an officially supported product at this time.

This integration is great for:

  1. Customers already exporting CloudWatch Logs into Splunk Enterprise or Splunk Cloud and wanting a rich user experience during investigation and troubleshooting. 
  2. AWS customers new to Splunk who rely on CloudWatch Logs to collect and store operational data, but require additional querying and correlation techniques from a third-party tool such as Splunk that go beyond the capabilities of CloudWatch Logs Insights (CWL-I).

Get the sample code here.

At a high level, this integration reduces the process of getting CloudWatch logs into Splunk from four steps to one step.



How Does It Work?

CloudWatch Logs Insights gives you the ability to search and visualize logs from CloudWatch, including VPC flow logs, Route 53 logs, CloudTrail logs, logs in JSON format, Lambda logs, and other log types. You can search one log group at a time, drill down into individual log events, and export your query results to CloudWatch Dashboards to view. The query capabilities of CloudWatch Logs Insights are now available for programmatic access through the AWS SDK and API, so our joint Splunk and AWS customers can perform advanced analytics and correlation with other datasets already residing in the Splunk platform.
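As a rough illustration of that programmatic access, the sketch below starts a Logs Insights query and polls until it finishes. It is not the code from the sample integration. It assumes a boto3 CloudWatch Logs client (created with `boto3.client("logs")`) is passed in, and the function name `run_insights_query` and the polling interval are hypothetical choices made for this example.

```python
import time

# States in which a Logs Insights query will make no further progress.
TERMINAL_STATES = {"Complete", "Failed", "Cancelled", "Timeout"}


def run_insights_query(client, log_group, query, start_epoch, end_epoch,
                       poll_seconds=1.0):
    """Run a query against one log group and return its rows as dicts.

    `client` is assumed to be a boto3 CloudWatch Logs client; the time
    window is given in whole seconds since the Unix epoch.
    """
    started = client.start_query(
        logGroupName=log_group,
        startTime=start_epoch,
        endTime=end_epoch,
        queryString=query,
    )
    # Queries run asynchronously, so poll until a terminal state is reached.
    while True:
        response = client.get_query_results(queryId=started["queryId"])
        if response["status"] in TERMINAL_STATES:
            break
        time.sleep(poll_seconds)
    if response["status"] != "Complete":
        raise RuntimeError("query ended with status " + response["status"])
    # Each row arrives as a list of {"field": ..., "value": ...} pairs.
    return [
        {cell["field"]: cell["value"] for cell in row}
        for row in response["results"]
    ]
```

With real credentials this might be called as `run_insights_query(boto3.client("logs"), "<your log group>", "<your query>", start, end)`. Only the aggregated rows come back, which is what keeps the raw log data in AWS.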

The Splunk sample integration with CloudWatch Logs Insights eliminates the need to move or duplicate certain data into Splunk. Customers can query and analyze CloudWatch Logs in place from a Splunk Search Head. They can express their queries in the familiar Splunk search language to correlate, join, and enrich CloudWatch Logs with the data that is already indexed in Splunk. They can selectively offload queries to CloudWatch Logs Insights when the raw data isn’t available in Splunk and, as needed, stream only the results of those queries to a metric or summary index.

How to Send Metrics to Your Own Splunk Enterprise Instance Using the Splunk and CloudWatch Logs Insights Script:

  1. Follow the Splunk documentation here to enable HTTP Event Collection and create a HEC token.
  2. On your Splunk Enterprise deployment machine, make sure port 8088 is open.
  3. Copy the code from here to a new file named “loginsights_connector.py”.
  4. In the same directory as loginsights_connector.py, put the code from here in a new file named ingest.py.
  5. Run the script with the hec_token from your deployment. If Amazon returns data, that data should now be ingested into your Splunk Enterprise deployment as metrics.
    1. Ex: python loginsights_connector.py "2018-10-08 07:00:00" "2018-11-08 09:46:08" vpcMetricBin "my.splunk.host" 35b5a4b4-0726-4dbe-984b-359b79051377
    2. python loginsights_connector.py "<search window start datetime YYYY-MM-DD HH:MM:SS>" "<search window end datetime YYYY-MM-DD HH:MM:SS>" <query name> <splunk host> <splunk hec token>
  6. Some sample Logs Insights queries and script commands for storing results in Splunk are included in loginsights_connector.py.
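
To make the steps above more concrete, here is a minimal sketch of the two conversions such a script has to perform: turning the command-line datetimes into epoch seconds and wrapping a query result as a metric event for the HTTP Event Collector. It is not the contents of loginsights_connector.py or ingest.py. The function names are hypothetical, the datetimes are assumed to be UTC, and the HEC endpoint is assumed to be the default `/services/collector` path on port 8088.

```python
import json
import urllib.request
from datetime import datetime, timezone


def to_epoch_seconds(text):
    """Convert 'YYYY-MM-DD HH:MM:SS' to epoch seconds (assumption: UTC)."""
    parsed = datetime.strptime(text, "%Y-%m-%d %H:%M:%S")
    return int(parsed.replace(tzinfo=timezone.utc).timestamp())


def build_metric_event(metric_name, value, epoch_seconds, dimensions=None):
    """Build one HEC metric event; extra fields become metric dimensions."""
    fields = dict(dimensions or {})
    fields["metric_name"] = metric_name
    fields["_value"] = float(value)  # Logs Insights returns values as strings
    return {"time": epoch_seconds, "event": "metric", "fields": fields}


def send_to_hec(host, token, events, port=8088):
    """POST a batch of events to HEC and return the HTTP status code."""
    body = "\n".join(json.dumps(event) for event in events).encode("utf-8")
    request = urllib.request.Request(
        "https://{}:{}/services/collector".format(host, port),
        data=body,
        headers={"Authorization": "Splunk " + token},
    )
    with urllib.request.urlopen(request) as response:
        return response.status
```

For example, `build_metric_event("bytes", "42", to_epoch_seconds("2018-10-08 07:00:00"), {"srcAddr": "10.0.0.1"})` produces one event ready to pass to `send_to_hec`. A deployment that uses a self-signed certificate would need additional TLS configuration, which this sketch leaves out.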
     

How to Add New Queries:

  1. Find the comment reading "valid query names" in loginsights_connector.py. Add a new constant (for example, VPC_METRIC_SRC_QUERY) with the query name you want.
  2. Add a new key-value pair to the VALID_QUERIES dictionary, associating your query name with the query you want to run.
  3. Add another key-value pair to the VALID_LOG_GROUP_NAMES dictionary, associating your query name with the required log group.
  4. Test the script by running it with the new query_name as a parameter.
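
The steps above might look roughly like the following. The names VPC_METRIC_SRC_QUERY, VALID_QUERIES, and VALID_LOG_GROUP_NAMES come from the script as described here, but everything else is an assumption made for illustration: the query name string, the query text, the log group name, and the `resolve_query` helper are all hypothetical.

```python
# valid query names
VPC_METRIC_SRC_QUERY = "vpcMetricSrc"  # hypothetical name for the new query

# Maps each query name to the Logs Insights query to run.
VALID_QUERIES = {
    # Illustrative query: total bytes per source address in VPC flow logs.
    VPC_METRIC_SRC_QUERY: (
        "stats sum(bytes) as totalBytes by srcAddr"
        " | sort totalBytes desc | limit 10"
    ),
}

# Maps each query name to the log group it must be run against.
VALID_LOG_GROUP_NAMES = {
    VPC_METRIC_SRC_QUERY: "my-vpc-flow-logs",  # hypothetical log group
}


def resolve_query(query_name):
    """Return (log group, query string) for a registered query name."""
    if query_name not in VALID_QUERIES or query_name not in VALID_LOG_GROUP_NAMES:
        raise ValueError("unknown query name: " + query_name)
    return VALID_LOG_GROUP_NAMES[query_name], VALID_QUERIES[query_name]
```

Keeping the query text and the log group in two dictionaries keyed by the same name means a lookup can fail fast when a new query was added to one dictionary but not the other.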
     

How to Check Whether Your Metrics Got Into Splunk Enterprise:

  1. Go to the search and reporting app in Splunk Enterprise.
  2. Run this command to view the contents of your metrics index:
    1. | mcatalog values(_dims), values(metric_name) where metric_name=* AND index=<your metrics index> AND sourcetype=<your new sourcetype>
  3. You should see returned values that correspond to the data fields you requested in your LogInsights query.

While this sample integration is not yet officially supported by Splunk, AWS and CloudWatch Logs continue to support the existing options for integrating with Splunk: the Splunk Add-On for AWS and Amazon Kinesis Data Firehose.

Posted by Andi Mann

Andi Mann, Chief Technology Advocate at Splunk, is an accomplished digital business executive with extensive global expertise as a strategist, technologist, innovator, marketer, and communicator. For over 30 years across five continents, Andi has built success with startups, enterprises, vendors, governments, and as a leading research analyst. Andi has been named to multiple ‘Top …’ lists and is the co-author of two popular books, 'Visible Ops – Private Cloud' and 'The Innovative CIO'.
