TIPS & TRICKS

Configure Lambda for CloudWatch Metrics Inputs for Splunk

The following is a guest blog post from Iman Roodbaei, Senior Cloud Operational Engineer at General Electric, and Vijay Kota, Splunk Consultant for General Electric.

Please use this blog post in conjunction with the "Configure CloudWatch inputs for the Splunk Add-on for AWS" documentation for any additional reference. 

Why Lambda & Splunk?

  • Fully managed service with serverless architecture and lower cost.
  • Bypasses the need to set up and manage a heavyweight forwarder.
  • Extremely scalable and reliable.
  • Well integrated with various data sources.
  • Ability to transform CloudWatch metrics data prior to sending it to Splunk.
  • Log data to identify behavioral patterns, understand application processing flows, and investigate and diagnose issues.

In our architecture, we created an S3 bucket (cloudwatchmetrics) that uses an SNS notification and an SQS queue subscription.

We set up a dead-letter queue for the SQS queue used by the input, so that invalid messages are stored there. For information about SQS dead-letter queues and how to configure them, see this AWS documentation.

We also configured the SQS visibility timeout to prevent multiple inputs from receiving and processing the same message more than once; we recommend setting the SQS visibility timeout to 5 minutes or longer.

If the visibility timeout for a message is reached before the message has been fully processed by the SQS-based S3 input, the message will reappear in the queue and be retrieved and processed again. In that case, we need to ensure that it does not result in duplicate data.

Want more information about SQS visibility timeout and how to configure it? Check out the AWS documentation, "What is Amazon Simple Queue Service?".
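
As a rough illustration of the queue settings described above, here is a minimal Python sketch that builds the attributes for a 5-minute visibility timeout and a dead-letter queue. The function names, the maxReceiveCount default of 5, and the use of boto3 are our assumptions for illustration, not part of the original setup.

```python
import json


def build_queue_attributes(dlq_arn, visibility_timeout_s=300, max_receive_count=5):
    """Build the Attributes mapping for an SQS SetQueueAttributes call.

    The helper name and the max_receive_count default are illustrative.
    """
    if visibility_timeout_s < 300:
        raise ValueError("use a visibility timeout of at least 300 seconds (5 minutes)")
    return {
        # SQS expects every attribute value as a string.
        "VisibilityTimeout": str(visibility_timeout_s),
        # After max_receive_count failed receives, SQS moves the message to the DLQ.
        "RedrivePolicy": json.dumps({
            "deadLetterTargetArn": dlq_arn,
            "maxReceiveCount": str(max_receive_count),
        }),
    }


def apply_queue_attributes(queue_url, attributes):
    """Apply the attributes to a queue (needs boto3 and AWS credentials)."""
    import boto3  # imported lazily so the builder above runs without AWS access

    boto3.client("sqs").set_queue_attributes(QueueUrl=queue_url, Attributes=attributes)
```

Calling build_queue_attributes with the ARN of your dead-letter queue returns a dictionary that can be passed to apply_queue_attributes together with the queue URL.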

Function policy:

{
  "Version": "2012-10-17",
  "Id": "default",
  "Statement": [{
    "Sid": "lambda-58e7749c-51e2-40ab-b1de-beb567661f8d",
    "Effect": "Allow",
    "Principal": {
      "Service": "events.amazonaws.com"
    },
    "Action": "lambda:InvokeFunction",
    "Resource": "arn:aws:lambda:us-west-2:1234567890123:function:GatherMetricsAndPostIntoSplunk",
    "Condition": {
      "ArnLike": {
        "AWS:SourceArn": "arn:aws:events:us-west-2:1234567890123:rule/ScheduleLambdaCloudWatchMetrics"
      }
    }
  }]
}
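
A function policy like the one above can also be attached programmatically. The following is a minimal sketch, assuming boto3 is available; the helper names are hypothetical, and the function name, rule name, and account ID are the placeholders from the policy.

```python
def build_invoke_permission(function_name, rule_arn, statement_id):
    """Build keyword arguments for the Lambda AddPermission API.

    The helper name is illustrative; the keys match boto3's add_permission.
    """
    return {
        "FunctionName": function_name,
        "StatementId": statement_id,
        "Action": "lambda:InvokeFunction",
        # Allow the CloudWatch Events service to invoke the function ...
        "Principal": "events.amazonaws.com",
        # ... but only when the call originates from this scheduled rule.
        "SourceArn": rule_arn,
    }


def grant_invoke_permission(permission):
    """Attach the statement to the function (needs boto3 and AWS credentials)."""
    import boto3  # imported lazily so the builder above runs without AWS access

    boto3.client("lambda").add_permission(**permission)


permission = build_invoke_permission(
    function_name="GatherMetricsAndPostIntoSplunk",
    rule_arn="arn:aws:events:us-west-2:1234567890123:rule/ScheduleLambdaCloudWatchMetrics",
    statement_id="allow-scheduled-rule",  # hypothetical statement ID
)
```

Restricting the permission with SourceArn mirrors the ArnLike condition in the policy, so only the scheduled rule can trigger the function.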

 

Execution Role:
{
  "roleName": "lambda_splunk_elb",
  "policies": [{
      "document": {
        "Version": "2012-10-17",
        "Statement": [{
            "Sid": "LogThings",
            "Effect": "Allow",
            "Action": [
              "logs:CreateLogGroup",
              "logs:CreateLogStream",
              "logs:PutLogEvents"
            ],
            "Resource": "arn:aws:logs:*:*:*"
          },
          {
            "Sid": "SQSThings",
            "Effect": "Allow",
            "Action": [
              "sqs:ListQueues",
              "sqs:GetQueue*"
            ],
            "Resource": "arn:aws:sqs:*:1234567890123:*"
          },
          {
            "Sid": "SNSThings",
            "Effect": "Allow",
            "Action": [
              "sns:Publish"
            ],
            "Resource": "arn:aws:sns:*:1234567890123:*"
          }
        ]
      },
      "name": "oneClick_lambda_basic_execution_1492625956312",
      "type": "inline"
    },
    {
      "document": {
        "Version": "2012-10-17",
        "Statement": [{
          "Effect": "Allow",
          "Action": [
            "cloudformation:DescribeChangeSet",
            "cloudformation:DescribeStackResources",
            "cloudformation:DescribeStacks",
            "cloudformation:GetTemplate",
            "cloudformation:ListStackResources",
            "cloudwatch:*",
            "cognito-identity:ListIdentityPools",
            "cognito-sync:GetCognitoEvents",
            "cognito-sync:SetCognitoEvents",
            "dynamodb:*",
            "ec2:DescribeSecurityGroups",
            "ec2:DescribeSubnets",
            "ec2:DescribeVpcs",
            "events:*",
            "iam:GetPolicy",
            "iam:GetPolicyVersion",
            "iam:GetRole",
            "iam:GetRolePolicy",
            "iam:ListAttachedRolePolicies",
            "iam:ListRolePolicies",
            "iam:ListRoles",
            "iam:PassRole",
            "iot:AttachPrincipalPolicy",
            "iot:AttachThingPrincipal",
            "iot:CreateKeysAndCertificate",
            "iot:CreatePolicy",
            "iot:CreateThing",
            "iot:CreateTopicRule",
            "iot:DescribeEndpoint",
            "iot:GetTopicRule",
            "iot:ListPolicies",
            "iot:ListThings",
            "iot:ListTopicRules",
            "iot:ReplaceTopicRule",
            "kinesis:DescribeStream",
            "kinesis:ListStreams",
            "kinesis:PutRecord",
            "kms:ListAliases",
            "lambda:*",
            "logs:*",
            "s3:*",
            "sns:ListSubscriptions",
            "sns:ListSubscriptionsByTopic",
            "sns:ListTopics",
            "sns:Publish",
            "sns:Subscribe",
            "sns:Unsubscribe",
            "sqs:ListQueues",
            "sqs:SendMessage",
            "tag:GetResources",
            "xray:PutTelemetryRecords",
            "xray:PutTraceSegments"
          ],
          "Resource": "*"
        }]
      },
      "name": "AWSLambdaFullAccess",
      "id": "ANPAI6E2CYYMI4XI7AA5K",
      "type": "managed",
      "arn": "arn:aws:iam::aws:policy/AWSLambdaFullAccess"
    }
  ]
}

 

Modification in Splunk:

Add these lines to props.conf:

[aws:cloudwatch:metrics]
SHOULD_LINEMERGE = False
pulldown_type = true
INDEXED_EXTRACTIONS = JSON
ADD_EXTRA_TIME_FIELDS = False
KV_MODE = none
TIMESTAMP_FIELDS = metric_timestamp
#TIME_FORMAT = %s.%Q
TIME_FORMAT = %s
category = Metrics
description = JSON format for metrics. Must have metric_timestamp, metric_name, and _value fields.
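
To make the stanza above more concrete, here is a minimal Python sketch of how one event matching these settings could be built: a JSON object whose metric_timestamp is in epoch seconds (matching TIME_FORMAT = %s), plus metric_name and _value. The function name and the extra InstanceId field are assumptions for illustration, not the authors' actual Lambda code.

```python
import json
from datetime import datetime, timezone


def build_metric_event(metric_name, value, timestamp, dimensions=None):
    """Serialize one metric data point as a JSON line.

    `dimensions` is an optional dict of extra fields (hypothetical example:
    InstanceId). The three required fields always take precedence.
    """
    event = dict(dimensions or {})
    event.update({
        # Epoch seconds, so Splunk can parse it with TIME_FORMAT = %s.
        "metric_timestamp": int(timestamp.timestamp()),
        "metric_name": metric_name,
        "_value": value,
    })
    return json.dumps(event)


line = build_metric_event(
    "CPUUtilization",
    12.5,
    datetime.fromtimestamp(1500000000, tz=timezone.utc),
    dimensions={"InstanceId": "i-0123456789abcdef0"},  # hypothetical instance ID
)
```

Each such line would be written as a separate event, since SHOULD_LINEMERGE is set to False.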

 

 

Current Metrics List:

namespace = 'AWS/EC2'
metric_list = [
  "CPUUtilization",
  "DiskReadBytes",
  "DiskWriteBytes",
  "DiskReadOps",
  "DiskWriteOps",
  "NetworkOut",
  "NetworkIn",
  "NetworkPacketsOut",
  "NetworkPacketsIn",
  "StatusCheckFailed",
  "StatusCheckFailed_Instance",
  "StatusCheckFailed_System",
]

# Load balancer metrics, mapped to their CloudWatch units (dict name is illustrative).
elb_metric_units = {
  "ProcessedBytes": "Bytes",
  "NewFlowCount": "Count",
  "ActiveFlowCount": "Count",
  "TCP_Client_Reset_Count": "Count",
  "TCP_Target_Reset_Count": "Count",
  "TCP_ELB_Reset_Count": "Count",
  "ConsumedLCUs": "Count",
  "HealthyHostCount": "Count",
  "UnHealthyHostCount": "Count",
  "RequestCount": "Count",
  "HTTPCode_Target_5XX_Count": "Count",
  "HTTPCode_Target_4XX_Count": "Count",
  "HTTPCode_Target_2XX_Count": "Count",
  "TargetResponseTime": "Seconds",
  "TargetConnectionErrorCount": "Count",
  "HTTPCode_ELB_4XX_Count": "Count",
  "HTTPCode_ELB_5XX_Count": "Count",
  "HTTPCode_ELB_2XX_Count": "Count",
}
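
For readers wondering how a list of metric names might be turned into CloudWatch queries, here is a minimal sketch. It assumes boto3's get_metric_statistics call, a 5-minute window, and the Average statistic; the function names and those defaults are our assumptions, not necessarily what the original Lambda function does.

```python
from datetime import datetime, timedelta, timezone


def build_stat_requests(namespace, metric_names, dimensions, end_time,
                        window_minutes=5, period_s=300):
    """Build one GetMetricStatistics request per metric name.

    The helper name, window, period, and statistic are illustrative choices.
    """
    start_time = end_time - timedelta(minutes=window_minutes)
    return [
        {
            "Namespace": namespace,
            "MetricName": name,
            "Dimensions": dimensions,
            "StartTime": start_time,
            "EndTime": end_time,
            "Period": period_s,
            "Statistics": ["Average"],
        }
        for name in metric_names
    ]


def fetch_datapoints(request):
    """Run one request against CloudWatch (needs boto3 and AWS credentials)."""
    import boto3  # imported lazily so the builder above runs without AWS access

    response = boto3.client("cloudwatch").get_metric_statistics(**request)
    return response["Datapoints"]


requests_to_send = build_stat_requests(
    "AWS/EC2",
    ["CPUUtilization", "NetworkIn"],
    [{"Name": "InstanceId", "Value": "i-0123456789abcdef0"}],  # hypothetical instance
    datetime(2017, 4, 19, 18, 0, tzinfo=timezone.utc),
)
```

Keeping the request building separate from the API call makes it easy to check the parameters locally before the function is scheduled.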

 

Sample data that our Lambda function is sending to S3:

 

 

Here are sample dashboards that were built based on the mstats query:
