
Introducing Splunk AWS Serverless Apps

In the midst of announcements at AWS re:Invent 2017, AWS introduced the AWS Serverless Application Repository, which enables AWS customers to easily discover, deploy, and publish serverless applications for common use cases like data processing, stream processing, IoT device telemetry, and more.

We’re very excited about this release, and we’re proud to be a launch contributor by providing several free serverless applications. With a point-and-click flow, end-users can immediately start streaming data to Splunk Enterprise or Splunk Cloud from various data sources such as Amazon S3, Amazon DynamoDB Streams, Amazon Kinesis Streams, and AWS IoT via Splunk HTTP Event Collector (HEC) for near real-time and in-depth analysis.


AWS-ome…How Can I Get Started?

If you don’t have access to the AWS Serverless Application Repository, sign up for the private Preview.

Once accepted into the Preview, go to your AWS Lambda console, select ‘Serverless Application Repository’ when creating a new function, then search for ‘splunk’. You’ll see several purpose-built serverless apps.

Select one of the apps, such as splunk-elb-application-access-logs-processor, which processes access logs of AWS ELB Application load balancers as they get written to Amazon S3 and then streams them to Splunk. Once selected, you’ll be prompted with a simple form of basic configuration parameters, namely:

  • Enter a name for a new Amazon S3 bucket where access logs for your application load balancers will be stored

  • Enter an optional prefix and suffix to limit processing to objects under a particular path or with a particular file extension (‘.log.gz’ is initially set as the suffix to match access log filenames of application load balancers)

  • Enter the URL endpoint and token of your Splunk Enterprise or Splunk Cloud HTTP Event Collector (HEC); a quick way to verify these values is sketched just after this list
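
If you want to sanity-check the HEC endpoint and token before deploying, a small test request is enough. Below is a minimal Node.js sketch that posts a test event to HEC; the hostname and token are placeholders, not values from this walkthrough.

```javascript
// Minimal HEC smoke test (Node.js). Replace hostname and token with
// your own Splunk Enterprise or Splunk Cloud HEC values (placeholders here).
const https = require('https');

const payload = JSON.stringify({ event: 'HEC smoke test' });

const req = https.request({
  hostname: 'splunk.example.com', // placeholder
  port: 8088,                     // default HEC port
  path: '/services/collector',
  method: 'POST',
  headers: {
    Authorization: 'Splunk 11111111-2222-3333-4444-555555555555', // placeholder token
    'Content-Type': 'application/json',
    'Content-Length': Buffer.byteLength(payload),
  },
  // For test setups with a self-signed certificate only:
  // rejectUnauthorized: false,
}, (res) => {
  console.log('HEC responded with status', res.statusCode); // 200 means the event was accepted
});

req.on('error', (err) => console.error('HEC request failed:', err));
req.write(payload);
req.end();
```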

The serverless app will automatically create & configure all necessary resources to stream the data to your existing Splunk deployment (a simplified template sketch follows this list), including:

  • An Amazon S3 bucket to store the access logs, with object notifications configured as the Lambda event trigger

  • A Lambda function to process the access logs: uncompressing the records, parsing fields, and streaming them to Splunk. The function pre-packages the client JavaScript library for Splunk HEC

  • An AWS Lambda resource policy to allow the Amazon S3 bucket to trigger the Lambda function

  • An AWS Lambda IAM execution role & policies to give the Lambda function the necessary permissions
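
For the curious, here’s roughly what that looks like as a SAM template. This is a simplified, hand-written sketch for illustration; the parameter names, resource names, and runtime below are assumptions, not the published template (which is open-sourced, as noted later in this post).

```yaml
AWSTemplateFormatVersion: '2010-09-09'
Transform: AWS::Serverless-2016-10-31
Parameters:
  S3BucketName:
    Type: String
  SplunkHecUrl:
    Type: String
  SplunkHecToken:
    Type: String
Resources:
  LogsBucket:
    Type: AWS::S3::Bucket
    Properties:
      BucketName: !Ref S3BucketName
  LogProcessor:
    Type: AWS::Serverless::Function
    Properties:
      Handler: index.handler
      Runtime: nodejs6.10        # assumed; use whatever the app actually targets
      CodeUri: ./src             # placeholder path to the packaged function code
      Environment:
        Variables:
          SPLUNK_HEC_URL: !Ref SplunkHecUrl
          SPLUNK_HEC_TOKEN: !Ref SplunkHecToken
      Events:
        LogUpload:               # SAM wires the S3 notification, the invoke
          Type: S3               # permission, and the execution role for you
          Properties:
            Bucket: !Ref LogsBucket
            Events: 's3:ObjectCreated:*'
            Filter:
              S3Key:
                Rules:
                  - Name: suffix
                    Value: '.log.gz'
```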

Once the app is successfully deployed, refer to Enable Access Logging to have your application load balancers store access logs in the new S3 bucket you specified. As soon as an access log is written to that bucket, it will be parsed and forwarded to Splunk by the Lambda function. ELB publishes a log file for each load balancer every 5 minutes, so allow an initial delay of a few minutes before you start seeing data in Splunk Enterprise.
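
To make that flow concrete, here is a heavily simplified sketch of what such a processor function does: receive the S3 object notification, fetch and gunzip the log file, and forward its lines to HEC. This is illustrative Node.js under our own assumptions (including the environment variable names), not the actual code packaged with the app, which handles batching, retries, and field parsing more carefully.

```javascript
// Heavily simplified sketch of the processing flow: S3 object notification
// in, gunzip, split into log lines, forward to Splunk HEC. Illustrative only;
// the environment variable names are assumptions.
const AWS = require('aws-sdk');
const zlib = require('zlib');
const https = require('https');
const url = require('url');

const s3 = new AWS.S3();

exports.handler = (event, context, callback) => {
  // The S3 notification tells us which access log object was just written
  const bucket = event.Records[0].s3.bucket.name;
  const key = decodeURIComponent(event.Records[0].s3.object.key.replace(/\+/g, ' '));

  s3.getObject({ Bucket: bucket, Key: key }, (err, data) => {
    if (err) return callback(err);

    // ALB access logs are gzip-compressed, one log entry per line
    const lines = zlib.gunzipSync(data.Body).toString('utf8').trim().split('\n');

    // HEC accepts a batch of events concatenated in a single POST body
    const payload = lines.map((line) => JSON.stringify({ event: line })).join('');

    const hec = url.parse(process.env.SPLUNK_HEC_URL); // e.g. https://host:8088/services/collector
    const req = https.request({
      hostname: hec.hostname,
      port: hec.port || 443,
      path: hec.path,
      method: 'POST',
      headers: { Authorization: `Splunk ${process.env.SPLUNK_HEC_TOKEN}` },
    }, (res) => callback(null, `HEC status: ${res.statusCode}`));

    req.on('error', callback);
    req.write(payload);
    req.end();
  });
};
```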

Below is the ELB Traffic Analysis dashboard from the popular Splunk App for AWS, where we analyze and visualize the traffic of an application load balancer using the access logs delivered to Splunk via the serverless app we just deployed. We can quickly gain a lot of insight, from average request round-trip time to request volume, both as request count and as ingress/egress data volume. We can also visualize geographically where requests are coming from, and monitor for errors.

In our test environment, we terminated one of the backend instances, which explains the spike in errors in the Error Count timechart below:

As anyone familiar with Splunk Enterprise knows, we can immediately drill down to the actual raw events by clicking on that data point to investigate further. Below we see the 12 failed requests with ELB log entry fields automatically extracted (via the pre-installed Splunk Add-on for AWS). The breakdown of ELB 50x error codes indicates an invalid response from the upstream server. We can also confirm that all of these failed requests had the same target, 192.168.1.74, which is the EC2 instance that was terminated.
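
If you’d rather start from the search bar than the dashboard, a search along these lines surfaces the same picture. Note that the sourcetype and field names below (aws:elb:accesslogs, elb_status_code, target) are assumptions based on a typical Splunk Add-on for AWS setup; check the field extractions in your own environment.

```
sourcetype=aws:elb:accesslogs elb_status_code=50*
| stats count by target, elb_status_code
```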

To summarize, we just went over the 3-step end-user flow:

  1. Select Splunk Serverless App

  2. Configure and deploy

  3. Search and analyze your data in Splunk Enterprise...and profit!

If you want to dive deeper beyond the AWS console UI, all Splunk serverless apps are open-sourced, including the underlying CloudFormation templates, which follow the AWS Serverless Application Model (AWS SAM). We welcome your contributions to help extend these apps to additional data sources and use cases that are important to you.

What’s the Difference Between a Serverless App and a Lambda Blueprint?

Splunk Lambda blueprints have been available directly from the AWS Lambda console for some time. As a launch partner, Splunk first released two blueprints in 2015. As serverless adoption grew, Splunk added more purpose-built blueprints based on customer requests for different data sources such as S3 & ELB. Splunk Lambda blueprints are now used by many customers to ingest TBs of data into Splunk.

However, AWS admins still have to manually configure other critical pieces, like IAM & Lambda policies for permissions and the Lambda event trigger, as documented in our step-by-step walkthrough blog post "How to stream AWS CloudWatch Logs to Splunk." Like Splunk, the AWS Lambda team is constantly listening to its customers and partners. In response, AWS created a packaging format (the AWS Serverless Application Model) and now a publishing and sharing service (the AWS Serverless Application Repository), enabling agile & scalable development and deployment of serverless applications. In collaboration with AWS, Splunk has ported six of its existing Lambda blueprints into serverless apps, with more to come. Lambda blueprints remain available for those who want to operate at the level of an individual Lambda function and modify its code inline.
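
To make the difference tangible, here’s a sketch of the manual wiring a blueprint user performs that the serverless app now creates for you, using the AWS SDK for JavaScript. The function name, bucket name, and ARNs are placeholders.

```javascript
// Manual wiring that Lambda blueprints leave to the admin, sketched with
// the AWS SDK for JavaScript. The serverless app creates both pieces for you.
const AWS = require('aws-sdk');
const lambda = new AWS.Lambda();
const s3 = new AWS.S3();

// 1. Resource-based policy: allow the S3 bucket to invoke the function
lambda.addPermission({
  FunctionName: 'my-elb-log-processor',          // placeholder
  StatementId: 'AllowS3Invoke',
  Action: 'lambda:InvokeFunction',
  Principal: 's3.amazonaws.com',
  SourceArn: 'arn:aws:s3:::my-elb-access-logs',  // placeholder bucket ARN
}).promise()
  // 2. Event trigger: notify the function when a new access log lands
  .then(() => s3.putBucketNotificationConfiguration({
    Bucket: 'my-elb-access-logs',
    NotificationConfiguration: {
      LambdaFunctionConfigurations: [{
        LambdaFunctionArn: 'arn:aws:lambda:us-east-1:123456789012:function:my-elb-log-processor',
        Events: ['s3:ObjectCreated:*'],
        Filter: { Key: { FilterRules: [{ Name: 'suffix', Value: '.log.gz' }] } },
      }],
    },
  }).promise())
  .then(() => console.log('Trigger wired up'))
  .catch(console.error);
```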

What Exactly Happens When I Deploy a Serverless App?

A Serverless App follows the AWS SAM format: it’s a CloudFormation template that packages all the resources an AWS customer needs to deploy a serverless architecture. Customers pay only for the actual resources created, such as the Lambda function, at standard pay-per-use AWS pricing. Refer to the SAM GitHub repo for more details.

Below is an example of the CloudFormation stack that is created under the hood when deploying the Splunk serverless app for processing access logs of AWS ELB Application load balancers.

What’s Next?

AWS Serverless Apps for Splunk are about time to value and ease of management of streaming data. The goal is to further enable Splunk and AWS customers to leverage the flexibility, scalability and cost-effectiveness of serverless computing. Check out this past .conf2016 session with Experian, where they reduced data ingestion latency from 15 minutes to under 5 seconds, with an 80th-percentile latency of just 1.2 seconds. Mike Sclimenti from Experian and our own Matt Poland went over how to tune Lambda & Splunk HTTP Event Collector for large-scale data ingestion. We can’t wait to see how you’ll use these apps as part of your own big data ingestion pipelines.

To further help you trace, prioritize and aggregate this data, explore the helpful Data Insights Tool, jointly created by AWS Marketplace and Splunk. Finally, to analyze and deconstruct all this data from AWS services, make sure to check out the various Splunk solutions available in AWS Marketplace including Splunk Insights for AWS Cloud Monitoring.

Sound interesting? Sign up for the AWS Serverless Application Repository Preview.

Roy Arsan, Solutions Architect, Splunk
Tarik Makota, Solutions Architect, AWS Partner Network

Posted by Roy Arsan

Roy Arsan is a Senior Solutions Architect on the Global Strategic Alliances team. He has a background in product development & cloud architecture, building solutions to accelerate customers’ success in their cloud journey. He has architected Splunk solutions on several cloud providers, including AWS & Azure. He’s also the co-author of the AWS Lambda blueprints & serverless apps for Splunk. He lives in Austin, Texas and graduated from the University of Michigan with an M.S. in Computer Science Engineering.
