AWS Lambda Monitoring with Splunk Infrastructure Monitoring

We’ve seen a dramatic increase in interest from our customers in serverless computing, and in AWS Lambda in particular. Given the benefits that serverless can provide, this interest isn’t terribly surprising: you pay only for what you use, scale up or down immediately to match supply with demand, and avoid operating any server infrastructure at all.

As with any new runtime environment, reaping its promised benefits often hinges on how well the environment integrates with your existing developer toolchain and, in our case, on how easily it can be monitored.

With that in mind, we partnered with several of our customers to develop monitoring for Lambda around their real-world use cases. We helped them get at-a-glance information about how their organization is using Lambda, see how it is performing for them in real time, and, most importantly, get the same high-resolution, low-latency custom application and business metrics for their Lambda functions that they are accustomed to for their other application code.

CloudWatch and the Lambda Navigator

Our starting point for Lambda monitoring is the same as for any other AWS service: CloudWatch metrics. Using our standard AWS integration, we poll the relevant CloudWatch metrics and make them available in both our Infrastructure Navigator and in several new built-in dashboards.

As with any other service that Infrastructure Navigator lets you visualize, we give you advanced filtering capabilities for drilling down to the subset of functions you are interested in, grouping them using a variety of standard dimensions or the custom AWS tags that you’ve applied, and changing the time span that you are looking at.

When you select "All Lambda Functions", a summary dashboard appears below. It provides point-in-time information about your functions and also shows trends, so you can see a breakdown of functions by account or by region, or invocations and errors by function. The illustrations below show the dashboards in a larger format, available on the "Dashboards" page.

When you select a particular function, we’ll drill down to a dashboard about that specific function.

Splunk Wrapper

As useful as the basic CloudWatch metrics can be, there are often times when you need to capture details about invocations or errors in a more timely, more granular fashion. To address this, we’ve made a wrapper available that reports to the Splunk service the count of invocations and errors, the execution duration, and whether an invocation was affected by a cold start.
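To make the idea concrete, here is a minimal sketch of what such a wrapper can look like in Python. It is not the Splunk wrapper itself: the `emits_metrics` decorator, the `send` callable, and the metric names are all hypothetical, chosen only to illustrate how invocations, errors, duration, and cold starts can be captured around a handler.

```python
import functools
import time

# Module-level state survives between invocations that reuse the same
# execution environment, so this is True only for the first call after
# a cold start.
_is_cold_start = True


def emits_metrics(send):
    """Hypothetical decorator that reports basic metrics for a Lambda handler.

    `send` is an assumed callable taking (metric_name, value, dimensions).
    """
    def decorator(handler):
        @functools.wraps(handler)
        def wrapper(event, context):
            global _is_cold_start
            dims = {"function_name": getattr(context, "function_name", "unknown")}
            if _is_cold_start:
                send("function.cold_starts", 1, dims)
                _is_cold_start = False
            send("function.invocations", 1, dims)
            start = time.monotonic()
            try:
                return handler(event, context)
            except Exception:
                # Count the error, then let it propagate to Lambda unchanged.
                send("function.errors", 1, dims)
                raise
            finally:
                elapsed_ms = (time.monotonic() - start) * 1000.0
                send("function.duration_ms", elapsed_ms, dims)
        return wrapper
    return decorator
```

Keeping the cold-start flag at module level relies on Lambda reusing the execution environment between invocations; a real wrapper would also need to deliver the datapoints before the function is frozen.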

Today, we have wrapper code that supports Lambda functions written in Java, Node.js, and Python, and we’ll be adding more language support soon. These wrappers are currently available on GitHub and will be made available more broadly soon.

In addition, the wrappers provide an easy mechanism to instrument your code for the custom metrics that matter to you. It’s a simple matter of adding a few additional lines of code within your function to capture and send out those metrics.
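As a rough illustration of those few additional lines, the sketch below records two business metrics inside a handler and sends them at the end of the invocation. The `MetricBuffer` helper, the `send` parameter, and the metric and dimension names are assumptions made for this example, not the API of the Splunk wrappers.

```python
import time
from dataclasses import dataclass, field


@dataclass
class Datapoint:
    metric: str
    value: float
    metric_type: str  # "gauge" or "counter"
    dimensions: dict = field(default_factory=dict)
    timestamp_ms: int = 0


class MetricBuffer:
    """Hypothetical helper that collects custom datapoints during an invocation."""

    def __init__(self):
        self.datapoints = []

    def _add(self, metric, value, metric_type, dimensions):
        now_ms = int(time.time() * 1000)
        self.datapoints.append(
            Datapoint(metric, value, metric_type, dict(dimensions or {}), now_ms)
        )

    def gauge(self, metric, value, dimensions=None):
        self._add(metric, value, "gauge", dimensions)

    def counter(self, metric, value=1, dimensions=None):
        self._add(metric, value, "counter", dimensions)

    def flush(self, send):
        """Hand all buffered datapoints to `send` and empty the buffer."""
        batch, self.datapoints = self.datapoints, []
        if batch:
            send(batch)
        return len(batch)


metrics = MetricBuffer()


def handler(event, context, send=print):
    # `send` stands in for whatever actually ships datapoints (assumption).
    orders = event.get("orders", [])
    dims = {"region": event.get("region", "unknown")}
    try:
        total = sum(order["amount"] for order in orders)
        # The "few additional lines": one counter and one gauge.
        metrics.counter("orders.processed", len(orders), dims)
        metrics.gauge("orders.total_amount", total, dims)
        return {"processed": len(orders), "total": total}
    finally:
        # Flush before returning so no datapoints are left in the buffer.
        metrics.flush(send)
```

Flushing in a `finally` block means the datapoints recorded so far are still sent when the handler raises.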

We’ve also gone to great lengths to make sure those metrics can get to Splunk Infrastructure Monitoring with the fine resolution and low latency that you have come to expect from Splunk. Other monitoring approaches rely on writing the custom data you want to CloudWatch logs, scraping those logs on a periodic basis, and then waiting for the resulting metrics to make their way through a slow pipeline, an approach that results in many minutes of extra lag before your metrics are visible. In contrast, Splunk has deployed additional infrastructure within each of the AWS regions where Lambda functions are available. This minimizes both the additional function execution time required to send out the metrics and the lag associated with getting a metric datapoint into the Splunk service.

All the Usual Splunk Benefits

As always, getting metrics into Splunk Infrastructure Monitoring is just the starting point for monitoring your functions. Once the data is available, not only will you get the dashboards we’ve pre-built for you, but you can also use the data in custom charts with advanced SignalFlow analytics, or set up real-time alerts that will warn you about issues before they begin impacting your customers.

We’ve Made It Easy to Get Started

AWS Lambda offers big benefits to development teams that just want a place to run their code without having to worry about, well, anything else. Splunk is the first vendor to deliver a truly real-time monitoring service that matches the speed, scale, and variety of data you need insights into. That’s why we encourage you to try Splunk Infrastructure Monitoring alongside your new Lambda functions.

It’s as easy as 1) signing up for a free 14-day trial and 2) installing the Lambda integration from our Integrations page, or wrapping your Lambda functions with our Lambda wrapper, available on GitHub.

As always, if you have any questions or need help with your monitoring projects, let us know.

Posted by Patrick Lin

Patrick leads Product and Partnerships at SignalFx. Previously, he held senior leadership positions in Product Management at Jive Software and VMware.