APIs have existed nearly as long as websites themselves. But because APIs are primarily consumed by programs instead of people, they tend to be less visible than applications or sites directly accessed by users. The result: APIs often receive far less attention from a site reliability engineering (SRE) and monitoring perspective than other parts of application environments.
Indeed, virtually every enterprise SRE and IT team today monitors its application services and the infrastructure that hosts them. APIs, however, don’t necessarily feature within monitoring workflows.
This oversight creates tremendous risk for many businesses, which have become centrally dependent on APIs to deliver the applications that their customers expect. If you lack visibility into the availability and responsiveness of the APIs and other third-party services that your applications depend on to do their jobs, you can’t guarantee a positive customer experience.
Fortunately, you can incorporate APIs into your monitoring without having to rebuild your workflows from scratch. To provide guidance, this article walks you through how to manage the performance of APIs and other third-party services. We’ll cover:
- How APIs work
- The unique monitoring challenges APIs present
- Best practices for maximizing API monitoring as part of a broader performance management strategy
How APIs work today
Short for application programming interface, an API is a service that allows an application to interact with other applications or services. APIs define how different applications or services can exchange requests and data with each other. They make it possible to:
- Ingest data from a variety of other services
- Authenticate users to one application using a directory service that is hosted elsewhere
- Sync calendar events between different applications
In other words, APIs are the glue that binds together the various services and resources that make up modern applications.
Because APIs play such a central role in uniting the different parts of modern applications, monitoring them for performance is critical. A problem with an API means a problem integrating one layer of an application with another, which can quickly translate to major disruptions.
APIs and application performance
Businesses adopt APIs on a massive scale to connect and integrate disparate applications and services. Whether you deploy a variety of applications to power your business or you integrate your applications with third-party platforms, APIs are the foundation that holds your entire user journey together.
Users may never see APIs directly in the way that they see other parts of your application. Nonetheless, a problem with any part of any API becomes a roadblock for the user. To guarantee a positive user experience, then, businesses must be able to guarantee the performance of several aspects of their APIs.
- API availability. First and foremost, APIs must be available. If an API service stops responding, it’s critical for your team to know so that they can direct requests to an alternative service or, at a minimum, notify users of the disruption.
- API responsiveness. The speed at which APIs handle requests plays an important role in the overall user experience. You need to be able to track responsiveness and identify bottlenecks to ensure your users don’t wait longer than they expect when using your applications.
- API security. APIs must be secure. Anomalies in API behavior that could signal misuse of the API service must be addressed to protect against the risk of leaking data to unauthorized parties, or service disruptions caused by malicious users.
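Of these, availability and responsiveness are the easiest to put a number on. As a minimal sketch, assuming you already record one observation per API call (the `Observation` record and both function names are hypothetical, not taken from any particular tool), they could be summarized like this:

```python
import math
from dataclasses import dataclass


@dataclass
class Observation:
    """One recorded API call (hypothetical record shape)."""
    ok: bool            # did the API answer successfully?
    latency_ms: float   # time until the response or failure was observed


def availability(observations: list[Observation]) -> float:
    """Fraction of calls that succeeded, between 0.0 and 1.0."""
    if not observations:
        raise ValueError("no observations to summarize")
    return sum(o.ok for o in observations) / len(observations)


def latency_percentile(observations: list[Observation], pct: float) -> float:
    """Nearest-rank percentile of latency, over successful calls only."""
    latencies = sorted(o.latency_ms for o in observations if o.ok)
    if not latencies:
        raise ValueError("no successful calls to measure")
    rank = max(1, math.ceil(pct / 100 * len(latencies)))
    return latencies[rank - 1]
```

Tracking a high percentile rather than the average is a deliberate choice here: a handful of very slow calls can hide behind a healthy-looking mean.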
(Read about application performance monitoring.)
The challenges of API monitoring
Many modern SRE and IT teams recognize the importance of API monitoring. But they still fail to incorporate APIs adequately into their performance management strategy, due to the special challenges associated with API monitoring.
Let’s look at some of these challenges.
APIs hide in the background
Unlike other application components, APIs are machine-to-machine services, rather than resources that users see directly. That makes it harder in some respects to detect problems associated with them. An application failure could be the result of a problem with an API on which the application depends, or it could just be a problem with the application itself.
You need to dig deeper into your monitoring data to understand when APIs are not working properly.
Complex API service mappings
There is rarely a one-to-one relationship between APIs and services. Instead, multiple APIs may serve a single service, and a single API usually serves multiple services. This means that there is a complex web of relationships to map when trying to understand how an API problem translates to an application problem.
Without the ability to map and interpret these complex relationships, it’s difficult to pinpoint the source of API performance issues or understand their impact on the overall user experience.
Monitoring APIs requires monitoring flow
Similarly, no API is an island. APIs typically connect to other APIs, and it’s only by monitoring the flow of requests across APIs that you can accurately identify bottlenecks and other performance issues.
For example, imagine a shopping app that lets customers find and order items. The application might use a variety of APIs:
- A product search API to display a listing of products
- A product description API to provide details on a product that the customer clicks
- An item availability API to determine whether the product is in stock
- A purchase API to let the customer buy the product
These APIs don’t call each other directly, but they still depend on each other because the data provided by one API needs to be ingested into another API as the customer journey continues.
From a monitoring perspective, then, teams need to be able to understand how this flow of data impacts performance. Knowing simply that the item availability API performs adequately under generic conditions doesn’t guarantee that it will respond properly when it receives the actual product data produced by a real-world request to the product description API. Teams need to test the entire flow between different APIs, using a variety of variables, to monitor performance reliably.
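To make the idea of testing a flow concrete, here is a rough sketch for the shopping example. It assumes each API call can be wrapped in a plain Python function. The `run_flow` helper and the four stand-in functions are hypothetical; in a real check, each stand-in would call the actual API.

```python
import time
from typing import Any, Callable

# Each step takes the previous step's output and returns its own output.
Step = tuple[str, Callable[[Any], Any]]


def run_flow(steps: list[Step], initial_input: Any) -> list[dict]:
    """Run steps in order, piping each output into the next step.

    Stops at the first failing step so the report shows where the flow broke.
    """
    report = []
    data = initial_input
    for name, call in steps:
        started = time.perf_counter()
        error = None
        try:
            data = call(data)
        except Exception as exc:  # a real check might catch narrower errors
            error = repr(exc)
        elapsed_ms = (time.perf_counter() - started) * 1000
        report.append({"step": name, "ok": error is None,
                       "elapsed_ms": elapsed_ms, "error": error})
        if error is not None:
            break
    return report


# Stand-ins for the four APIs (assumptions for illustration only).
def search(query):
    return ["sku-1", "sku-2"]

def describe(skus):
    return {"sku": skus[0], "name": "Example item"}

def check_stock(product):
    return {**product, "in_stock": True}

def purchase(product):
    if not product["in_stock"]:
        raise RuntimeError("out of stock")
    return {"order_for": product["sku"]}


report = run_flow(
    [("search", search), ("describe", describe),
     ("availability", check_stock), ("purchase", purchase)],
    "running shoes",
)
```

Because each step receives the real output of the previous one, a slow or failing step shows up in context rather than in isolation. Running the same flow with different starting inputs covers the "variety of variables" mentioned above.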
Multiple API architectures
APIs come in a variety of forms, with each standard or protocol using different methods to structure and exchange data. This diversity adds a layer of complexity to collecting data about API performance. There is no universal method that works for all types of APIs under all conditions.
Monitoring third-party APIs can be particularly challenging because the external platform may limit how much information your team can collect about them. They may not expose logs and metrics, for example. This lack of full visibility means teams are restricted to the information they can collect from the receiving end of the API.
They may also need to rely on synthetic interactions with the API to generate performance data that is not available to them through production data streams.
Thus, there is a solution to the challenges of third-party APIs — synthetic interaction — but it requires teams to take an extra step that they may overlook.
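As an illustration, a single synthetic interaction can be as simple as the sketch below. It uses only Python’s standard library and records what is visible from the calling side: status code, elapsed time, and any connection error. The function name and the shape of the returned dictionary are assumptions, and a real synthetic monitor would also run on a schedule and from multiple locations.

```python
import time
import urllib.error
import urllib.request


def synthetic_check(url: str, timeout_s: float = 5.0) -> dict:
    """Send one GET request and record what the client side can observe."""
    started = time.perf_counter()
    status = None
    error = None
    try:
        with urllib.request.urlopen(url, timeout=timeout_s) as response:
            response.read()        # include the body download in the timing
            status = response.status
    except urllib.error.HTTPError as exc:
        status = exc.code          # the API answered, but with an error status
    except (urllib.error.URLError, OSError) as exc:
        error = repr(exc)          # DNS failure, refused connection, timeout
    elapsed_ms = (time.perf_counter() - started) * 1000
    return {
        "url": url,
        "status": status,
        "ok": status is not None and 200 <= status < 400,
        "elapsed_ms": elapsed_ms,
        "error": error,
    }
```

Note that `HTTPError` is caught before `URLError` because it is a subclass of it. This lets the check distinguish "the API responded with an error" from "the API could not be reached at all".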
Best practices for API performance management
The challenges described above make API performance management more difficult in some ways than managing other parts of your application environment. Nonetheless, there are several steps that SRE and IT teams can take to understand and optimize the performance of the APIs they depend on, even when they lack as much information as they would ideally have.
Use pre-deployment API testing
Testing APIs before they go into production is one important step toward avoiding issues that will impact your customers. Although testing can’t guarantee that you’ll identify and address all problems before deployment, validating the compatibility of both internal and third-party APIs with your applications under test conditions will significantly increase your chances of catching performance issues early.
Toward that end, take advantage of synthetic API interactions to gather data about API performance within test environments. And, when possible, test your releases against the actual APIs to get a more accurate picture of performance and reliability.
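One simple form of compatibility check is to confirm that an API response still contains the fields your application relies on. The sketch below is an assumption about how such a check might look, not a full contract-testing tool. The function name and the example field names are hypothetical.

```python
def missing_or_mistyped(payload: dict, expected: dict[str, type]) -> list[str]:
    """Return one human-readable problem per field the application relies on."""
    problems = []
    for field, expected_type in expected.items():
        if field not in payload:
            problems.append(f"missing field: {field}")
        elif not isinstance(payload[field], expected_type):
            problems.append(
                f"{field}: expected {expected_type.__name__}, "
                f"got {type(payload[field]).__name__}"
            )
    return problems


# Fields a hypothetical storefront expects from a product description API.
expected_fields = {"sku": str, "price": float, "in_stock": bool}
sample_response = {"sku": "A1", "price": "9.99"}
problems = missing_or_mistyped(sample_response, expected_fields)
```

Running a check like this in a test environment, against both stubbed and actual responses, can surface a breaking change before it reaches customers.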
Focus on business-relevant API metrics
In both pre- and post-deployment API monitoring, focus on collecting actionable data, like the response rates of different APIs and the number of users or requests impacted by an API problem. Business-centric metrics like these will help you triage API performance issues in order to respond to the most pressing ones first.
Here again, synthetic API monitoring plays a critical role in generating data that you may not be able to collect from real-user interactions — or that you can’t collect early enough in the software delivery lifecycle to find and fix issues before they impact end users.
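As a sketch of what impact-based triage could look like, the snippet below orders open API issues by the number of users affected, with failure rate as a tie-breaker. The `ApiIssue` fields and the ordering rule are illustrative assumptions. Your own priorities (revenue impact, for instance) may call for a different key.

```python
from dataclasses import dataclass


@dataclass
class ApiIssue:
    """Hypothetical summary of one ongoing API problem."""
    api: str
    failed_requests: int
    total_requests: int
    affected_users: int

    @property
    def failure_rate(self) -> float:
        if self.total_requests == 0:
            return 0.0
        return self.failed_requests / self.total_requests


def triage(issues: list[ApiIssue]) -> list[ApiIssue]:
    """Most affected users first; ties are broken by higher failure rate."""
    return sorted(
        issues,
        key=lambda issue: (issue.affected_users, issue.failure_rate),
        reverse=True,
    )
```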
Map API relationships
Ensure that your monitoring tools and workflows can understand the relationships between APIs within your environment. Know which services depend on which APIs, and how requests can flow between one API and another. This mapping is crucial for pinpointing the source of an API performance problem quickly.
Then, once you think you’ve mapped the relationships, test API interactions to confirm that services actually integrate and interact in the ways your mappings suggest.
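If you keep the mapping as data, answering "what breaks if this API fails?" becomes a graph traversal. Here is a minimal sketch, assuming the map is a dictionary from each component to the set of things it depends on. The component names in the usage example are hypothetical.

```python
from collections import deque


def impacted_by(failed: str, depends_on: dict[str, set[str]]) -> set[str]:
    """Return everything that directly or indirectly depends on `failed`."""
    # Invert the map: for each dependency, which components rely on it?
    dependents: dict[str, set[str]] = {}
    for component, deps in depends_on.items():
        for dep in deps:
            dependents.setdefault(dep, set()).add(component)

    # Breadth-first walk upward from the failed component.
    impacted: set[str] = set()
    queue = deque([failed])
    while queue:
        current = queue.popleft()
        for parent in dependents.get(current, ()):
            if parent not in impacted:
                impacted.add(parent)
                queue.append(parent)
    impacted.discard(failed)  # only relevant if the map contains a cycle
    return impacted


dependency_map = {
    "storefront": {"search-api", "description-api"},
    "checkout": {"availability-api", "payment-api"},
    "availability-api": {"inventory-api"},
}
blast_radius = impacted_by("inventory-api", dependency_map)
```

In this example, a failure in `inventory-api` reaches `checkout` only indirectly, through `availability-api`. Indirect links like this one are easy to miss without an explicit map.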
Correlate API monitoring data
The data you collect about APIs is just one piece of the observability pie. It’s only by contextualizing that data with other information from your environment, like application metrics and log analytics, that you can isolate API issues from other types of performance problems and understand how API problems impact your applications and users.
In other words, don’t analyze API data in a silo; integrate it into your end-to-end observability workflow.
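A very simple form of correlation is to line up application errors against the periods when an API was known to be failing. The sketch below assumes both are available as timestamps in seconds. The function name and the data shapes are hypothetical, and a real observability platform would correlate on much richer context, such as trace IDs.

```python
def split_errors_by_api_outage(
    error_times: list[float],
    outage_windows: list[tuple[float, float]],
) -> tuple[list[float], list[float]]:
    """Partition app error timestamps by whether an API outage was in progress.

    Returns (errors during an outage, errors outside any outage).
    """
    during, outside = [], []
    for t in error_times:
        if any(start <= t <= end for start, end in outage_windows):
            during.append(t)
        else:
            outside.append(t)
    return during, outside
```

Errors that fall outside every outage window are a hint that the problem may lie in the application itself rather than in the API. Telling these two cases apart is the point of correlation.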
Performance management requires API monitoring
No performance management strategy is complete if it doesn’t address APIs. Although API monitoring can be challenging, it’s a crucial step for optimizing customer experiences in today’s API-focused world.
With the end-to-end toolchain of Splunk Observability, you can continuously monitor and correlate all your first- and third-party APIs to verify availability, performance, functionality and data quality.
The original version of this blog was published by Billy Hoffman. This posting does not necessarily represent Splunk's position, strategies or opinion.