Tracking application-level and infrastructure-level metrics is part of what it takes to deliver software successfully. These metrics provide deep visibility into application environments, allowing teams to home in on performance issues that arise from within applications or infrastructure.
What application and infrastructure metrics can’t deliver, however — at least not on their own — is breadth. They don’t provide the end-to-end visibility into the delivery pipeline that businesses need to ensure an effective and efficient software delivery process at all stages. If a delay in a CI/CD process causes a problem, application and infrastructure metrics will do little to help detect or solve it.
That’s why observability data from applications and infrastructure must be coupled with visibility into the DevOps delivery pipeline itself. It’s only by understanding the health of your CI/CD pipeline operations that you can prevent delivery delays that could harm your business.
On top of this, analyzing your delivery pipeline helps eliminate visibility silos between different stakeholders in the DevOps process, while also reducing technical debt and ensuring that DevOps processes are tied to business outcomes.
In this article, we explain why and how to add DevOps pipeline analytics to your observability strategy. We’ll discuss which types of data to collect from your DevOps pipeline, how to monitor and measure it, and where pipeline analytics fits within the bigger picture of end-to-end observability.
What are pipeline analytics?
Pipeline analytics refers to the collection and analysis of data from the DevOps software delivery pipeline, otherwise known as the CI/CD process.
The delivery pipeline is the set of workflows that teams use to develop, test and deploy iterative releases of applications. While pipelines can be implemented in different ways, they typically start with the development of new code to provide specific features or enhancements to an application.
Then, the new code is integrated into the general codebase, after which it is built and tested. As long as the release passes critical tests, it is deployed into a production environment, where data collected about its performance is fed back to the development team to guide the next round of feature improvements.
As code flows through each of these stages in the delivery pipeline, it produces metrics that teams can collect to understand how well the pipeline is operating, identify weaknesses (like bottlenecks or problematic handoffs between teams), and validate whether the timing of pipeline operations reinforces business goals (such as issuing application updates with the frequency promised to customers).
Why implement pipeline analytics?
The most obvious reason to implement pipeline analytics is to identify operational issues with the pipeline, such as delays within one stage that reduce the frequency of releases. However, the visibility that pipeline analytics enables provides more than just a better handle on operational processes.
Eliminate visibility silos
A common challenge for DevOps teams is ensuring that all stakeholders — developers, DevOps engineers, SREs and the SecOps team — have across-the-board visibility into the software delivery process. This is difficult not only because these groups don’t typically interact with each other apart from handing off a new release from one team to another, but also because the types of data that one team collects can be difficult for another to understand and operationalize. Developers are interested in data from debugging tools, for example, while SREs and IT engineers focus on metrics from production environments.
When each team collects different types of data and focuses on different goals, it can become challenging to ensure that every stakeholder understands the overall health of the delivery pipeline and is prepared for what’s coming next. IT engineers need to know about a problem that’s discovered during testing, for example, if the problem means a delay in the release cycle. Developers need to know about critical performance problems that a new release causes in production, because they may have to rush out a fix.
By collecting and measuring data about the pipeline as a whole, then sharing it with all stakeholders, businesses are in a stronger position to ensure that every team within the DevOps process has the information it needs to coordinate its activities with other stakeholders — supporting the overall success of the pipeline.
Your team is only as good as its least visible component. Eliminating visibility silos means all stakeholders can contribute equally well to the software delivery process.
Align software delivery with business objectives
It can be easy for DevOps delivery pipelines to become silos unto themselves, wherein technical teams focus on achieving technical goals purely for their own sake without tying operations to the business value stream. That’s a mistake, of course, because the ultimate goal of every pipeline should be to support business success. It doesn’t matter how quickly a team can build or deploy new releases if the business has other priorities.
By analyzing the delivery pipeline, it becomes easier to understand the key trends within it, and then to validate that those trends reinforce business goals. For instance, if the business plans to implement a major new feature that it has already begun analyzing, visibility into the pipeline helps make certain that the DevOps team is actually ready to deliver that feature according to the business’s goals.
Reduce technical debt
Understanding what is happening within the delivery pipeline is also crucial for identifying inefficient processes that cause technical debt and, in turn, waste resources and lead to delays.
Perhaps your team is running redundant tests, for example, and should eliminate some in order to speed the testing stage of the pipeline. Or perhaps to improve efficiency you could automate a handoff that is taking place manually. It’s only by achieving visibility into the pipeline that DevOps teams can find and eliminate sources of technical debt.
And it’s not only existing technical debt that pipeline analytics helps to address. By achieving visibility into your software delivery operation, you are also in a stronger position to anticipate future needs and avoid the accumulation of technical debt. When you have metrics that show what an effective delivery pipeline looks like, you have foresight into how your delivery pipeline should evolve over time, and which processes within it are at highest risk of causing technical debt.
Operationalize shift left and shift right
Shifting left — which means starting DevOps processes like testing earlier in the pipeline — has become a popular strategy for teams seeking to find and fix issues sooner. So has shifting right, which involves extending pre-deployment processes into the production stage of the pipeline in order to achieve broader coverage of testing and monitoring.
These goals are achievable only if teams understand the nature of their pipelines. You need to know how well the processes that you want to shift left or shift right are performing before you attempt to perform the actual shift. You also need to be able to track the outcome in order to measure the success of the shift.
Here again, pipeline analytics delivers crucial visibility that lays the foundation for pipeline optimization.
Implementing pipeline analytics
The idea of pipeline analytics is not radically new. Indeed, collecting pipeline data is relatively simple, and in some cases, DevOps teams have been trying to analyze their delivery pipelines for years.
Where many of them fall short, however, is in turning that data into action. They leave their data in silos: developers may know what happens in their portion of the pipeline, but they don’t understand the production stages that the IT team oversees, and vice versa. Even DevOps engineers, whose role is nominally to oversee the entire pipeline, often struggle in practice to gain visibility into production processes that IT engineers own.
To make pipeline analytics work in practice, then, it’s essential to collect data about all stages and share it with all teams. It’s equally important to correlate data from different stages in order to understand the health of the overall pipeline, not just its constituent parts.
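To illustrate what correlating data across stages can look like, here is a minimal Python sketch. It assumes each stage already reports how long it spent on each change; the `(change_id, stage, minutes)` record shape and the `slowest_stage` helper are hypothetical names chosen for this example, not part of any particular tool.

```python
from collections import defaultdict
from statistics import mean


def slowest_stage(events):
    """Find the pipeline stage with the highest average duration.

    `events` is an iterable of (change_id, stage, minutes) tuples,
    a hypothetical record shape used only for this sketch.
    """
    by_stage = defaultdict(list)
    for _change_id, stage, minutes in events:
        by_stage[stage].append(minutes)
    if not by_stage:
        raise ValueError("need at least one event")

    # Average duration per stage, across every change that passed through it.
    averages = {stage: mean(values) for stage, values in by_stage.items()}
    bottleneck = max(averages, key=averages.get)
    return bottleneck, averages


events = [
    ("c1", "build", 10), ("c1", "test", 40), ("c1", "deploy", 5),
    ("c2", "build", 12), ("c2", "test", 50), ("c2", "deploy", 7),
]
bottleneck, averages = slowest_stage(events)
print(bottleneck, averages)
```

Because the data from every stage sits in one place, any team can see where changes spend most of their time, not just the team that owns that stage.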
Pipeline metrics to collect
To make sure that your pipeline analytics process delivers actionable visibility that all DevOps stakeholders can use, you must start by collecting the right pipeline metrics. There are a variety of data sources to leverage for understanding your pipeline.
A good place to start is with your CI/CD pipeline itself, which offers a number of metrics that help to baseline delivery chain effectiveness and efficiency. While you should tailor your metrics in this context to your business goals, the four so-called DORA metrics are a good foundation:
- Deployment frequency: How often do you deploy new releases into production?
- Lead time for changes: How long does it take a committed change to reach production?
- Change failure rate: What percentage of deployments cause a failure in production that requires remediation, such as a rollback or a hotfix?
- Mean time to recover/restore: When a critical problem arises in production, how long does it take the team to identify the issue and push out a fix?
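As a rough illustration, the four metrics above can be derived from a simple log of deployments. The sketch below is one possible way to do it, not a standard implementation: the `Deployment` record and its field names are assumptions made for this example, and a "failure" is assumed to mean a deployment that caused a problem in production.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from statistics import median
from typing import Optional


@dataclass
class Deployment:
    # Hypothetical record shape; adapt the fields to your CI/CD tool's data.
    committed_at: datetime                  # when the change was committed
    deployed_at: datetime                   # when it reached production
    failed: bool = False                    # did it cause a production failure?
    restored_at: Optional[datetime] = None  # when service was restored, if it failed


def dora_metrics(deployments: list[Deployment], window_days: int) -> dict:
    """Summarize the four DORA metrics over a reporting window."""
    if not deployments or window_days <= 0:
        raise ValueError("need at least one deployment and a positive window")

    lead_times = [d.deployed_at - d.committed_at for d in deployments]
    failures = [d for d in deployments if d.failed]
    restore_times = [
        d.restored_at - d.deployed_at for d in failures if d.restored_at is not None
    ]
    return {
        "deployments_per_day": len(deployments) / window_days,
        "median_lead_time": median(lead_times),
        "change_failure_rate": len(failures) / len(deployments),
        # None when no failed deployment has been restored yet.
        "mean_time_to_restore": (
            sum(restore_times, timedelta()) / len(restore_times)
            if restore_times
            else None
        ),
    }
```

The median is used for lead time here so that a single unusually slow change does not skew the figure; a mean or a percentile would be an equally reasonable choice.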
Depending on your business priorities, you may also choose to track pipeline metrics such as:
- Downtime cost: How much does it cost the business if your pipeline stops operating for a specific period?
- Amount of unplanned work: How much time is your engineering team spending on non-development activities such as fire-fighting incidents?
- Branch aging: If your team works with branches, how long does each branch exist before it is merged back into the main line? Do you have branches or repositories that have gone stale and become a potential risk?
- Work in progress: How many unreleased changes are in your pipeline at a given time?
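Two of these metrics, branch aging and work in progress, are simple to compute once you have timestamps. The following sketch assumes you can export branch creation times and release times from your tooling; the function names and the dictionary inputs are hypothetical.

```python
from datetime import datetime, timedelta
from typing import Optional


def aging_branches(
    branch_created: dict[str, datetime], now: datetime, max_age: timedelta
) -> list[str]:
    """Return the names of unmerged branches older than max_age, oldest first."""
    old = [
        (now - created, name)
        for name, created in branch_created.items()
        if now - created > max_age
    ]
    return [name for _age, name in sorted(old, reverse=True)]


def work_in_progress(release_times: dict[str, Optional[datetime]]) -> int:
    """Count changes that do not have a release timestamp yet."""
    return sum(1 for released in release_times.values() if released is None)
```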
These are examples of metrics that provide across-the-board, holistic visibility into the health of your pipeline, while also helping your team home in on sub-optimal pipeline processes. If a low deployment frequency coincides with a high rate of change failures, for example, problematic changes are a likely cause of your slow deployment velocity and a good place to start investigating.
Or perhaps your team is performing large amounts of unplanned work, which suggests that you may need to improve your planning process and continuous feedback loop so that important changes can be planned in advance, rather than implemented on the fly.
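The relationship between deployment frequency and change failure rate mentioned above can be checked with a basic correlation coefficient. This is a minimal sketch using Pearson correlation on weekly figures; the sample numbers are invented for illustration, and a strong correlation is a prompt for investigation rather than proof of cause.

```python
from math import sqrt


def pearson(xs: list[float], ys: list[float]) -> float:
    """Pearson correlation coefficient between two equally long series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of equal length with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)


# Illustrative weekly figures, not real data.
weekly_deployments = [10, 8, 6, 4, 2]
weekly_failure_rate = [0.1, 0.2, 0.3, 0.4, 0.5]
r = pearson(weekly_deployments, weekly_failure_rate)
print(f"correlation: {r:.2f}")
```

A value close to -1 would indicate that weeks with more failed changes tend to be weeks with fewer deployments.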
Beyond CI/CD metrics
Don’t stop with metrics pulled from CI/CD operations. Any system or process that complements, runs alongside or feeds into your CI/CD pipeline is also a rich data source. For example, consider the following sources of metrics:
- Code repositories: How often is code checked in? How many developers are using each repository? How many repositories do you have in total? Metrics like these provide further visibility into your software delivery process.
- Ticketing systems: Tracking data from your ticketing system such as ticket volume, mean response time, mean resolution time and open-to-closed ticket ratio helps you understand how often incidents arise within your software delivery operation and how effective your team is in responding to them.
- Incident response: Incident response metrics such as mean time to detect, mean time to resolve and the frequency of incident escalations offer another window into the health of your delivery operation.
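As one example, the ticketing metrics named above can be computed from exported ticket timestamps. The sketch below assumes a hypothetical `Ticket` record with three timestamps; real ticketing systems expose this data under their own field names.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class Ticket:
    # Hypothetical record shape for this sketch.
    opened_at: datetime
    first_response_at: Optional[datetime] = None
    resolved_at: Optional[datetime] = None


def _mean(durations: list[timedelta]) -> Optional[timedelta]:
    """Average of a list of durations, or None if the list is empty."""
    return sum(durations, timedelta()) / len(durations) if durations else None


def ticket_metrics(tickets: list[Ticket]) -> dict:
    """Ticket volume, mean response and resolution times, and open-to-closed ratio."""
    responses = [
        t.first_response_at - t.opened_at
        for t in tickets
        if t.first_response_at is not None
    ]
    resolutions = [
        t.resolved_at - t.opened_at for t in tickets if t.resolved_at is not None
    ]
    closed = len(resolutions)
    still_open = len(tickets) - closed
    return {
        "ticket_volume": len(tickets),
        "mean_response_time": _mean(responses),
        "mean_resolution_time": _mean(resolutions),
        # None when nothing has been closed yet, to avoid dividing by zero.
        "open_to_closed_ratio": still_open / closed if closed else None,
    }
```

The same pattern applies to incident response data: replace the ticket timestamps with detection and resolution times to get mean time to detect and mean time to resolve.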
Thinking broadly about pipeline analytics by incorporating metrics and data sources like these will maximize your breadth of visibility. At the same time, it helps include as many stakeholders as possible in measurement of the software delivery process. When you analyze data from the operations of every developer who uses your repositories, every help desk staff member who troubleshoots end-user problems and every on-call engineer who responds to critical incidents, you have maximum visibility into the overall state of your software delivery operation.
Leveraging Splunk for pipeline analytics
Knowing the measures of success in DevOps is critical to ensuring that your practice sustains and scales. Succeeding at DevOps today means measuring the health not just of your applications and the infrastructure they run on, but also of the pipeline that delivers them. To do this, teams must collect, measure and correlate metrics that help convey what’s happening within each part of their delivery pipelines, as well as how individual stages of the pipeline connect to support the total software delivery process.
As an end-to-end visibility solution, Splunk Observability Cloud can help. By offering not just the depth that teams need to understand their applications, but also the breadth necessary to analyze the delivery pipeline, Splunk offers holistic observability into the entire DevOps process. Leverage Splunk to collect logs and metrics from your software development tools, visualize their output in team-level dashboards and extract actionable insights that allow engineering organizations to optimize their software development processes.
To learn more about how Splunk can help your team get business critical visibility into the delivery pipeline at all stages, check out this demo and start your free trial of Splunk Observability Cloud today.
The original version of this blog post was published by Chris Riley. This posting does not necessarily represent Splunk's position, strategies, or opinion.