Measuring & Improving Observability-as-a-Service (OaaS) with KPIs and OKRs

Welcome to the third blog of the Observability Center of Excellence (O11y CoE) series! If you’ve been following along, we’ve discussed the why behind an O11y CoE, and we explored how to assemble and structure the team to make it a reality.

Now, we’re ready to dive deeper into one of the CoE’s critical functions: defining and measuring Observability as a Service (OaaS).

In the context of an Observability CoE, OaaS is the operating model for delivering observability capabilities to the organization. Much like other "as a service" models, OaaS focuses on providing observability as a scalable, measurable, and value-driven practice that supports teams across the business.

To determine its effectiveness, it must be instrumented — just like the systems it aims to monitor.

Is your observability practice positioned to help teams resolve incidents faster, reduce downtime, and optimize performance? Defining some base KPIs early in your journey not only helps the CoE answer these questions but also enables it to leverage data to understand what’s working (and what’s not).

These KPIs provide visibility into the CoE’s value, empowering it to continuously refine and improve its delivery of observability services. In this blog, we’ll explore:

By the end, you’ll have the tools and insights to ensure your Observability CoE is delivering measurable value through OaaS, setting the stage for future enhancements like maturity assessments and tactical implementations.

KPIs vs. OKRs: understanding the difference

A fellow Splunker created a great article on KPIs, OKRs, and metrics, breaking down their distinctions and how they complement each other. The gist is simple:

Key performance indicators (KPIs) are like the operational pulse of your observability practice. They answer questions like, “What’s happening right now?” and “What trends have emerged over time?”

These indicators provide a near-real-time and historical view into the health of your OaaS, helping you identify trends, measure effectiveness, and take action.

Objectives and Key Results (OKRs) are about where you want to go. They combine a clear objective (the goal) with measurable results to ensure progress.

While KPIs tell you what’s happening, OKRs drive strategic alignment and improvements.

How OKRs and KPIs work together for observability

Imagine your Observability CoE tracks a KPI called Agent Saturation, which measures the percentage of available resources instrumented with observability agents. This KPI shows how comprehensively your environment is covered.

The KPI tells you: "We currently have 75% saturation across Tier 0 and Tier 1 applications." In response to this, the related OKR might be:

In this case, the KPI provides the current state and historical context, while the OKR establishes the target state and timeframe for improvement. Together, they ensure the CoE can monitor progress while driving a strategic outcome.
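To make the Agent Saturation calculation concrete, here is a minimal Python sketch. It assumes you can export an inventory of resources with a criticality tier and a flag for whether an agent is deployed; the `Resource` type, its field names, and the sample inventory are hypothetical and would need to match whatever your CMDB or tooling actually provides.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Resource:
    """One instrumentable resource (host, container, service) in the inventory."""
    name: str
    tier: int        # criticality tier; 0 is the most critical (assumed convention)
    has_agent: bool  # True if an observability agent is deployed


def agent_saturation(resources: list[Resource], tiers: set[int]) -> float:
    """Return the percentage of in-scope resources that have an agent."""
    in_scope = [r for r in resources if r.tier in tiers]
    if not in_scope:
        return 0.0  # nothing in scope, so avoid dividing by zero
    instrumented = sum(1 for r in in_scope if r.has_agent)
    return 100.0 * instrumented / len(in_scope)


# Hypothetical inventory: four Tier 0/1 resources, three of them instrumented.
inventory = [
    Resource("checkout-api", 0, True),
    Resource("payments-db", 0, True),
    Resource("search-svc", 1, True),
    Resource("batch-reports", 1, False),
    Resource("internal-wiki", 2, False),  # Tier 2 is out of scope for this KPI
]

saturation = agent_saturation(inventory, tiers={0, 1})
print(f"Agent saturation (Tier 0 and Tier 1): {saturation:.0f}%")
```

Scoping the calculation by tier keeps the KPI aligned with how the statement above is phrased: the denominator is only the resources you have committed to covering, not everything in the environment.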

Why both matter

KPIs and OKRs complement each other by ensuring your OaaS practice is operationally effective and strategically aligned:

Together, they create a feedback loop: KPIs inform how close you are to achieving OKRs, while OKRs ensure you’re focusing on initiatives that deliver meaningful value. By distinguishing between KPIs and OKRs, your Observability CoE can build a framework that:

What makes a good KPI?

Any service offering thrives on actionable, meaningful, and relevant KPIs that provide insights into what’s working — and what isn’t. A well-chosen KPI doesn’t just measure performance; it also drives continuous service improvement and supports broader objectives, such as enabling the Observability CoE (O11y CoE) to achieve its OKRs.

(Learn more about KPI management, including how to identify impactful KPIs, avoid common mistakes, and set up KPI management frameworks.)

Common pitfalls to avoid

Defining KPIs is as much about knowing what to avoid as it is about selecting the right metrics. Some common pitfalls include:

The role of the O11y CoE in KPI success

The Observability CoE is central to ensuring success with both KPIs and OKRs. By defining actionable KPIs early and aligning them with clear OKRs, the CoE can:

Defining KPIs isn't just about tracking progress; it's about laying the foundation for a successful Observability-as-a-Service (OaaS) model.

By explicitly integrating OKRs, your O11y CoE gains the ability to continuously adapt, refine, and enhance its value proposition. This alignment ensures that observability practices deliver value to the business iteratively and continuously, keeping the organization responsive and competitive.

Categories of observability KPIs

When identifying KPIs for your Observability CoE, it’s useful to group them into categories based on their focus and purpose. To quickly recap, OaaS KPIs should help assess whether your OaaS operating model is effectively delivering, or is positioned to deliver, observability capabilities to the organization.

Organizing KPIs into these categories ensures your measurements are actionable and aligned with the outcomes your Observability as a Service (OaaS) practice strives to achieve.

Later in this blog, I’ll provide specific examples of O11y KPIs, including their descriptions, purposes, calculations, potential data sources, and which category they fall under. For now, let’s explore the core KPI categories:

1. Availability

Focus: Ensuring observability tools and platforms are operational and accessible.

This type of KPI tracks the reliability of your observability ecosystem, helping you answer questions like:

2. Utilization

Focus: Monitoring the deployment and use of observability tools and resources.

Utilization KPIs measure things like license usage, tool versioning, and deployment coverage, ensuring you’re getting the most out of your investments. Key questions include:

3. Adoption

Focus: Measuring engagement with observability tools and practices across teams and environments. Adoption KPIs cover two key dimensions:

4. Optimization

Focus: Enhancing efficiency and reducing noise.

Optimization KPIs evaluate how well your observability practice reduces unnecessary alerts, improves workflows, and minimizes manual effort. These KPIs tackle questions like:

By organizing KPIs into these types, you can align your measurements with the strategic goals of your CoE and your organization.

Examples of KPIs for observability

Now, let's take a look at some specific examples of OaaS KPIs, explaining their purpose, how to calculate them, and some practical “pro-tips” based on my experience.

Taking the next steps

Now that you’ve explored the critical role KPIs play in defining and measuring Observability as a Service (OaaS), it’s time to put these ideas into action. Here's your call to action:

Start collecting metrics

Begin gathering data for the KPIs we’ve discussed, even if it’s as simple as plugging them into a spreadsheet. This initial step will help your tools administration teams to:

  1. Understand the type of information you’ll be requesting.
  2. Think of systematic, programmatic ways to retrieve this data using APIs, automated reports, or other integrations.
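If a spreadsheet is your starting point, the collection step can still be scripted. The sketch below appends dated KPI readings to a CSV file that any spreadsheet tool can open. The column layout, the `record_kpi` and `read_kpis` helpers, the sample values, and the category assigned to each KPI are all assumptions for illustration; in practice the values would come from your tools' APIs or reports rather than being typed in.

```python
import csv
import tempfile
from datetime import date
from pathlib import Path
from typing import Optional

# Assumed column layout for the KPI log; adjust to your own reporting needs.
FIELDS = ["date", "kpi", "category", "value", "unit"]


def record_kpi(path: Path, kpi: str, category: str, value: float,
               unit: str, day: Optional[date] = None) -> None:
    """Append one KPI reading to a CSV file, writing the header on first use."""
    is_new = not path.exists()
    with path.open("a", newline="") as fh:
        writer = csv.DictWriter(fh, fieldnames=FIELDS)
        if is_new:
            writer.writeheader()
        writer.writerow({
            "date": (day or date.today()).isoformat(),
            "kpi": kpi,
            "category": category,
            "value": value,
            "unit": unit,
        })


def read_kpis(path: Path) -> list[dict]:
    """Load every recorded reading back as a list of dictionaries."""
    with path.open(newline="") as fh:
        return list(csv.DictReader(fh))


# Demo in a temporary directory so the example leaves no files behind.
with tempfile.TemporaryDirectory() as tmp:
    log = Path(tmp) / "oaas_kpis.csv"
    record_kpi(log, "Agent Saturation", "Utilization", 75.0, "%", date(2025, 1, 31))
    record_kpi(log, "Agent Saturation", "Utilization", 78.0, "%", date(2025, 2, 28))
    for row in read_kpis(log):
        print(row["date"], row["kpi"], row["value"] + row["unit"])
```

Keeping one row per reading, rather than one column per month, makes it easier to chart trends later and to move the data into a proper metrics store once the process matures.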

Set your first CoE OKR

Make your initial objective simple and actionable. For example:

Leverage metrics in executive updates

Use the outcomes from this exercise to enhance your Observability CoE’s monthly updates with your executive champion. Highlight early wins, gaps, and actionable insights to build momentum and alignment.

Create achievable goals based on data

Once you’ve established baseline data, use it to define meaningful and attainable goals. For example:
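One way to keep a data-driven goal honest is to track how much of the gap between the baseline and the target has been closed. The following is a rough sketch, not a prescribed method: the `key_result_progress` function and all of the numbers are hypothetical, with the 75% baseline borrowed from the Agent Saturation example earlier in this post.

```python
def key_result_progress(baseline: float, current: float, target: float) -> float:
    """Return the fraction (0.0 to 1.0) of the baseline-to-target gap closed so far.

    Works for KPIs where higher is better (target > baseline) and for KPIs
    where lower is better (target < baseline).
    """
    if target == baseline:
        raise ValueError("target must differ from baseline")
    progress = (current - baseline) / (target - baseline)
    # Clamp so that regressions read as 0% and overshoots read as 100%.
    return max(0.0, min(1.0, progress))


# Hypothetical goal: raise agent saturation from a 75% baseline to a 90% target.
saturation_progress = key_result_progress(baseline=75.0, current=84.0, target=90.0)
print(f"Saturation goal progress: {saturation_progress:.0%}")

# Hypothetical lower-is-better goal: cut weekly noisy alerts from 40 to 20.
noise_progress = key_result_progress(baseline=40.0, current=30.0, target=20.0)
print(f"Alert noise goal progress: {noise_progress:.0%}")
```

Measuring progress against the gap, rather than against the raw KPI value, keeps a goal that starts from a high baseline from looking nearly complete on day one.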

Stay tuned for what’s next

In upcoming blogs, we’ll explore deeper aspects of creating a leading observability practice, including tools inventory, rationalization, and strategies for streamlining your observability ecosystem.

Observability resources, from experts

If you’re passionate about learning about observability, I’d encourage you to:
