The Observability Center of Excellence, Part II: From Pitch to Formation
Welcome back to our Observability Center of Excellence (CoE) series! In this second article, we’ll dive into the steps to help you form your very own Observability CoE team.
(Haven’t had a chance to check out the first blog? All good... Please take a quick look at "An introduction to the Observability CoE," as I will reference its content throughout this article.)
TLDR recap: The Observability CoE positions you to build a leading observability practice, addressing the challenges and impacts of legacy monitoring practices of the past. This table summarizes what the Observability CoE helps with:
- Tools fragmentation
- Low confidence in alerting systems
- Reactive monitoring, reactive response
- Focus on tech domain, not service health
- No measurement framework
- Increased costs
- Reputation damage
- Prolonged outage calls
- Tools sprawl and visibility inconsistencies
- Business & IT blind spots
Still with me? Great! Now that you're on board, it's time to generate some excitement within your organization and build your CoE team. In this article, I'll give you the tools to:
- Spark that enthusiasm.
- Secure executive buy-in.
- Start recruiting the right people to make this happen.
Let's get started!
Your internal pitch for a CoE
Here’s a quick rundown of what’s coming next: We’re about to dive into some key concepts you’ll need to grasp in order to get buy-in and start building your Observability CoE.
At the end of this process, you’ll have a deck ready to deliver to the decision makers to make sure that you can get what you need to build out a CoE.
Introduction to the deck
To kickstart the process, I’ve included a Sample Presentation that covers each concept we’ll discuss. Feel free to apply your favorite theme, make tweaks, and tailor it to align with your organization's vibe.
Before pitching and securing buy-in for your Observability CoE, make sure you first understand the key concepts covered in this article, as they’re what I used to build the deck. These are the building blocks for gaining executive support and forming your team. As you review the content, remember to:
- Focus on how each section aligns with your organization’s specific goals or pain points.
- Relate the information to your business needs. This will help you present in a way that resonates with leadership, aligns with strategic initiatives, and sets a strong foundation for your CoE.
Breaking down the pitch: Key sections to include
Introduction of the Observability CoE
The previous blog post, "An Introduction to the Observability CoE," covers the challenges of legacy monitoring in depth, and the TLDR in the intro above recaps them.
It’s important that you understand the common challenges organizations face with legacy monitoring practices, the resulting impacts on business operations, and how the Observability CoE addresses these issues.
Better yet, think about examples where some of these challenges and impacts resonate with your business. For example:
- Do you trust the alerts in your environment? How many alert emails hit your inbox each day?
- Have you had instances where customers reported issues before your team even knew about them?
Tell stories about specific problems that your business has suffered because of insufficient observability. Use data on the customer and employee impact, if you have it.
By asking these questions and connecting them to your organization’s specific pain points, you make the need clear right away before you start to ask for resources.
Demystifying observability (as needed)
Observability can feel like an overloaded and misunderstood IT buzzword, but it’s essential to level-set and break it down into bite-sized, vendor-agnostic terms.
In this section, you want to clear up any misunderstandings. Give your organization a balanced understanding of observability, one that will resonate with both executives and the technical leaders who will form your CoE. Here are a few topics you may want to run by them:
- Observability vs. monitoring: Monitoring tells you when something is wrong, but observability gives you the context needed to understand why things are breaking down, enabling faster resolution and proactive insights.
- Pillars of observability (MELT): The core pillars — metrics, events, logs, and traces — are the building blocks of any observability practice. Each provides a different perspective into your infrastructure, applications, and business services, helping to paint the complete picture.
- Critical capabilities: These were already covered in depth in the first blog, including infrastructure monitoring, APM, DEM, centralized log management, AIOps, and event management. Each capability plays a role in ensuring full-stack visibility.
- The observability practice: The organization, utilization, and ultimate value of these capabilities form your organization's observability practice. The Observability CoE should be positioned to ensure that these pieces are optimized, standardized, and governed — ensuring a structured approach that drives business value.
The dream team: Roll call
Next, let’s talk about the key people who need to be involved in your Observability CoE. Below is an overview of the various roles to consider, along with common titles and some "pro-tips" for each role as you build the CoE.
Roles to include in the CoE. As you assemble this team, remember to keep it lean (a "two-pizza team"), focused, and agile, with just enough members to accomplish your objectives. This team should include a strong mix of IT resources, such as:
- Support, IT operations, SRE/DevOps, application development
- Those responsible for critical IT processes
Keep in mind, team composition may vary depending on the type of organization you’re in. For example, a shared services model, business unit IT, or more advanced SRE setups may require different mixes of skill sets and roles within the CoE.
Balancing existing work with CoE. It’s important to strike a balance between CoE work and team members’ existing responsibilities. One way to achieve this is by attaching CoE work to high-priority initiatives already underway in the organization.
- Early on, as you form your CoE, the focus will be on clarifying priorities, setting scope, and ensuring that observability inputs/outputs are positioned to be ingrained in related and dependent processes.
- Over time, the CoE will transition into delivering iterative value, driving your observability strategy forward while balancing the team’s day-to-day roles.
Here’s a table summarizing who should be involved and their responsibilities:
Sample CoE focus areas & key initiatives
Once your Observability CoE gets rolling, the value it can bring to your organization is practically limitless. Below are some sample initiatives the CoE might take on, all tied to important areas: governance, standards, tools, business alignment, and measurable impact.
The goal of this section is to give you practical examples of things that both your executive sponsor and CoE team members could agree are currently missing and need to be fixed. This is the art of the possible, laying out what can be achieved through focused CoE initiatives.
These examples are just the start. Grab the ones that resonate with your team, or better yet, make this an early win for the CoE: review the options and pick what fits your organization. And stay tuned, because in future posts, we’ll dive into the nuts and bolts of how to execute on some of these core use cases. Be sure to ask those you’re pitching to whether or not they can think of any additional initiatives.
Some of these may sound buzzwordy. Make sure you use the right language for your audience and your company. If you want to learn more about these topics, I’ll be writing about them in greater detail later.
- Develop repeatable standards and baselines
- Observability-as-a-service request and fulfillment offerings
- Develop a repeatable observability framework for new and legacy workloads
- Create a tiered observability offering
- Identify Observability as a Service offering
- Connect observability with critical IT initiatives
- Leverage observability vendors
- Educate the organization on observability
- Shift observability left in the SDLC
- Measure observability KPIs: agent saturation, alert to incident ratio, tools license utilization, etc.
- Assess organizational observability maturity
- Create and share a quarterly “State of Observability” business report
- Tools utilization/saturation
- Tools audit
- Drive tool adoption
- Reduce tool sprawl
- Champion observability solution POCs
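To make the KPI bullet above a little more concrete, here is a minimal Python sketch of how two of the named KPIs might be computed. The definitions are assumptions for illustration, since this article does not define them: alert-to-incident ratio is taken as alerts fired divided by incidents opened, and license utilization as licenses in use divided by licenses purchased. The function names and the monthly figures are hypothetical.

```python
def alert_to_incident_ratio(alerts_fired: int, incidents_opened: int) -> float:
    """Alerts fired per incident opened (assumed definition)."""
    if incidents_opened == 0:
        # No incidents: the ratio is unbounded if any alerts fired at all.
        return float("inf") if alerts_fired else 0.0
    return alerts_fired / incidents_opened


def license_utilization(in_use: int, purchased: int) -> float:
    """Fraction of purchased licenses actually in use (assumed definition)."""
    if purchased <= 0:
        raise ValueError("purchased must be positive")
    return in_use / purchased


# Hypothetical monthly figures, for illustration only.
ratio = alert_to_incident_ratio(alerts_fired=1200, incidents_opened=40)
utilization = license_utilization(in_use=150, purchased=200)

print(f"Alert-to-incident ratio: {ratio:.1f}")
print(f"License utilization: {utilization:.0%}")
```

A high ratio may suggest noisy alerting, and low utilization may point to unused tooling, but what counts as healthy depends on your environment, so treat any thresholds as something the CoE agrees on.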
Meeting format and cadence
Getting your Observability CoE off the ground requires commitment, but the goal is not to overload your team with additional meetings. Start small, find your rhythm, and gradually increase as the CoE begins to deliver real value.
Begin with weekly or bi-weekly meetings, keeping them around 30–45 minutes. These should focus on the essentials:
- What has been completed
- What’s next
- Anything blocking progress
A key task in the early meetings will be prioritizing the focus areas and/or initiatives (examples above) the team will tackle first. Given the list of objectives or tasks (e.g., tools audit, synthetic monitoring for critical workflows), it’s important to determine which actions deliver the most immediate value and align closely with existing business goals. This helps ensure that the CoE doesn't feel like an extra task on top of an already busy IT schedule.
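One lightweight way to run that prioritization is a simple weighted score. The sketch below is hypothetical: the initiative names come from the examples in this article, but the 1 to 5 scores, the equal weights, and the `priority_score` helper are illustrative assumptions rather than a prescribed method.

```python
from dataclasses import dataclass


@dataclass
class Initiative:
    name: str
    immediate_value: int      # 1 (low) to 5 (high), scored by the team
    business_alignment: int   # 1 (low) to 5 (high)
    effort: int               # 1 (small) to 5 (large)


def priority_score(item: Initiative, value_weight: float = 0.5,
                   alignment_weight: float = 0.5) -> float:
    """Weighted benefit divided by effort; higher means tackle sooner."""
    benefit = (value_weight * item.immediate_value
               + alignment_weight * item.business_alignment)
    return benefit / item.effort


# Hypothetical scores, for illustration only.
backlog = [
    Initiative("Tools audit", immediate_value=4, business_alignment=3, effort=2),
    Initiative("Synthetic monitoring for critical workflows", 5, 5, 4),
    Initiative("Quarterly State of Observability report", 2, 4, 3),
]

# Sort so the highest-scoring initiative comes first.
ranked = sorted(backlog, key=priority_score, reverse=True)
for item in ranked:
    print(f"{priority_score(item):.2f}  {item.name}")
```

The exact formula matters less than scoring every candidate against the same criteria, so the team can explain to the executive sponsor why one initiative comes first.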
As the CoE progresses, stay closely connected with your executive sponsor. This regular communication is crucial, as it helps maintain alignment with business priorities and keeps leadership engaged. Make sure to:
- Share your quarterly reviews with your sponsor, to ensure CoE outputs continue to be impactful and relevant.
- Celebrate wins and showcase progress.
- Adjust where necessary.
Remember, the goal is to let the CoE grow at a pace that delivers results without overwhelming the team. Start small, prioritize effectively, and stay connected to leadership. (I've included a sample CoE meeting agenda and a quarterly update slide in the sample deck for reference.)
Wrapping up: Next steps and your call to action
By now, you’ve seen just how important an Observability CoE is for your organization. The next step? Generating excitement and securing buy-in from both your executive sponsor and future CoE members. Use this blog and the reference deck as your guide to spark that momentum.
Take the deck, update it with your organization’s specific context, and grab some time on your proposed sponsor’s calendar. Secure buy-in, form your team, and start getting those meetings on the books to set your CoE in motion.
This is just the beginning. We’ll be rolling out additional content soon, with more guidance on specific CoE initiatives and how to make them a reality in your organization. Check back here for more.
Observability resources, from experts
If you’re passionate about learning about observability, I’d encourage you to:
- Check out our team's observability articles and tutorials on Splunk Community.
- Watch our Splunk Observability for Engineers video series. Check out the entire series for more tutorials, insights, and new features and capabilities.