Splunk Observability at Cisco Live: Agentic Observability for the AI Era

Observability Cale Hilts

Key takeaways

  1. Splunk is enhancing incident resolution speed by introducing powerful AI-driven tools, including AI SRE and AI Canvas, to streamline how teams detect and investigate complex issues.
  2. The platform is now extending its observability reach across the entire AI stack, ensuring that organizations can monitor the health and reliability of AI agents, infrastructure, and Kubernetes workloads as they scale to production.
  3. By unifying a broader range of signals across logs, traces, and network performance, the updated Splunk Observability Cloud helps teams effectively connect technical system health to critical customer experiences and business outcomes.

Observability has always been about seeing clearly under pressure. But the pressure has changed.

Applications are more distributed. Kubernetes environments keep expanding. Digital experiences depend on services, APIs, networks, third-party providers, and now AI models and agents that can make decisions faster than a human team can review every signal. At the same time, operations teams are still being asked to reduce downtime, control cost, protect customer trust, and explain the business impact of every incident.

That is why this next chapter of observability needs more than another dashboard. It needs intelligence that can reason across signals, understand service context, help teams act faster, and make complex systems feel a little less like a storm cloud.

At Cisco Live, we’re announcing new Splunk Observability innovations designed for that shift: Agentic Observability. The goal is simple to say, harder to build, and critical for modern operations teams: fix and prevent issues with AI agents, observe the entire AI stack, and tie every signal to real business impact.

These announcements build on Splunk’s role as the intelligence layer for trusted agentic operations across the enterprise. That means moving from “something is broken” to “here’s what happened, why it matters, and what to do next.” Faster triage. Better context. Less guesswork.

Splunk Observability is helping close that gap with innovations across three connected areas:

AI-Powered Observability: Using AI SRE agents and AI-assisted workflows to detect, investigate, summarize, recommend, and help resolve issues faster.

Observability for AI: Monitoring AI agents, AI infrastructure, Kubernetes inference workloads, and critical AI environments so teams can understand how AI systems behave in production.

Unified Observability: Connecting metrics, events, logs, traces, network insights, application security signals, and business context so teams can see what matters and act with confidence.

Because the future of observability is not just more telemetry. It is telemetry with judgment. Context with action. And AI that helps teams move from response to resilience.

Here’s what’s new across Splunk Observability at Cisco Live.

AI-Powered Observability

AI SRE in Observability Cloud (GA June 2026) - Agentic AI for Detection, Troubleshooting, & Remediation

The rise of generative AI and large language models (LLMs) has fundamentally changed the observability game. We are moving away from passive monitoring—where you wait for a dashboard to turn red—to proactive, agentic observability. Learn how the AI SRE acts as your new teammate, automatically detecting issues, finding probable root cause, building a plan, and providing a step-by-step guide on how to get everything back up and running.

AI Canvas in Splunk (ITSI Capabilities - Alpha)

When an incident hits, can your team start investigating right away, or do they spend the first few minutes gathering context? With Cisco AI Canvas in Splunk IT Service Intelligence, teams can open an ITSI episode in AI Canvas and start with the full picture. Services, key performance indicators, entities, alerts, similar episodes, and linked tickets come through automatically, along with prompts shaped to that incident. No copy and paste. Less switching between tools. For ITSI users, AI Canvas speeds incident investigation and root cause analysis. For teams exploring AI Canvas, ITSI adds the service context, event correlation, and business impact needed to turn alert noise into action.

ITSI Event iQ Detect: User Feedback-Based Learning for Smarter Correlation, Shaped by Your Team

Event iQ Detect already uses machine learning to identify the right fields for alert correlation. Now it can learn from your analysts, too. When responders split an episode that grouped unrelated alerts, or merge episodes that should have been one incident, that feedback is captured directly in Episode Review and fed back into the Event iQ model during retraining. Over time, alert grouping gets more accurate without the usual manual tuning. Admins stay in control with retraining schedules, approval settings, and auto training options in policy configuration. The result is practical and immediate: clearer episode titles, better correlation summaries, fewer grouping mistakes, and event correlation that improves with use, even at high volume.

User feedback-based learning will be available in June ’26 as part of the ITSI 5.0 release.
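To make the feedback loop above more concrete, here is a minimal sketch of one way split and merge actions could be expressed as pairwise constraints and used to score a candidate alert grouping. This is an illustration only and not how Event iQ is implemented: the feedback record shape and the function names `feedback_to_constraints` and `constraint_agreement` are assumptions made for this example.

```python
from itertools import combinations


def feedback_to_constraints(feedback):
    """Turn analyst split/merge actions into pairwise alert constraints.

    Each feedback item is a hypothetical dict (not an ITSI data format):
      {"action": "merge", "groups": [[alert ids], [alert ids], ...]}
      {"action": "split", "groups": [[alert ids], [alert ids], ...]}
    """
    must_link, cannot_link = set(), set()
    for item in feedback:
        groups = item["groups"]
        if item["action"] == "merge":
            # Every alert across the merged episodes belongs together.
            merged = sorted(a for g in groups for a in g)
            for a, b in combinations(merged, 2):
                must_link.add((a, b))
        elif item["action"] == "split":
            # Alerts that ended up in different parts of a split stay apart.
            for g1, g2 in combinations(groups, 2):
                for a in g1:
                    for b in g2:
                        cannot_link.add(tuple(sorted((a, b))))
    return must_link, cannot_link


def constraint_agreement(grouping, must_link, cannot_link):
    """Fraction of constraints that a candidate grouping satisfies.

    `grouping` maps alert id -> episode id and must cover every alert
    that appears in a constraint.
    """
    total = len(must_link) + len(cannot_link)
    if total == 0:
        return 1.0
    ok = sum(grouping[a] == grouping[b] for a, b in must_link)
    ok += sum(grouping[a] != grouping[b] for a, b in cannot_link)
    return ok / total
```

A retraining job could compare this agreement score before and after a model update as one rough check that analyst feedback is being respected, which is the kind of gate an approval setting might rely on.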

ITSI Event iQ Diagnose: Resolve Incidents Faster With AI-Driven Next Steps Tailored to Every Root Cause

How many clicks does it take to understand an incident today? For many teams, it means bouncing through notable events, KPI trends, service health, and entity details before the picture comes into focus. Event iQ Diagnose changes that. It uses a large language model to generate a plain language summary of an ITSI episode, pulling together what happened, when it started, the key contributing events, the likely root cause, and recommended next steps in a single view.

The summary appears in the Impact tab of Episode Review and can be shared with teammates or sent into tools like ServiceNow. Event iQ Detect groups the alerts. Event iQ Diagnose explains the episode. That makes incident triage faster for Level 1 analysts, improves escalations during high priority incidents, and gives senior engineers more time to focus on resolution instead of reconstruction.

Event iQ Diagnose will be available in June ’26 as part of the ITSI 5.0 release.

Observability for AI

Splunk Agent Observability

Cisco completed its acquisition of Galileo. Building on Galileo’s out-of-the-box, low-latency, cost-effective evaluations and real-time guardrails, Splunk Agent Observability evaluates, observes, and protects AI agents in real time throughout their development lifecycle. It extends the AI observability capabilities already available in Splunk Observability to minimize inaccuracies, block harmful outputs, and control costs. Read the blog to learn more.

Unified Observability

Observability Cloud Free Edition (GA)

The Splunk Observability Cloud Free Edition provides unrestricted access to enterprise-grade features without the pressure of traditional 14-day trial clocks or procurement cycles. By limiting the tier to 15 hosts rather than gating specific functionalities, Splunk empowers developers, startups, and AI engineers to build deep integrations and meaningful dashboards that remain consistent as their projects scale. Ultimately, this model transforms observability from a luxury into a foundational architectural element, allowing users to get deep insights from day one. Get Splunk Observability Cloud for free.

ThousandEyes Network Insights in Synthetic Monitoring (Alpha)

End-user experience is critical yet fragile, with potential failure points spanning the frontend, backend, and network layers. But when application and network telemetry remain siloed, teams often face longer MTTR and inefficient finger-pointing during troubleshooting.

We’re introducing ThousandEyes Network Insights in Synthetic Monitoring, a new integration between Synthetic Monitoring in Splunk Observability Cloud and ThousandEyes that helps teams monitor user experience across both application and network layers.

With this integration, teams can create and manage Splunk (app-layer) and ThousandEyes (network-layer) tests in one place, with global visibility from 1,000+ cloud agents across 271 cities in 69 countries. Teams can quickly isolate whether issues originate from the application or network layer, then drill into detailed application telemetry or pivot to deep network diagnostics in ThousandEyes, including DNS resolution, BGP routing, and hop-by-hop path analysis, to troubleshoot issues faster.

The integration will enter Alpha in July. The initial Alpha release primarily supports ThousandEyes HTTP tests, with additional test types to follow. Sign up here.
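As a rough illustration of the application-versus-network isolation described above, the sketch below classifies a paired test result by layer. It does not use the Splunk or ThousandEyes APIs; the `PairedResult` fields, the `likely_layer` function, and the default thresholds are all hypothetical values chosen for this example.

```python
from dataclasses import dataclass


@dataclass
class PairedResult:
    """One app-layer check paired with one network-layer test (hypothetical shape)."""
    http_status: int        # app-layer synthetic check status code
    response_ms: float      # app-layer response time in milliseconds
    packet_loss_pct: float  # network-layer packet loss along the path
    dns_resolved: bool      # network-layer DNS resolution outcome


def likely_layer(result, max_response_ms=2000.0, max_loss_pct=1.0):
    """Return 'network', 'application', or 'healthy' for a paired result.

    Network problems are checked first because an unhealthy path often
    surfaces as slow or failing application responses as well.
    """
    network_bad = (not result.dns_resolved) or result.packet_loss_pct > max_loss_pct
    app_bad = result.http_status >= 500 or result.response_ms > max_response_ms
    if network_bad:
        return "network"
    if app_bad:
        return "application"
    return "healthy"
```

For example, a 503 response with no packet loss and working DNS points at the application, while the same 503 with heavy packet loss points at the network first.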

Runtime Application Attack Detection in Observability Cloud

AI is helping teams write software faster, but it is also creating more security flaws, giving attackers new opportunities to exploit vulnerable code. Splunk Secure Application adds security to existing observability tools. With Secure Application on Splunk Observability Cloud you can proactively prioritize application threats and vulnerabilities based on critical business context. That runtime application threat data is connected directly into Splunk Enterprise Security, unifying application and security teams, drastically reducing alert noise, and keeping critical services safer without slowing innovation.

Runtime application attack detection in Observability Cloud is now generally available. To learn more, check out the blog, From Blind Spots to Active Defense: Securing Code in Mythos Era.

Log-Based Charting and Alerting in Observability Cloud (GA July)

By integrating log-based charting and alerting directly into Splunk Observability Cloud, organizations can eliminate the operational friction caused by fragmented tools and "swivel-chair" workflows that previously separated logs from metrics and traces. Teams can visualize log data and manage search-based alerts within a single interface, which reduces tool sprawl and alert fatigue. By drawing on existing SPL expertise and AI-driven assistance, customers gain deeper visibility, shorten detection and response times (MTTD/MTTR), and correlate MELT (metrics, events, logs, and traces) data more effectively.

Log-based charting and alerting in Splunk Observability Cloud is generally available. You can read more about it in our docs here.

What's Next

Observability is entering a new phase, where seeing what happened is only the starting point. The bigger opportunity is acting sooner, with clearer context, stronger guidance, and more confidence across every layer of the digital stack.

The latest Splunk Observability innovations bring that future closer: AI that helps detect, investigate, summarize, and recommend; visibility built for AI agents and infrastructure; and unified insights that connect technical signals to business impact. Less noise. Faster answers. Better digital experiences.

Start a free trial or book a demo today to see how Splunk Observability can help you move faster, troubleshoot smarter, and build a foundation for trusted agentic operations sooner.

Many of the products and features described herein remain in varying stages of development and will be offered on a when-and-if-available basis. The delivery timeline of these products and features is subject to change at the sole discretion of Cisco, and Cisco will have no liability for delay in the delivery or failure to deliver any of the products or features set forth in this document.
