RED Metrics & Monitoring: Using Rate, Errors, and Duration

Key Takeaways

  • RED monitoring focuses on three key metrics: Rate (number of requests), Errors (failed requests), and Duration (latency of requests). These metrics provide a clear and actionable view of system performance in real time.
  • It’s ideal for monitoring microservices: RED monitoring is designed for modern, distributed systems, helping teams quickly detect issues, optimize performance, and maintain reliability in cloud-native architectures.
  • RED monitoring simplifies troubleshooting and scaling: By tracking the health of services through these focused metrics, teams can prioritize improvements, reduce downtime, and ensure a better user experience.

The RED method is a streamlined approach for monitoring microservices and other request-driven applications, focusing on three critical metrics: Rate, Errors, and Duration.

Originating from the principles established by Google's "Four Golden Signals," the RED monitoring framework offers a pragmatic and user-centric perspective on service assurance and service performance.

What is the RED method of monitoring?

The RED method is a framework for instrumenting and monitoring microservices. It contrasts with the USE method (Utilization, Saturation, Errors), which typically applies to hardware and infrastructure resources such as networks and disks.

RED metrics: what to instrument, what to monitor

The RED monitoring method is tailored to enhance end-user satisfaction by focusing on three metrics. For every service, monitor the following:

Rate

Rate tracks the number of requests per second and, in certain contexts, their size, such as photo uploads in a photo hosting service. Monitoring rate is crucial in environments susceptible to failures under peak traffic. Both spikes and drops in request volume are significant: a spike can overload a service, while a sudden drop can mean that an upstream failure is keeping requests from arriving.
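As a rough illustration of rate, the sketch below buckets request arrival times into fixed windows and reports the average requests per second for each window. The function name, the 60-second window, and the sample timestamps are assumptions made for this example, not part of any particular monitoring product. Windows with no traffic are reported as 0.0 so that drops stay visible alongside spikes.

```python
from collections import Counter


def requests_per_second(timestamps, window_seconds=60):
    """Average requests/second per fixed window, keyed by window start time.

    `timestamps` are request arrival times in seconds. Empty windows between
    the first and last observed window are included with a rate of 0.0.
    """
    if window_seconds <= 0:
        raise ValueError("window_seconds must be positive")
    # Map each arrival to the index of the window it falls into.
    counts = Counter(int(ts // window_seconds) for ts in timestamps)
    if not counts:
        return {}
    first, last = min(counts), max(counts)
    return {
        index * window_seconds: counts[index] / window_seconds
        for index in range(first, last + 1)
    }


# Hypothetical arrival times: a busy first minute, a silent second minute,
# and a little traffic in the third.
arrivals = [0.5, 1.2, 1.9, 10.0, 59.9, 130.0, 135.0]
rates = requests_per_second(arrivals, window_seconds=60)

for window_start, rate in rates.items():
    print(f"window starting at {window_start}s: {rate:.3f} req/s")
```

In practice a metrics backend usually derives this from a monotonically increasing request counter rather than from raw timestamps; the arithmetic is the same.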

Errors

Errors counts the number of failed requests per second. Error rates provide insight into the reliability and quality of the service. An error is any issue that leads to an incomplete or incorrect result and needs prompt resolution.
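A minimal sketch of the errors metric follows. It assumes, purely for illustration, that requests are HTTP calls and that a status code of 500 or above counts as a failure; the `is_error` predicate is a hypothetical hook you would replace with your own definition of failure. It reports failed requests per second and the share of all requests that failed.

```python
def error_stats(status_codes, window_seconds, is_error=lambda code: code >= 500):
    """Summarize failures for the requests observed in one time window.

    Assumption: `is_error` decides what counts as a failed request. The
    default treats HTTP 5xx responses as failures and everything else as
    success, which may not match every service's definition.
    """
    if window_seconds <= 0:
        raise ValueError("window_seconds must be positive")
    total = len(status_codes)
    failed = sum(1 for code in status_codes if is_error(code))
    return {
        "errors_per_second": failed / window_seconds,
        "error_ratio": failed / total if total else 0.0,
    }


# Hypothetical status codes seen during a 10-second window.
observed = [200, 200, 503, 404, 500, 200, 200, 200, 201, 502]
stats = error_stats(observed, window_seconds=10)
print(stats)
```

Tracking the ratio next to the per-second count is a design choice: a count alone can rise simply because traffic rose, while the ratio shows whether the service is actually becoming less reliable.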

Duration

Duration records the time taken for each request. This aspect is crucial for assessing the service's responsiveness and efficiency.
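Duration is usually summarized as a distribution rather than a single average, because a few slow requests can hide behind a healthy-looking mean. The sketch below uses the nearest-rank method to compute percentiles over a hypothetical list of request latencies in milliseconds; the function name and sample values are assumptions for this example.

```python
import math


def percentile(durations_ms, pct):
    """Nearest-rank percentile: the smallest observed value such that at
    least `pct` percent of the observations are less than or equal to it."""
    if not durations_ms:
        raise ValueError("durations_ms must not be empty")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in the range (0, 100]")
    ordered = sorted(durations_ms)
    rank = math.ceil(pct * len(ordered) / 100)  # 1-based rank
    return ordered[rank - 1]


# Hypothetical latencies: mostly fast, with two slow outliers.
latencies = [12, 15, 11, 14, 13, 250, 16, 12, 13, 900]

p50 = percentile(latencies, 50)
p90 = percentile(latencies, 90)
p99 = percentile(latencies, 99)
mean = sum(latencies) / len(latencies)

print(f"mean={mean:.1f}ms p50={p50}ms p90={p90}ms p99={p99}ms")
```

Here the median stays close to the typical request while the high percentiles expose the slow tail, which is why dashboards built on RED commonly plot several percentiles side by side.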

Duration metrics capture the timing of requests, which is vital for establishing the sequence of events, particularly in complex microservices environments. This matters for both client-side and server-side interactions.

In applications involving multiple services, pinpointing issues requires understanding...

Duration generally falls into the realm of distributed tracing, like OpenTracing and OpenTelemetry. Distributed tracing tracks the path and time your requests take between and within services, and brings events into causal order.
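To make the idea of causal order concrete, here is a small sketch that arranges the spans of one trace into parent-before-child order. The `Span` class and its fields are a simplified, hypothetical model invented for this example; it is not the OpenTelemetry or OpenTracing data model or API.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Span:
    """Hypothetical, simplified span: one timed operation within a trace."""
    span_id: str
    parent_id: Optional[str]  # None marks the root span of the trace
    service: str
    start_ms: float
    end_ms: float


def causal_order(spans):
    """Return (depth, span) pairs: each parent before its children, and
    sibling spans ordered by start time."""
    children = {}
    for span in spans:
        children.setdefault(span.parent_id, []).append(span)

    ordered = []

    def visit(span, depth):
        ordered.append((depth, span))
        for child in sorted(children.get(span.span_id, []), key=lambda s: s.start_ms):
            visit(child, depth + 1)

    for root in sorted(children.get(None, []), key=lambda s: s.start_ms):
        visit(root, 0)
    return ordered


# Hypothetical spans, deliberately listed out of order.
spans = [
    Span("c", "b", "database", 30, 90),
    Span("d", "a", "payments", 100, 115),
    Span("a", None, "gateway", 0, 120),
    Span("b", "a", "orders", 10, 100),
]

trace = causal_order(spans)
for depth, span in trace:
    duration = span.end_ms - span.start_ms
    print(f"{'  ' * depth}{span.service}: {duration:.0f}ms")
```

Reading the indented output top to bottom shows which service called which and how much of the total duration each call accounts for.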

Tracking RED for infrastructure

The RED method's effectiveness lies in its ability to track these aspects, which helps identify and resolve service- or infrastructure-related problems. By providing a solid, standardized starting point, RED lets separate teams exchange clear information about concerns within the system, while still allowing expansion to cover unique needs and powering the drill-down needed for root cause analysis.

Learn more about RED monitoring in this presentation from .conf 2021.

Benefits & Limitations of RED Monitoring

So, what can RED do for you? Besides being an easy-to-remember acronym, RED tends to reduce decision fatigue when you are working out how to start observing your microservices applications. Its simplicity and clarity make the learning curve short. And it gives all teams, both operations and development, a common vocabulary for discussing issues and resolutions.

RED can be extended to cover specifics based on your unique needs and usage. And by tracking the path, duration, and success of user requests, RED can serve as a proxy for user happiness.

Limitations

Monitoring microservices

The RED method is a focused strategy for monitoring microservices and other request-driven applications, ensuring that key performance indicators align with user experience and service reliability. Its simplicity and effectiveness make it a valuable tool for modern software architectures where user satisfaction is paramount.
