Introducing New Features to Cut Alert Fatigue for Cloud Operations

We are excited to announce the latest release of SignalFx’s alerting capabilities: Alert Preview, Built-in Alert Conditions, and Alert Functions Library through the SignalFx API. These new capabilities empower cloud operations teams to better monitor and manage cloud infrastructure, containers, and applications. You can read the press release here.

Built on SignalFx’s industry-leading streaming analytics technology for time-series metrics, these new capabilities address the challenges of monitoring and alerting in cloud environments:

  • Cloud operations teams are overwhelmed by the data points generated by hundreds to thousands of web services. Because it is difficult to understand what is normal in these environments, teams set up the wrong alert conditions, and alerts fire on false positives.
  • Determining the best alert conditions in these environments, and the impact new alert conditions will have on the operations teams, is a complex process. Teams often lack the context to know whether there is a real issue and become increasingly fatigued with alert storms.

Alert Preview

Setting alert logic is often an extensive process of trial and error. Even with a monitoring solution flexible enough to support more complicated alert logic, the dynamic environments of cloud applications can complicate alerting.

Now with Alert Preview, cloud operations professionals can test and preview the results of alert conditions on recent data. Users can see the frequency and efficacy of alerts before applying them to live, real-time data streams, removing the cost and confusion of alerts that fire unnecessarily.
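To make the idea of previewing concrete, here is a minimal sketch in Python of replaying an alert condition over recent data and counting how often it would have fired. It does not use the SignalFx API or represent how Alert Preview is implemented; the function `preview_alert`, its `min_duration` parameter, and the sample data are all hypothetical and exist only for illustration.

```python
from typing import Callable, List, Sequence, Tuple


def preview_alert(
    series: Sequence[Tuple[int, float]],
    condition: Callable[[float], bool],
    min_duration: int = 1,
) -> List[int]:
    """Return the timestamps at which the alert would have fired.

    Hypothetical helper: the alert fires once when `condition` has held
    for `min_duration` consecutive points, and re-arms after it clears.
    """
    fired_at: List[int] = []
    streak = 0
    for timestamp, value in series:
        if condition(value):
            streak += 1
            if streak == min_duration:
                fired_at.append(timestamp)
        else:
            streak = 0
    return fired_at


# Made-up recent data: (minute, CPU utilization in percent).
recent = [
    (0, 40.0), (1, 92.0), (2, 45.0), (3, 91.0),
    (4, 95.0), (5, 97.0), (6, 50.0), (7, 93.0),
]

# Firing on any single point above 90% would have been noisy.
noisy = preview_alert(recent, lambda v: v > 90.0, min_duration=1)

# Requiring three consecutive points above 90% fires far less often.
tuned = preview_alert(recent, lambda v: v > 90.0, min_duration=3)

print(len(noisy), "alerts vs", len(tuned), "alert after tuning")
```

Comparing the two results side by side on historical data is the kind of feedback that otherwise only arrives after an alert has already paged someone.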

Read more about Alert Preview here.

Built-in Alert Conditions

Normal behavior of a cloud application is complicated, and knowing how to alert on abnormal behavior is even more complicated. Built-in Alert Conditions give users the ability to immediately capture anomalies in cloud applications by pre-packaging alerting algorithms for the most common problem scenarios. Examples of scenarios where these conditions can be used include:

  • Outlier Detection: The number of logins in the last 10 minutes for this instance is 3 standard deviations lower than other instances in the same AWS availability zone.
  • Sudden Change: All the values for cpu.utilization received in the last 15 minutes are at least 3 standard deviations above the mean of the previous hour.
  • Historical Anomaly: The average number of business transactions in the last 2 hours is 30% lower than the average for this same two hours last week.
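The three example conditions above can be sketched as simple statistical checks. This is an illustrative Python sketch only, assuming plain means and population standard deviations; it is not the algorithm SignalFx ships, and the function names and parameters (`num_stddevs`, `drop_fraction`) are hypothetical.

```python
from statistics import mean, pstdev
from typing import Sequence


def is_outlier(value: float, peers: Sequence[float],
               num_stddevs: float = 3.0) -> bool:
    """Outlier detection: `value` is more than `num_stddevs` standard
    deviations below the mean of its peers (for example, other
    instances in the same availability zone)."""
    return value < mean(peers) - num_stddevs * pstdev(peers)


def is_sudden_change(recent: Sequence[float], baseline: Sequence[float],
                     num_stddevs: float = 3.0) -> bool:
    """Sudden change: every recent value is at least `num_stddevs`
    standard deviations above the mean of the baseline window."""
    threshold = mean(baseline) + num_stddevs * pstdev(baseline)
    return all(v >= threshold for v in recent)


def is_historical_anomaly(current: Sequence[float],
                          previous: Sequence[float],
                          drop_fraction: float = 0.30) -> bool:
    """Historical anomaly: the current window's average is at least
    `drop_fraction` lower than the same window in a previous period."""
    return mean(current) <= (1.0 - drop_fraction) * mean(previous)


# Made-up sample data for each scenario.
peer_logins = [100, 102, 98, 101, 99]      # logins on other instances
cpu_last_hour = [50, 52, 48, 51, 49]       # cpu.utilization baseline
cpu_last_15_min = [95, 96, 97]             # cpu.utilization recent
tx_this_week = [60, 60]                    # business transactions now
tx_last_week = [100, 100]                  # same two hours last week

print(is_outlier(10, peer_logins))
print(is_sudden_change(cpu_last_15_min, cpu_last_hour))
print(is_historical_anomaly(tx_this_week, tx_last_week))
```

Each check takes its sensitivity as a parameter, which mirrors the point that the settings of a condition should be adjustable rather than fixed.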

The settings of each Built-in Alert Condition are exposed so that you can customize them to your alerting requirements. Rapidly experiment with different parameters and use Alert Preview to immediately reveal how an alert would have behaved. Save time when setting up monitoring, and deliver alerts that reflect the reality of operating a cloud application.

Read more about Built-in Alert Conditions here.

Alert Functions Library Through the SignalFx API

The sophistication and complexity of cloud applications require operations teams to develop custom alerts that reflect the needs of their environments. With the SignalFx API, cloud operations teams have direct access to a library of alert functions.

Customers have always been able to build their own detectors through the SignalFx UI, and now they can express even more advanced alert logic and deliver deeper operational intelligence across their organization. SignalFx is the first solution to retain the flexibility and depth needed by power users while remaining approachable for everyday users, making monitoring the basis of collaboration at every point in the process of building, deploying, and operating a cloud app.

Read more about Alert Functions Library here.

Empower Cloud Operations

These alerting capabilities from SignalFx remove complexity and maximize the productivity of the cloud operations team. The new tools expedite the creation, deployment, and tuning of alerts using machine learning algorithms that dynamically adapt to changing environmental conditions.

Now cloud operations professionals are empowered to:

  • Radically reduce the iteration cycles to create, deploy, and tune useful alerts
  • Dramatically compress the time to value by leveraging operational expertise in cloud-ready alert conditions
  • Eliminate false positives and alert storms with direct access to a rich library of alert conditions

Posted by Paul Ross