Status pages have become the end user’s window into your team’s operations. Companies with status pages are doing the right thing for their users: building in some transparency while reducing frustration and support contacts.
For the benefits of status pages to pay off, organizations need to treat them as something more than active wiki pages run by support. Let’s take a look at status pages, including:
- What they are
- Best practices
- Their transparency
What is a status page?
The status page basics are simple: You have a publicly accessible page that lists the state of your application services and regions, usually with the colors green, yellow and red. The primary purpose of the page is to let users know when there are issues, or if there’s something wrong on their end.
If your business offers more than one service, each individual service would likely have its own status page, showcasing the various regions it covers and its current availability status.
Status pages are simple in principle, but there is one main problem: more pages than not still favor technical users, and they’re often hard to find. If the organization doesn’t encourage status page adoption, which many people see as a risk rather than a benefit, the page is unlikely to be used or to demonstrate value. For many companies, status pages are just for vanity.
Benefits of a status page
A status page offers several benefits:
Mitigate support tickets
If you have a status page, users can visit it whenever they suspect an issue or detect an incident. If they confirm that one of the application services on the page has an issue, they’re less likely to submit a support ticket for the impacted functionality. This helps both your employees and your customers:
- You’ll cut back on the flood of tickets that accompany outages.
- You promote self-service options that enable your customers.
Easily support social outlets
The first sign of trouble for application issues usually surfaces on social media. Your social team can leverage a status page as part of responses to people who report issues.
Historical status page data can be used to validate a company’s service level agreements (SLAs) and build trust with your product’s customers.
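As a rough illustration of checking history against an SLA, the sketch below computes availability for a reporting window from a list of recorded outages. The function name, the sample outage times and the 99.9% target are all assumptions made for the example, not values from any particular status page product.

```python
from datetime import datetime, timedelta


def availability_percent(outages, window_start, window_end):
    """Return the percentage of the window not covered by an outage.

    outages: list of (start, end) datetime pairs, assumed non-overlapping.
    """
    window = (window_end - window_start).total_seconds()
    downtime = 0.0
    for start, end in outages:
        # Clip each outage to the reporting window before counting it.
        clipped_start = max(start, window_start)
        clipped_end = min(end, window_end)
        if clipped_end > clipped_start:
            downtime += (clipped_end - clipped_start).total_seconds()
    return 100.0 * (window - downtime) / window


# Hypothetical 30-day window with two recorded outages (43 minutes total).
window_start = datetime(2024, 1, 1)
window_end = window_start + timedelta(days=30)
outages = [
    (datetime(2024, 1, 5, 10, 0), datetime(2024, 1, 5, 10, 30)),
    (datetime(2024, 1, 20, 2, 0), datetime(2024, 1, 20, 2, 13)),
]

measured = availability_percent(outages, window_start, window_end)
sla_target = 99.9  # assumed target, for illustration only
meets_sla = measured >= sla_target
```

A 99.9% target over 30 days allows about 43.2 minutes of downtime, so the 43 minutes recorded here falls just inside it.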
Status pages aren’t a complete troubleshooting tool, but when your support team gets a whiff of something going wrong, a status page can act as an initial indicator. This helps support teams anticipate what they are likely to hear from customers. A little forewarning lets technical support engineers plan their outage communication and response protocol.
Build confidence with your end-users
Status pages are basically a requirement now. Without one, especially when you’re serving a technical audience that expects it, you’re not meeting the industry norm. Technical users know things break, so they appreciate transparency from companies about:
- What broke, plus the when and why
- What’s being done to prevent it in the future
Status page best practices
The benefits are easy enough to understand, but putting them into practice can be surprisingly hard. If your organization keeps a status page purely for vanity, so that you can say you have one, then nothing more can happen until that mentality goes away.
Organizations should be more afraid of a lack of transparency than of too much visibility. In fact, organizations will likely see a bottom-line benefit from a successful status page. So, what makes a successful status page? Let’s look at the best practices.
Automate status pages
In the early days of status pages, the pages were updated manually. This made for low utility because, when something breaks, the last thing you can expect a support team to do is manually update a page. Automation should instead determine:
- What is displayed
- When the page is updated
- Where the triggers are implemented
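One way to picture those three decisions is the minimal sketch below, which turns monitoring results into the green, yellow and red states a page displays. The `CheckResult` type, the error-rate thresholds and the page layout are hypothetical; a real setup would take its triggers from whatever monitoring tool you already run.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass
class CheckResult:
    service: str
    region: str
    error_rate: float  # fraction of failed requests, 0.0 to 1.0


# Hypothetical trigger thresholds; tune these to your own services.
DEGRADED_AT = 0.01
OUTAGE_AT = 0.25


def status_for(result):
    """Decide what is displayed for one service/region pair."""
    if result.error_rate >= OUTAGE_AT:
        return "red"
    if result.error_rate >= DEGRADED_AT:
        return "yellow"
    return "green"


def build_page(results, now=None):
    """Rebuild the page data; call this whenever new check results arrive."""
    now = now or datetime.now(timezone.utc)
    return {
        "updated_at": now.isoformat(),
        "components": {
            f"{r.service}/{r.region}": status_for(r) for r in results
        },
    }
```

Because the page is rebuilt from check results rather than edited by hand, nobody on the support team has to remember to update it during an outage.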
Give written updates as needed
A status page needs to have more than the status of the service. When there’s degradation or a full-fledged incident, there should be accompanying text to explain:
- What’s being done to resolve the incident
- Crucial customer-pertinent information about the issue, like whether customers should take any steps related to the service
Make the service breakdown relevant to users (not your org)
Vendors are often guilty of creating documentation and status pages that explain services based on how the vendor interacts with them, which, for most organizations, reflects how the team is organized.
To the user, this categorization can be meaningless at best and confusing at worst. The services should be broken down from the user’s perspective, usually based on the individual components where they consume functionality.
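A small sketch of that idea follows: internal services are grouped under the components a user would recognize, and each component shows the worst status among the services behind it. The component and service names are invented for illustration, and treating a service with no reported status as green is an assumption you may not want in production.

```python
# Ordering used to pick the worst status in a group.
SEVERITY = {"green": 0, "yellow": 1, "red": 2}

# Hypothetical mapping from user-facing components to internal services.
COMPONENTS = {
    "Login": ["auth-service", "session-store"],
    "Checkout": ["cart-service", "payments-gateway"],
    "Search": ["search-indexer"],
}


def rollup(internal_status):
    """Return one status per user-facing component (worst status wins)."""
    page = {}
    for component, services in COMPONENTS.items():
        # Assumption: a service with no reported status counts as green.
        statuses = [internal_status.get(s, "green") for s in services]
        page[component] = max(statuses, key=SEVERITY.__getitem__)
    return page


internal = {
    "auth-service": "green",
    "session-store": "yellow",
    "payments-gateway": "red",
}
user_view = rollup(internal)
```

Users see "Login", "Checkout" and "Search" and never need to know which team owns `session-store`.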
Consider including some historic data
Historical status page data isn’t a requirement. But having historical context, not just current status, can help calm people’s nerves when something is wrong. Over the long term, assuming end users see more green than yellow or red, it gives the sense that incidents are not the norm. Otherwise, people could take a single service outage as a sign that the entire application is flawed.
Include post-incident data
In addition to historical data, post-incident reports and updates are great for transparency. For critical (P1) outages, more detailed information in blog posts, articles or community portals is hugely beneficial for visibility and for building customer trust.
Tie into incident response strategy
Your status page is part of your incident response activity. When there are service issues or degradation, automation will alert those who are on-call and update the status page. When incidents are resolved, the status page will automatically update itself. And then, during post-incident reviews, status pages are an artifact for historical context.
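The lifecycle described above could be sketched roughly as follows. The event fields and type names (`incident_opened`, `incident_resolved`) are hypothetical stand-ins for whatever your alerting or on-call tool emits; the point is only that the same events that page the on-call engineer also drive the status page and leave a history behind for post-incident review.

```python
class StatusPage:
    """Keeps current component status plus a history of every change."""

    def __init__(self):
        self.current = {}  # component -> "green" | "yellow" | "red"
        self.history = []  # (time, component, status, message) tuples

    def handle_event(self, event):
        """Apply one incident event (a dict) to the page.

        Assumed keys: 'type', 'component', 'time', and optionally
        'severity' and 'message'.
        """
        if event["type"] == "incident_opened":
            status = "red" if event.get("severity") == "critical" else "yellow"
        elif event["type"] == "incident_resolved":
            status = "green"
        else:
            return  # Ignore event types this sketch does not model.
        self.current[event["component"]] = status
        self.history.append(
            (event["time"], event["component"], status, event.get("message", ""))
        )
```

The `history` list is the artifact a post-incident review would read back, including the written updates that accompanied each change.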
Challenges with status pages (Handling transparency)
A good status page strategy will create positive impacts, but of course it costs time and money. The important thing to remember is that this cost is negligible compared to the cost of doing nothing. Let’s look at some common pushback to status pages and see why it doesn’t hold up.
When service status is green, no one will pay attention. When a service’s status is yellow or red, everyone will, including your competition. And while competitors might try to use that against you, the response is easy: highlight your commitment to the customer.
Some will use the status page as a tool to complain. Let’s be clear: if they don’t use your status page as a complaint tool, they will use something far worse (Reddit, Twitter, etc.). The complaints that happen on social media, without a status page, are often more subjective, and that subjectivity may build long-term negative sentiment against your brand.
Status page transparency can expose internal challenges with the dev and ops teams. For example:
- A status page can paint a picture of some systemic problems internally, where one service regularly has issues over time but others do not.
- Customers can see that issues clearly correlate with releases. This could be a sign that something needs to be fixed with the team’s development and release process.
Customer visibility doesn’t change the severity of these long-term faults in your development process. If anything, being public may push the team to address something that should already have been fixed.
Status pages are not optional
For most organizations, a status page can be a checklist item. Those that go further, using it to focus on the power of incident management and automation, see huge gains in customer satisfaction, transparency and technical support cost reduction.
This article was written by Chris Riley. Chris is a technologist and DevOps advocate who has spent more than a decade helping organizations transition from traditional development practices to a modern set of culture, processes and tooling.
This posting does not necessarily represent Splunk's position, strategies or opinion.