By Billy Hoffman
There are many different classes of web performance tools, from synthetic monitoring to application performance monitoring (APM), to real user monitoring (RUM), and more. These different classes exist because each has its own strengths and weaknesses.
When evaluating open source tools or enterprise-grade synthetic monitoring tools, you want to look for capabilities that maximize the strengths of synthetic monitoring. With over a decade of experience in the web performance space, we understand that needs and capabilities have evolved. When evaluating solutions, it can be tough to distinguish between the ‘nice-to-have’ and the ‘must-have’ features that will enable you to get the most value out of your synthetic monitoring solution.
Here is a checklist to help get you started:
1. Multi-Step User Flows & Business Transactions
One of the key benefits of synthetic monitoring is that you can define the specific actions of a test, allowing you to walk through key flows of your application like a checkout flow or a sign-up flow, to verify its functionality and performance. This is called scripting, and a tool’s scripting capabilities directly determine how valuable it can be.
Here are some of the scripting capabilities that are essential to look for:
- How do you record scripts?
- Is there a browser-based recorder?
- Do you need to manually write code or steps?
- Can you manually edit recorded tests or do you need to completely re-record?
- What level of technical knowledge would someone need to record a script?
- Does the tool allow you to import industry-standard formats like Selenium IDE?
Of course, since websites change daily (in some cases) and scripts can stop working, it is also important to evaluate a tool based on its troubleshooting capabilities, such as:
- Can you test the script?
- Will it tell you what steps it fails on and why?
- Does it capture screenshots or a video of the script running, so you can see what the screen looked like when a button or form field couldn’t be found?
- Can you export the script in an industry-standard format so you can troubleshoot it somewhere else if need be?
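To make the "which step failed, and why" question concrete, here is a minimal Python sketch of a multi-step flow runner. It is not how any particular tool works: `Step`, `FlowResult`, and `run_flow` are hypothetical names, and each step's action is assumed to be a callable that raises an exception when it cannot complete (for example, when a button cannot be found).

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Step:
    name: str                    # human-readable label, e.g. "Click pay"
    action: Callable[[], None]   # assumed to raise if the step cannot complete


@dataclass
class FlowResult:
    passed: bool
    failed_step: Optional[str] = None  # name of the first step that failed
    reason: Optional[str] = None       # exception message explaining the failure


def run_flow(steps: List[Step]) -> FlowResult:
    """Run steps in order and stop at the first failure."""
    for step in steps:
        try:
            step.action()
        except Exception as exc:
            # Report the failing step and the reason instead of a bare "test failed".
            return FlowResult(False, step.name, str(exc))
    return FlowResult(True)
```

A tool with good troubleshooting support would typically attach a screenshot or video frame to the failing step as well; that part is left out of this sketch.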
As an example, here is what the industry-standard Selenium IDE recorder looks like when testing a mission-critical “checkout user flow”.
2. Measure and Compare Performance Experiments
A big advantage of synthetic tools is they allow you to experiment with different “What If” scenarios and see the impact on performance. It is essential to ensure you have flexibility and options that make it easy to get a clear line of sight of the impact of your performance initiatives. Some common examples include:
- Testing your site with and without a CDN, or with different CDNs.
- Excluding specific third parties and seeing the impact on performance.
- Comparing mobile performance to desktop, or even comparing different mobile devices.
- Determining the impact of Single Points of Failure.
- Measuring the performance of different A/B test groups.
- Testing new features that are not enabled for all users.
- Using different global locations to see how geography exacerbates any performance slowdowns.
How completely a specific synthetic tool can do these things depends on how much you can control about a test. Here are some of the configuration options to look for to be able to assess the results of common web performance experiments:
- Can you configure the test to exclude requests for certain domains or URLs?
- Can you overload DNS or hostnames to point at different IP addresses?
- Can you pre-load specific cookies for a test?
- Can you test with 3G, 4G, or different networking connections?
- Can you specify custom HTTP headers to add to requests?
- Can you specify the device, viewport, or user-agent used?
- Can you specify the location? How precisely can you define the location? (Are you testing from “Canada”, or from “Vancouver, in British Columbia, in Canada”?)
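The options above can be pictured as a single test configuration. The Python sketch below is illustrative only: `SyntheticTestConfig`, its field names, and the `example` hostnames are assumptions, not the settings of any real product. It also shows one way a "block these domains" option could be evaluated for each request.

```python
from dataclasses import dataclass, field, replace
from typing import Dict, List, Tuple
from urllib.parse import urlsplit


@dataclass
class SyntheticTestConfig:
    url: str
    location: str = "Vancouver, BC, Canada"   # as precise as the tool allows
    network_profile: str = "4G"               # e.g. "3G", "4G", "Cable"
    user_agent: str = ""                      # empty means the browser default
    viewport: Tuple[int, int] = (1366, 768)
    blocked_domains: List[str] = field(default_factory=list)
    dns_overrides: Dict[str, str] = field(default_factory=dict)   # hostname -> IP
    cookies: Dict[str, str] = field(default_factory=dict)
    extra_headers: Dict[str, str] = field(default_factory=dict)


def is_blocked(config: SyntheticTestConfig, request_url: str) -> bool:
    """True if the request's host is a blocked domain or one of its subdomains."""
    host = (urlsplit(request_url).hostname or "").lower()
    return any(host == d or host.endswith("." + d) for d in config.blocked_domains)


# A "with and without a third party" experiment is then two configs
# that differ in exactly one option.
baseline = SyntheticTestConfig(url="https://www.example.com/")
without_third_party = replace(baseline, blocked_domains=["thirdparty.example"])
```

Keeping every experiment to a one-option difference from the baseline makes it easier to attribute any change in the results to that option.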
Of course, configuring a test with different options is only half the battle. In all of these scenarios, you will collect performance data about your sites and applications under different conditions, and then you will need to compare them. How your synthetic solution allows you to compare data and visualize differences is critical since that determines how easily and quickly you get results. Here are a few ‘must-haves’ to look for:
- Can you compare 2 specific measurements?
- Can you see both absolute and relative improvement? (Visually Complete got better by 400ms, which is 24%)
- Can you compare the videos or waterfalls side-by-side?
- Can you collect multiple samples of each configuration, and compare those on a graph?
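The absolute and relative improvement mentioned above is simple arithmetic over the samples. Here is a small Python sketch, assuming each configuration was sampled several times and that the median is the summary statistic (the function name `compare` and the sample values are made up for illustration):

```python
from statistics import median
from typing import Sequence, Tuple


def compare(baseline_ms: Sequence[float], candidate_ms: Sequence[float]) -> Tuple[float, float]:
    """Return (absolute improvement in ms, relative improvement in percent).

    Positive values mean the candidate configuration is faster than the baseline.
    """
    base = median(baseline_ms)
    cand = median(candidate_ms)
    if base <= 0:
        raise ValueError("baseline median must be positive")
    absolute = base - cand
    relative = absolute / base * 100
    return absolute, relative


# Three Visually Complete samples per configuration (hypothetical numbers).
absolute, relative = compare([1700, 1667, 1650], [1267, 1250, 1300])
print(f"Visually Complete got better by {absolute:.0f}ms, which is {relative:.0f}%")
```

With these sample values the medians are 1667 ms and 1267 ms, so the improvement is 400 ms, or about 24% of the baseline.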
For example, here is what a comparison report looks like in Splunk Synthetic Monitoring.
3. Robust Alerting Capabilities & Integrations
Synthetic monitoring is one of the best ways to detect outages or availability issues since it actively tests your site from the outside. Critical to this are the capabilities the tool has for defining an outage and sending the notification.
Here are a few things to look for:
- Can you run it from different locations inside the major geographic regions where the majority of your visitors are?
- What is the testing frequency?
- Can you verify the presence or absence of text?
- Can you verify the response code?
- Will connection issues (DNS, TCP, etc) trigger an outage?
- Can SSL certificate issues trigger an outage?
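As a rough illustration of how the criteria above combine into an outage definition, here is a Python sketch. `CheckRule` and `evaluate` are hypothetical names, and the sketch assumes the tool hands you the status code, the response body, and a description of any connection-level error (DNS, TCP, or SSL) from a single test run.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class CheckRule:
    expected_status: int = 200
    must_contain: Optional[str] = None       # text that has to be present
    must_not_contain: Optional[str] = None   # text that has to be absent


def evaluate(rule: CheckRule,
             status: Optional[int],
             body: Optional[str],
             connection_error: Optional[str] = None) -> List[str]:
    """Return the list of failure reasons; an empty list means the check passed."""
    if connection_error:
        # DNS, TCP, or SSL problems mean there is no response to inspect.
        return [f"connection error: {connection_error}"]
    failures = []
    text = body or ""
    if status != rule.expected_status:
        failures.append(f"expected status {rule.expected_status}, got {status}")
    if rule.must_contain and rule.must_contain not in text:
        failures.append(f"missing text: {rule.must_contain!r}")
    if rule.must_not_contain and rule.must_not_contain in text:
        failures.append(f"unexpected text: {rule.must_not_contain!r}")
    return failures
```

Returning every reason, rather than a single pass/fail flag, gives the alert enough detail for someone to start troubleshooting immediately.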
Just because you had trouble accessing the site once doesn’t necessarily mean there is an outage, nor does it tell you whether that outage is regional or global in scope. False positives can lead to alert fatigue, so here are some of the more advanced capabilities to look for as well:
- Does the tool show you a screenshot or the source code of where the error was?
- Will the tool automatically retry a failed test to verify the outage?
- Can you test from multiple locations?
- Does the tool use results from multiple locations to automatically detect regional vs global outages?
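The retry and regional-versus-global ideas above can be sketched in a few lines of Python. This is a simplified illustration with hypothetical function names; it assumes `run_check` returns `True` when the test passes, and it treats "every location failed" as the definition of a global outage, which a real tool may refine.

```python
from typing import Callable, Dict


def confirmed_failure(run_check: Callable[[], bool], retries: int = 1) -> bool:
    """Return True only if the check fails on the first run and on every retry."""
    for _ in range(retries + 1):
        if run_check():
            return False   # one success is enough to treat the failure as a blip
    return True


def classify_outage(failed_by_location: Dict[str, bool]) -> str:
    """Classify confirmed failures per location as 'none', 'regional', or 'global'."""
    failed = [loc for loc, is_failed in failed_by_location.items() if is_failed]
    if not failed:
        return "none"
    if len(failed) == len(failed_by_location):
        return "global"
    return "regional"
```

Only failures that survive the retry would be fed into `classify_outage`, which keeps one-off network blips from paging anyone.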
For example, here is a real screenshot (displayed within Splunk Synthetic Monitoring) showing off a page which returned an error:
Once your synthetic monitoring solution has detected an outage, it is ESSENTIAL that it notifies you and your team. How you want a tool to notify you depends on the workflow of your team.
At a minimum, you want a solution that can notify via email or SMS. These very basic options ensure the tool can work with any team. Beyond that, you should focus on notification options that can easily integrate as tightly as possible into your team’s workflow and style.
This will optimize how quickly your team can see and react to an outage.
Here are a few options to look for:
- Mobile teams in multiple countries? You may need a tool that sends push notifications.
- Decentralized team? Can it send messages to teams or channels inside chat applications (Slack, Microsoft Teams, etc)?
- Are you already using an operations tool like OpsGenie, ServiceNow, or PagerDuty? Look for a tool that comes with turnkey integrations.
- Does the solution have a generic “Send a Webhook” alert option? This is a great catch-all that allows you to connect the tool to the rest of your process, regardless of how the tools that make up the process evolve.
Here is what a typical custom webhook looks like. You should make sure the synthetic tool you choose has similar functionality:
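For readers who have not worked with webhooks, the idea is simply an HTTP POST with a JSON body sent to a URL you control. The Python sketch below builds such a request with the standard library. It is not the webhook format of any specific product: the payload fields and the `hooks.example.com` URL are assumptions chosen for illustration.

```python
import json
import urllib.request


def build_webhook_request(webhook_url: str, check_name: str, status: str,
                          location: str, detail: str) -> urllib.request.Request:
    """Build (but do not send) a JSON POST describing an alert."""
    payload = {
        "check": check_name,    # which synthetic test raised the alert
        "status": status,       # e.g. "failed" or "recovered"
        "location": location,   # where the test ran
        "detail": detail,       # human-readable reason
    }
    return urllib.request.Request(
        webhook_url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_webhook_request(
    "https://hooks.example.com/alerts",   # hypothetical receiver
    "Checkout flow", "failed", "Vancouver", "expected status 200, got 503",
)
# Sending it would be: urllib.request.urlopen(request, timeout=10)
```

Because the receiver is just an HTTP endpoint, the same alert can feed a chat bot, a ticketing system, or an in-house script without changes to the monitoring tool.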
4. Pre-production Testing
One of the key strengths of synthetic monitoring solutions is that they can help you assess the performance and user experience (UX) of a site without requiring large volumes of real users driving traffic, a known weakness of RUM and APM solutions. This means synthetic monitoring tools can be used in pre-production and lower environments (staging, UAT, QA, etc.), allowing you to understand the performance of your site while it’s still in development.
This is tremendously powerful since it allows you to use performance as a gate in your release process and stop performance regressions before they accumulate over time.
To do this, your solution must be able to reach your lower environments (like staging or UAT) and gather performance data. It also must deal with some configuration nuances that are unique to testing environments.
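Using performance as a gate usually means comparing the metrics from a pre-production test run against agreed budgets and failing the build when a budget is exceeded. Here is a minimal Python sketch, assuming the synthetic tool can export metrics as name-to-milliseconds pairs (the metric names, budget values, and `budget_violations` function are hypothetical):

```python
from typing import Dict, List


def budget_violations(measured_ms: Dict[str, float],
                      budgets_ms: Dict[str, float]) -> List[str]:
    """Return one message per budget that is exceeded or has no measurement."""
    violations = []
    for metric, budget in budgets_ms.items():
        value = measured_ms.get(metric)
        if value is None:
            violations.append(f"{metric}: no measurement")
        elif value > budget:
            violations.append(
                f"{metric}: {value:.0f} ms exceeds budget of {budget:.0f} ms"
            )
    return violations


budgets = {"visually_complete": 2000, "first_byte": 500}
measured = {"visually_complete": 2400, "first_byte": 320}   # from a staging run
problems = budget_violations(measured, budgets)
# In a CI job, a non-empty list would fail the build,
# for example with: sys.exit(1 if problems else 0)
```

Treating a missing measurement as a violation is a deliberate choice in this sketch: a test that silently stopped reporting should not be able to pass the gate.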
Here are capabilities for accessing pre-production that you should look for:
- Is the testing location outside of your environment? Will you need to whitelist IP addresses? How much work is involved with your security team?
- Can you ignore SSL certificate errors (because it uses an internal Certificate Authority or a self-signed certificate)?
- Can you install something inside of your environment to run tests? Do you need to open ports for it to work properly?
- Is it a physical system or virtual?
- If virtualized, what is the technology used (VM, Docker, software package)?
- What resources are needed to run the internal testing location?
- How many tests can you run simultaneously through it, and how can you increase that?
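The certificate question in the list above comes up because staging sites often use a self-signed certificate or an internal Certificate Authority. As a sketch of what "ignore SSL certificate errors" means at the client level, here is how it looks with Python's standard `ssl` module; the `staging_ssl_context` helper and its parameters are hypothetical, and disabling verification should be limited to non-production targets.

```python
import ssl
from typing import Optional


def staging_ssl_context(verify: bool = True,
                        ca_file: Optional[str] = None) -> ssl.SSLContext:
    """Build a TLS context for testing a pre-production site.

    Prefer passing the internal CA bundle via ca_file; turn verification
    off only when that is not possible.
    """
    context = ssl.create_default_context(cafile=ca_file)
    if not verify:
        # check_hostname must be disabled before verify_mode can be CERT_NONE.
        context.check_hostname = False
        context.verify_mode = ssl.CERT_NONE
    return context
```

A context built this way can be passed to `urllib.request.urlopen(url, context=...)`. Trusting the internal CA keeps the rest of the TLS checks intact, so it is usually the safer of the two options.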
As an example, Splunk Synthetic Monitoring provides copy and paste instructions to launch a Docker instance to test pre-production sites:
5. Competitive & Industry Benchmarking
With a Synthetic product, benchmarking a competitor’s site is as easy as testing your own site…you simply provide a URL. And BOOM, you’re done!
However, there are various web security products that sit in front of websites and can block traffic from synthetic testing tools as a byproduct of trying to block attackers, bots, and other things that can cause fraud. You may often find that the IP addresses of the cloud providers and data centers used by synthetic providers are blocked.
So one thing to consider in your synthetic monitoring solution is:
- Can you run tests from locations that your competitors are not blocking?
Another reason security products used by your competitors can block synthetic tests is the User Agent. If the User Agent differs from what an actual browser sends, you can be blocked.
So another capability to check for is:
- Can you customize the User Agent to remove anything that identifies it as a synthetic testing tool?
Once you are able to collect performance and user experience data from a competitor, you have everything you need to compare those results to your own site, so it is important to further understand:
- Can you easily compare multiple competitors in a single view?
- Is there an easy-to-understand way to show less technical people internally which site is “Best” and why?
- Does the tool offer a composite score like the Google Lighthouse performance score, which makes it easier to compare one site to another?
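A composite score is, at its simplest, a weighted average of per-metric scores, which is what makes it easy to rank sites for a less technical audience. The Python sketch below shows that idea only. The weights and site scores are invented for illustration, and Lighthouse's actual scoring uses its own weights and scoring curves.

```python
from typing import Dict, List, Tuple


def composite_score(metric_scores: Dict[str, float],
                    weights: Dict[str, float]) -> float:
    """Weighted average of per-metric scores (each assumed to be 0-100)."""
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    weighted = sum(metric_scores[name] * weight for name, weight in weights.items())
    return weighted / total_weight


def rank_sites(sites: Dict[str, Dict[str, float]],
               weights: Dict[str, float]) -> List[Tuple[str, float]]:
    """Return (site, score) pairs, best score first."""
    scored = [(name, composite_score(scores, weights)) for name, scores in sites.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Illustrative weights and scores, not Lighthouse's real ones.
weights = {"lcp": 0.5, "cls": 0.25, "tbt": 0.25}
sites = {
    "competitor": {"lcp": 60, "cls": 80, "tbt": 100},
    "ours": {"lcp": 80, "cls": 100, "tbt": 60},
}
ranking = rank_sites(sites, weights)
```

A single number hides detail, so it works best alongside the per-metric breakdown that explains why one site scores higher than another.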
As an example, many of Splunk Synthetic Monitoring’s customers use Competitor Dashboards to easily see how they compare to their competitors:
When evaluating a synthetic web performance product, there are ‘nice-to-have’ and ‘must-have’ capabilities that will allow you to extract the most value out of your solution. As you shop around and evaluate tools, remember the use cases where synthetic testing excels:
- Measuring the experience of mission-critical business flows
- Running controlled experiments and tests to measure the impact of different features (third-party content, CDNs, new features, etc.)
- Detecting and alerting on outages, seamlessly
- Validating performance and user experience (UX) standards in pre-production and lower environments
- Benchmarking your performance against your competitors and your industry
In order to extract the most value out of your synthetic solution, you will want to keep the following questions in mind:
- How easy is it to configure tests and how flexible are the controls for different experiments?
- How easy is it to integrate into your existing alerting workflow?
- How easy is it to integrate into pre-production testing environments?
- How easy is it to visualize and compare different tests and trends over time?
At Splunk we take pride in the fact that our front-end synthetic monitoring solutions were designed with the user experience in mind. Interested in how Splunk Synthetic Monitoring delivers in the critical areas above?
Have you answered YES to any of these?
- Is your engineering team reactive to performance/UX issues?
- Do you have multiple places to look in order to identify where the issue is stemming from?
- Do you wish you could integrate performance into your CI/CD workflow in order to get ahead of issues before they go live into prod?
If so, you’re losing valuable time and preventing the team from building and innovating. Splunk Synthetic Monitoring monitors performance/UX on the client-side AND tells you how to improve and make optimizations. You can even integrate this practice into your CI/CD workflows. This allows you to automate a lot of the super manual tasks around performance and also helps operationalize performance across the business. I want to show you what that could look like. Can you set aside 10 minutes for us? We’ll make the most of your time: not a pitch, just a demo.