Scalability in IT: The Complete Guide To Scaling

Learn March 12, 2024 Joseph Nduhiu

Somewhere in the IT multiverse, a perfect balance has been achieved between demand for IT services and installed system capacity.

Unfortunately, that isn’t our world.

IT systems swing between periods of idle capacity and periods of overload, as demand ebbs and flows with various internal and external factors. For example, peak periods such as Black Friday and Cyber Monday can put significant strain on the computing resources required to support global e-commerce shoppers looking for the best deals.

Statistics from Cloudflare in 2023 showed a 27% increase in traffic through their network from the previous year on these days. As enterprises implement their digital transformation strategies and develop new products, the growth in transactions and data requires IT resources that are able to handle the increase — without impacting performance or user experience.

Figure: Daily HTTP requests for Cloudflare, 2023

What does scalability mean?

Gartner defines scalability as:

“The measure of a system’s ability to increase or decrease in performance and cost in response to changes in application and system processing demands.”

In the technology space, scalability is one of the main selling points of migrating to the cloud versus maintaining on-premise data centers. An organization that acquires cloud services is given a promise of accessible resources that:

Can be ordered and provisioned over a short time period to address growing information processing needs.
Can also be released when the organization does not require them.

This flexibility means that enterprises need not tie up their hard-earned capital in IT infrastructure and systems that may not match fluctuating demand.

Scalability vs. elasticity: same or different?

The terms scalability and elasticity are often used interchangeably. But are they really the same thing?

Of the five essential characteristics of the cloud computing model defined by NIST, one is rapid elasticity: capabilities can be elastically provisioned and released to scale rapidly outward and inward commensurate with demand. The general agreement is this:

Scalability is viewed from a load handling perspective.
Elasticity describes how quickly capacity responds to changes in demand.

Indeed, the AWS glossary defines scaling as the outward or inward change in size, configuration, or makeup of a logical group of compute instances.

Approaches to scalability

There are two main approaches to scaling in a cloud computing environment: vertical scaling and horizontal scaling.

Vertical scaling

Vertical scaling (scaling up) involves upgrading the resources of existing virtual machines to cater for increased demand. Components that can be upgraded include:

CPU
Memory
Storage
Network throughput

Examples include virtual machines and other compute resources that can be resized to meet performance requirements.

Horizontal scaling

Horizontal scaling (scaling out) involves increasing the number of computing instances in a logical pool, i.e. replicating instances in response to increased demand. Examples of technologies that support horizontal scaling include:

Load balancers, which distribute traffic across multiple instances.
Kubernetes, which orchestrates containers.
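To make the load balancer's role concrete, here is a minimal sketch of round-robin distribution, one common balancing strategy. The class name, method names, and instance labels such as "app-1" are hypothetical and chosen only for illustration; production load balancers also handle health checks, weighting, and connection draining.

```python
from itertools import cycle


class RoundRobinBalancer:
    """Hands out backend instances in a fixed rotation (illustrative only)."""

    def __init__(self, instances):
        if not instances:
            raise ValueError("at least one instance is required")
        self._instances = list(instances)
        self._rotation = cycle(self._instances)

    def next_instance(self):
        # Each call returns the next instance in the rotation.
        return next(self._rotation)

    def add_instance(self, instance):
        # Scaling out: register a new instance and restart the rotation
        # so that it starts receiving its share of requests.
        self._instances.append(instance)
        self._rotation = cycle(self._instances)


balancer = RoundRobinBalancer(["app-1", "app-2"])
first_four = [balancer.next_instance() for _ in range(4)]

balancer.add_instance("app-3")  # scale out by one instance
next_three = [balancer.next_instance() for _ in range(3)]
```

In this sketch, the first four requests alternate between the two original instances; once the third instance is added, it joins the rotation without any change to the existing ones.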

Choosing the right scalability approach

The choice of approach is driven mainly by the application architecture: applications that can be easily distributed across multiple servers (such as stateless microservices) are better suited to horizontal scaling. Other factors include:

Traffic demand
Cost considerations
Resource efficiency
Performance requirements

From an uptime perspective, we can say this:

Horizontal scaling is more suitable, as it does not require taking an existing server offline for upgrades.
In contrast, where resource intensity is key, vertical scaling is the preferable approach.

(Learn all about load balancing for microservices.)

Diagonal scaling

Combining the two approaches results in a third hybrid model, i.e. diagonal scaling. This starts as vertical scaling, but once the resources are capped, then horizontal scaling kicks in.

This approach suits organizations that face unpredictable demand and therefore need to respond in an agile and flexible way. However, it is typically costlier and operationally more complex than either of the previously mentioned approaches on its own.
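The scale-up-then-scale-out order can be sketched as a small planning function. This is a simplified model under stated assumptions: capacity is measured only in vCPUs, instance sizes double at each vertical step, and the names Fleet, scale_diagonally, and max_vcpus_per_instance are hypothetical rather than taken from any cloud provider's API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fleet:
    instances: int           # number of identical instances
    vcpus_per_instance: int  # size of each instance


def scale_diagonally(fleet, required_vcpus, max_vcpus_per_instance):
    """Scale up first; scale out only once the instance size is capped."""
    size = fleet.vcpus_per_instance
    count = fleet.instances

    # Vertical step: double the instance size until demand is met
    # or the largest allowed size is reached.
    while size * count < required_vcpus and size < max_vcpus_per_instance:
        size = min(size * 2, max_vcpus_per_instance)

    # Horizontal step: add instances of the capped size.
    while size * count < required_vcpus:
        count += 1

    return Fleet(instances=count, vcpus_per_instance=size)


start = Fleet(instances=1, vcpus_per_instance=2)
modest = scale_diagonally(start, required_vcpus=6, max_vcpus_per_instance=8)
heavy = scale_diagonally(start, required_vcpus=20, max_vcpus_per_instance=8)
```

With a cap of 8 vCPUs per instance, the modest demand is met by resizing alone (one 8-vCPU instance), while the heavier demand exhausts the vertical headroom and ends with three 8-vCPU instances.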

Automation of scaling

Automation is usually the preferred approach for horizontal scaling. That’s because adding or removing instances does not disrupt services running on existing instances.

Autoscaling adds virtual machines to a group of instances and removes them based on traffic as well as other configured parameters. For example, on Google Cloud, the autoscaling parameters that come into play include:

CPU utilization: The percentage load that the CPU is handling over a time period.
Throughput: The limit of requests per second that can be handled effectively.
Latency: How long a request stays on a queue before being processed.
Instance count: The minimum and maximum number of instances in a logical group.
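As an illustration of how CPU utilization and instance-count limits can interact, the sketch below applies a proportional rule of the kind documented for the Kubernetes Horizontal Pod Autoscaler: desired = ceil(current × observed / target), clamped to the configured minimum and maximum. It is not a description of Google Cloud's autoscaler, and the function and parameter names are hypothetical. Utilization is passed as whole percentages so that the ceiling can be computed with exact integer arithmetic.

```python
def desired_instances(current, cpu_percent, target_percent,
                      min_instances, max_instances):
    """Return the instance count a proportional autoscaling rule would request."""
    if target_percent <= 0:
        raise ValueError("target_percent must be positive")

    # Integer ceiling of current * cpu_percent / target_percent.
    raw = (current * cpu_percent + target_percent - 1) // target_percent

    # Respect the configured instance-count limits.
    return max(min_instances, min(max_instances, raw))


# Four instances running at 90% CPU against a 60% target: scale out.
scale_out = desired_instances(4, 90, 60, min_instances=3, max_instances=10)

# Four instances at 30% CPU: the rule asks for 2, the minimum keeps 3.
scale_in = desired_instances(4, 30, 60, min_instances=3, max_instances=10)

# Four instances at 95% CPU: the rule asks for 7, the maximum caps it at 6.
capped = desired_instances(4, 95, 60, min_instances=3, max_instances=6)
```

Real autoscalers add stabilization windows and cooldown periods on top of a rule like this, so that brief spikes do not cause instances to be added and removed in quick succession.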

Database scalability

For databases, there are two main approaches to scaling:

Replication involves creating copies (replicas) of the original database, known as the primary. Data is synchronized from the primary to all replicas.
Partitioning/sharding involves two parts: dividing the database into multiple parts, and distributing data based on an agreed strategy. This approach introduces more complexity and overhead in managing data that is spread across a cluster.
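Both approaches come down to a routing decision, which can be sketched as follows. The sharding half assumes a hash-based distribution strategy (one of several possible strategies), and the replication half assumes that writes go to the primary while reads are spread across replicas. All names, including shard_for, ReplicaSet, and the node labels, are hypothetical.

```python
import hashlib


def shard_for(key, shard_count):
    """Map a record key to a shard index using a stable hash."""
    # hashlib is used instead of the built-in hash(), whose string
    # hashes can differ between interpreter runs.
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count


class ReplicaSet:
    """Routes writes to the primary and rotates reads across replicas."""

    def __init__(self, primary, replicas):
        self.primary = primary
        self.replicas = list(replicas)
        self._next_read = 0

    def node_for(self, is_write):
        # Writes must reach the primary; reads fall back to the
        # primary only when no replica is configured.
        if is_write or not self.replicas:
            return self.primary
        node = self.replicas[self._next_read % len(self.replicas)]
        self._next_read += 1
        return node


shard = shard_for("customer-1042", shard_count=4)

db = ReplicaSet(primary="db-primary", replicas=["db-replica-1", "db-replica-2"])
write_node = db.node_for(is_write=True)
read_nodes = [db.node_for(is_write=False) for _ in range(3)]
```

One design caveat with simple modulo hashing: changing the shard count remaps most keys, which is part of the management overhead mentioned above. Schemes such as consistent hashing are commonly used to reduce that data movement.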

(Related reading: distributed systems and distributed tracing.)

Benefits of scalability

The main benefit of scalability is assurance: you want to assure your business of the reliability of the IT services you’re delivering, to internal stakeholders as well as end users, customers, and prospects.

By planning the right capacity to address demand and performance requirements, and being able to respond smoothly to changes in traffic, the quality of IT services offered to the organization remains in line with expectations, leading to improved customer satisfaction.

Whenever incidents occur, scalability supports high availability as instances are spun up quickly with similar configurations to handle the service requirements. This is a form of self-healing: new instances are created that are not affected by any disruption affecting existing instances.

Other benefits include:

Cost effectiveness. The organization does not need to tie down capital in infrastructure that is not utilized. Scaling ensures that demand is met with just the right amount of capacity, which can be quickly reduced when the demand dissipates.
Disaster recovery. Where horizontal scaling is spread across geographically distributed zones, the probability of downtime totally crippling an IT service is reduced.
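The cost-effectiveness point can be illustrated with a small piece of arithmetic. The demand figures below are made up purely for illustration, and the comparison assumes that cost is proportional to instance-hours and that scaling tracks demand perfectly, which real systems only approximate.

```python
# Hypothetical number of instances needed in each of eight consecutive hours.
hourly_demand = [2, 2, 3, 8, 12, 8, 3, 2]

# Fixed provisioning: enough instances for the peak, running every hour.
fixed_instance_hours = max(hourly_demand) * len(hourly_demand)

# Scaled provisioning: capacity follows demand hour by hour.
scaled_instance_hours = sum(hourly_demand)

# Capacity paid for but never used under fixed provisioning.
idle_instance_hours = fixed_instance_hours - scaled_instance_hours
```

For this made-up profile, provisioning for the peak consumes 96 instance-hours against 40 when capacity follows demand, leaving 56 instance-hours idle.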

Scalability practices

Even where scaling is automated, do not assume that configuring scaling is a one-time set-and-forget activity. IT and system administrators must constantly monitor and analyze traffic trends and end-to-end application performance metrics in order to select the most suitable scaling metrics for their systems.

Observability tools

The right metrics depend on the situation, which is precisely why you need to constantly review and optimize the scaling configurations.

Investing in observability tools is a wise option. By aggregating metrics and logs, alongside additional data, these tools can predict potential bottlenecks or failures that can impact application performance and therefore require optimization of scaling parameters.
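As a rough sketch of the kind of check such a review might include, the snippet below computes a nearest-rank percentile over request latencies and flags when the tail exceeds a threshold. The sample values, the 500 ms threshold, and the function name are hypothetical; observability platforms compute these aggregates over far larger data sets and with their own methods.

```python
def percentile(samples, pct):
    """Nearest-rank percentile; pct is a whole number from 1 to 100."""
    if not samples:
        raise ValueError("samples must not be empty")
    ordered = sorted(samples)
    # Integer ceiling of pct * n / 100 gives the 1-based rank.
    rank = (pct * len(ordered) + 99) // 100
    return ordered[rank - 1]


# Hypothetical request latencies in milliseconds; one request is very slow.
latencies_ms = [120, 130, 110, 125, 140, 135, 115, 128, 122, 900]

p50 = percentile(latencies_ms, 50)
p95 = percentile(latencies_ms, 95)

# Hypothetical rule: revisit the scaling configuration when p95 exceeds 500 ms.
needs_review = p95 > 500
```

Here the median looks healthy while the 95th percentile does not, which is one reason tail latency is often a more useful scaling signal than the average.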

Serverless computing

Some organizations have chosen to outsource the scaling headache to cloud service providers by adopting serverless computing. Applications built on serverless infrastructure have the benefit of automatic scaling, since the backend is fully managed to handle whatever traffic is generated from user transactions.

But beware: serverless alone is not a magic bullet for scaling challenges, as the wrong application design could lead to certain functionality not scaling in tandem, thus causing bottlenecks. So, admins must regularly monitor application performance against the set limits, and initiate optimization when required.