
Autoscaling

Autoscaling is the ability of a system to automatically adjust the number of running resources based on demand. In the cloud, this usually means adding or removing virtual machines or containers as traffic changes. Instead of guessing a fixed capacity, you define scaling policies that react to metrics like CPU usage, network load, or request counts. An autoscaling group works together with a load balancer so that new instances start receiving traffic as they come online.

Why it matters

Autoscaling helps you handle peak loads without overpaying during quiet periods. It supports both scalability and cost efficiency in cloud computing environments. Without autoscaling, teams often overprovision resources to stay safe, which is expensive and can still fail under a sudden spike that exceeds the fixed capacity.

How it works

You define minimum and maximum instance counts, plus policies that say when to scale out (add instances) or scale in (remove them). The cloud platform monitors the chosen metrics and triggers actions such as launching a new virtual machine from an image or terminating an idle instance. The lesson Vertical and Horizontal Scaling covers the related tradeoffs.
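
As a rough illustration of how such a policy might be evaluated, here is a minimal Python sketch. It is not any cloud provider's API: the ScalingPolicy class, the desired_count function, and the CPU thresholds are all hypothetical names and values chosen for this example, and it assumes a simple rule that adds or removes one instance at a time.

```python
from dataclasses import dataclass


@dataclass
class ScalingPolicy:
    """Hypothetical policy: instance bounds plus CPU thresholds (percent)."""
    min_instances: int
    max_instances: int
    scale_out_cpu: float  # add an instance when average CPU rises above this
    scale_in_cpu: float   # remove an instance when average CPU falls below this


def desired_count(current: int, avg_cpu: float, policy: ScalingPolicy) -> int:
    """Return the instance count the group should move to."""
    if avg_cpu > policy.scale_out_cpu:
        target = current + 1  # scale out
    elif avg_cpu < policy.scale_in_cpu:
        target = current - 1  # scale in
    else:
        target = current      # metric is within the comfortable band
    # Never go below the minimum or above the maximum.
    return max(policy.min_instances, min(policy.max_instances, target))


# Illustrative values only.
policy = ScalingPolicy(min_instances=2, max_instances=6,
                       scale_out_cpu=70.0, scale_in_cpu=30.0)

print(desired_count(3, 85.0, policy))  # 4: busy, so scale out
print(desired_count(3, 10.0, policy))  # 2: idle, so scale in
print(desired_count(6, 95.0, policy))  # 6: busy, but already at the maximum
print(desired_count(2, 5.0, policy))   # 2: idle, but already at the minimum
```

Real platforms usually add safeguards this sketch leaves out, such as a cooldown period between actions so the group does not oscillate while a new instance is still starting up.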

