The FinOps Optimize Phase: Ensuring Cloud Cost Optimization

In Phase 1, we showed you how to set up your cloud environment for monitoring. We then explained how to create effective cloud cost intelligence dashboards. Doing this enabled you to:
- Identify major cost contributors and then cut sources of idle compute or excess cloud spending.
- Identify where workloads drive higher revenue and then devote more resources there.
These one-time manual fixes can lead to significant cloud cost savings. However, ideally, you shouldn't have to rely on accurate forecasting and manual intervention. Instead, your resources should scale up and down automatically based on your business’s changing needs. This article will explain how to achieve that by:
- Rightsizing and reconfiguring resources to match your baseline workload requirements.
- Automating cloud resource provisioning with Kubernetes and autoscaling to dynamically adjust resources based on workload demands.
- Optimizing costs by identifying and switching to more cost-effective cloud solutions, such as alternative instance types, based on workload predictability and variability.
1. Cut Idle Cloud Spend By Rightsizing Resources
Your target cloud resource utilization rate should be 70-80%, but most companies are not even close to hitting that. On average, 30-40% of instances are over-provisioned, and some studies put the figure even higher. That means for every $1 spent on cloud computing services, roughly $0.30-$0.40 goes to unused capacity.
Rightsizing your cloud infrastructure involves matching the appropriate size and type of resources to actual baseline workload requirements, so that you never pay for more than what you need.
- Use your dashboards to regularly conduct thorough inventories of all existing cloud resources. Pinpoint instances and services that are either very underutilized or consistently over-provisioned.
- Profile each application to understand its specific resource requirements, performance criteria, and workload patterns. Some applications might need large amounts of memory but relatively little compute power, while others might need high compute but minimal storage. By capturing these unique usage profiles, you’re better equipped to select the best instance types, storage classes, and configurations for each application and cut further unnecessary cloud expenses.
- Leverage analytics and recommendation tools like AWS Compute Optimizer, Microsoft Azure Advisor, and Google Cloud's Recommender, which analyze your cloud cost data and provide actionable optimization recommendations: terminating instances that are barely utilized, consolidating storage across services, or shifting workloads to more cost-effective cloud operations.
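As a starting point, the utilization inventory can be sketched in a few lines. The instance records and the 40% threshold below are illustrative; in practice the numbers would come from CloudWatch, Azure Monitor, or Google Cloud Monitoring:

```python
def rightsizing_candidates(instances, threshold=0.40):
    """Flag instances whose average CPU utilization sits well below
    the 70-80% target range."""
    return [i["id"] for i in instances if i["avg_cpu"] < threshold]

# Hypothetical inventory pulled from your monitoring dashboards:
instances = [
    {"id": "i-batch-01", "avg_cpu": 0.12},  # heavily over-provisioned
    {"id": "i-web-01",   "avg_cpu": 0.75},  # within the target range
    {"id": "i-db-01",    "avg_cpu": 0.35},  # borderline candidate
]

print(rightsizing_candidates(instances))  # → ['i-batch-01', 'i-db-01']
```

From here, each flagged instance gets profiled (as described above) before you downsize or terminate it.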
2. Scale Cloud Utilization By Automating Resource Provisioning
Under-provisioning cloud computing resources for a critical application can be more costly than over-provisioning, as it can result in degraded performance or even downtime. For example, if you're running an e-commerce business, having your site crash on Black Friday because you don't have enough servers is going to cost a lot more than what you'd save on cloud operating costs.
Therefore, after rightsizing your baseline workloads, consider setting up auto-scaling for anything variable. This is especially important if you face fluctuating demand, for example if you run a seasonal e-commerce business.

Automated resource provisioning can refer to:
- Autoscaling – where capacity adjusts in real-time to match workload demands.
- Predictive scaling – where resources are preemptively allocated or decommissioned based on historical usage patterns.
- Automated decommissioning – where unused resources are promptly terminated to avoid unnecessary cloud expenses.
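Of the three, predictive scaling is the simplest to illustrate. The sketch below pre-provisions capacity for a given hour from historical request rates; the traffic figures, 20% headroom, and per-instance throughput are all hypothetical:

```python
from math import ceil
from statistics import mean

def predicted_capacity(history_by_hour, hour, headroom=1.2, reqs_per_instance=100):
    """Instances to pre-provision for `hour`, based on the mean of past
    request rates for that hour plus a 20% safety margin."""
    expected_reqs = mean(history_by_hour[hour])
    return ceil(expected_reqs * headroom / reqs_per_instance)

# Request rates observed at 9am over the last three days (hypothetical):
history = {9: [450, 520, 480]}
print(predicted_capacity(history, hour=9))  # → 6
```

Cloud providers' managed predictive scaling does something similar with more sophisticated forecasting models, but the trade-off is the same: pay a small headroom premium to avoid under-provisioning at peak.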
A step further is implementing Kubernetes, which optimizes compute allocation by matching resources to each workload's requirements rather than simply scaling the same resources up and down.
- The Horizontal Pod Autoscaler (HPA) automatically adjusts the number of running pods based on metrics like CPU and memory usage, ensuring applications have the resources they need without manual intervention.
- The Kubernetes Cluster Autoscaler complements this by adding or removing nodes in the cluster based on overall demand. This allows you to elastically match the right type of compute to your workloads.
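The HPA's core scaling rule is simple enough to write down directly. This mirrors the formula in the Kubernetes documentation, desiredReplicas = ceil(currentReplicas × currentMetric / targetMetric); the pod counts below are illustrative:

```python
from math import ceil

def hpa_desired_replicas(current_replicas, current_metric, target_metric):
    """Kubernetes HPA scaling rule:
    desired = ceil(current_replicas * current_metric / target_metric)."""
    return ceil(current_replicas * current_metric / target_metric)

# 4 pods averaging 90% CPU against a 60% target scale out to 6:
print(hpa_desired_replicas(4, 90, 60))  # → 6
# 6 pods averaging 30% CPU against the same target scale back to 3:
print(hpa_desired_replicas(6, 30, 60))  # → 3
```

When pods scale out beyond what existing nodes can hold, the Cluster Autoscaler picks up the slack by provisioning new nodes.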
3. Optimize Cloud Costs with Appropriate Service Types
A third way to drive cloud savings is to review and choose more appropriate instance types. Start by profiling workloads to understand their specific resource needs and performance requirements; this will help you select the most suitable cloud services and configurations.

Two instance types you should consider are Reserved Instances (RIs) and Spot Instances.
Reserved instances:
- Reserved instances provide savings of up to 72% compared with on-demand in exchange for a long-term commitment and guaranteed availability. These are suitable for steady, predictable workloads that require guaranteed uptime.
- Generally, if your team is defaulting to on-demand pay-as-you-go instances, you should move your baseline workload to reserved instances. In fact, at discounts approaching 72%, you likely want to over-provision reserved instances at the margin and let some sit idle rather than pay on-demand rates (you only need roughly 28% utilization to break even, so you can be aggressive here).
- You could also look at cheaper routes. Resellers like Stratalux and Strategic Blue buy reserved capacity in bulk and handle their own cloud management and price consolidation. A company like Archera can insure you against unused RI commitments, which is helpful if you are worried about forecasting your needs accurately.
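The break-even arithmetic is worth writing down explicitly. This sketch uses AWS's maximum advertised 72% RI discount; actual discounts vary by instance family, term length, and payment option:

```python
def ri_break_even_utilization(discount):
    """Utilization at which an always-on reserved instance costs the same
    as paying on-demand rates only for the hours actually used: 1 - discount."""
    return 1 - discount

# At a 72% discount, an RI pays for itself at roughly 28% utilization:
print(round(ri_break_even_utilization(0.72), 2))  # → 0.28
```

Any baseline workload you expect to run above that utilization is cheaper reserved, even if it sometimes sits idle.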
Spot instances:
- Spot Instances let you purchase AWS's spare, unused EC2 capacity, paying only for the compute you use. Spot prices fluctuate with supply and demand, but they usually sit well below On-Demand and Reserved Instance rates.
- With Spot, you can get discounts of up to 90% with no long-term commitment. The catch: AWS can reclaim Spot capacity on short notice whenever it needs it back, or when the spot price rises above the maximum you have configured. (AWS retired the old bidding model; you now simply pay the current spot price.)
- When AWS wants to reclaim a Spot Instance, it sends a two-minute warning through CloudWatch Events and instance metadata. You can use those two minutes to save application state, upload log files, or drain any running containers.
- Spot Instances are cost-effective for interruptible workloads like batch processing or development testing. To assess the risk of interruption, use AWS's spot price history and monitoring features to analyze price trends for your preferred instance type and region. To mitigate it, set alerts for price thresholds and diversify across instance types and Availability Zones.
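Handling the two-minute notice mostly comes down to polling the instance metadata endpoint (it returns 404 until an interruption is scheduled) and parsing the small JSON document it serves. A sketch, with a hard-coded sample document standing in for the live response:

```python
import json
from datetime import datetime, timezone

# Poll this endpoint every few seconds from inside the instance; it
# returns 404 until AWS schedules an interruption:
METADATA_URL = "http://169.254.169.254/latest/meta-data/spot/instance-action"

def parse_instance_action(body):
    """Parse the instance-action document into (action, scheduled UTC time)."""
    doc = json.loads(body)
    when = datetime.strptime(doc["time"], "%Y-%m-%dT%H:%M:%SZ")
    return doc["action"], when.replace(tzinfo=timezone.utc)

# Sample document in the shape AWS returns (the timestamp is illustrative):
action, when = parse_instance_action(
    '{"action": "terminate", "time": "2025-01-01T08:22:00Z"}'
)
print(action)  # → terminate
# ...you now have until `when` to drain containers, flush logs, and checkpoint.
```

Many teams skip hand-rolling this and let their orchestrator (e.g. the Kubernetes node termination handler pattern) react to the notice instead.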
Use your FinOps dashboards from Phase 1 to review your workloads and instance types with your engineering team, as there are most likely more price-efficient combinations to consider. A hybrid approach usually works best: reserved instances handle baseline capacity reliably, while spot instances absorb fluctuating demand. This combination ensures cost efficiency while maintaining operational flexibility and resilience.
You may want to automatically balance between reserved and spot capacity. (Strictly speaking, reserved instances are a billing discount applied to matching on-demand instances, so the split you are managing is on-demand versus spot.) Within AWS Auto Scaling you can achieve this in a few ways:
- Separate ASGs for different pricing models: create one Auto Scaling Group (ASG) for on-demand instances covered by your RIs, and another for spot instances, with independently configured scaling policies and alarms.
- A single ASG with a mixed instances policy, or EC2 Fleet/Spot Fleet: all of these let you mix instance types and pricing models (e.g. spot and on-demand) in one configuration.
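As a concrete sketch, an ASG mixed instances policy expresses this split directly. The launch template name, instance types, and capacity numbers below are hypothetical; in practice you would pass this structure to boto3's Auto Scaling `create_auto_scaling_group` call:

```python
# On-demand base capacity (covered by your RIs) plus spot for everything above it.
mixed_instances_policy = {
    "LaunchTemplate": {
        "LaunchTemplateSpecification": {
            "LaunchTemplateName": "web-tier",  # hypothetical template
            "Version": "$Latest",
        },
        # Diversify across instance types to reduce spot interruption risk:
        "Overrides": [
            {"InstanceType": t} for t in ("m5.large", "m5a.large", "m6i.large")
        ],
    },
    "InstancesDistribution": {
        "OnDemandBaseCapacity": 4,                 # baseline kept on-demand (RI-covered)
        "OnDemandPercentageAboveBaseCapacity": 0,  # everything above baseline is spot
        "SpotAllocationStrategy": "capacity-optimized",
    },
}

print(mixed_instances_policy["InstancesDistribution"]["OnDemandBaseCapacity"])  # → 4
```

The design choice here is the `OnDemandBaseCapacity`: size it to your rightsized baseline from Section 1, and let spot absorb everything above it.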
***
Summary: Cloud Cost Optimization 101
Cloud cost optimization is an ongoing process that involves selecting appropriate resource types, and leveraging automation to balance performance and cost efficiency. Begin by discussing the following with your engineering team:
- Rightsize Your Resources: Are you continuously monitoring application demands? Are you eliminating overprovisioned or underutilized instances for baseline cloud-based workloads?
- Automate Provisioning: Do you use autoscaling (or Kubernetes HPA/Cluster Autoscaler) to dynamically match resource capacity with real-time usage?
- Leverage the Right Instance Types: Are there more price-efficient combinations of instance types and cloud resources to consider?
- Adopt a Hybrid Approach: Do you effectively balance On-Demand, Reserved, and Spot Instances to optimize for both reliability and cost?
Asking these questions to your engineering team should uncover the source and solution to your growing cloud spend. But achieving lasting cost optimization requires a company-wide cultural shift. Subscribe to our newsletter for Phase 3, where we explain how to build a company-wide FinOps mindset and embed cost-conscious practices into your organization.