Team CloudKeeper is a collective of certified cloud experts with a passion for empowering businesses to thrive in the cloud.
In recent years, GCP has made strong efforts to compete with AWS, steadily rising in popularity. Efforts such as offering generous credits to attract new users are working in its favour: its share of the cloud computing market stands at roughly 12% in 2025. It is an appealing choice for organizations already invested in the Google ecosystem.
However, GCP still holds the third spot in terms of cloud market share. AWS benefits from a mature ecosystem that covers almost every use case—ranging from performance optimization and cloud cost optimization to partners who guide customers through discount programs and more.
As GCP adoption continues, so will the challenges common to all cloud infrastructures—bill shocks, runaway costs, and unoptimized cloud spend—yet the resources for addressing these issues in GCP remain limited.
And that’s precisely why, in this edition of Ask the Cloud Expert, we’ll explore the lesser-discussed aspects of GCP cost optimization, drawing from real-world experiences and hands-on expertise shared by our expert.
Puneet Malhotra is a Senior Manager at CloudKeeper, responsible for all of CloudKeeper’s GCP clients as well as CloudKeeper’s partnership with GCP. A seasoned expert with extensive GCP experience across geographies, having delivered value to multiple clients, he’s the best person to walk you through the nitty-gritty of extracting the full potential from your GCP investment.
So, let’s get started!
Google Cloud Platform’s (GCP) adoption, while rising, still trails AWS and Azure for a combination of reasons. The situation is fairly complicated.
a) Fear of Pulling Out of GCP: Google has a history of shutting down several major projects — such as Google Reader, Google Hangouts, Google App Engine Standard for Python 2, Google Stadia, and Firebase Dynamic Links. This created uncertainty for enterprises. Companies like Snap and Evernote had to re-architect parts of their stack when Google deprecated certain APIs or products.
This fear was further exacerbated in January 2024 when GCP announced the removal of cloud egress fees. While seen as a positive move, it also triggered apprehension that Google might again shift strategies abruptly.
b) Lack of SES Equivalent and Missing Latency-Based DNS: AWS offers Simple Email Service (SES), a highly scalable email sending and receiving service used for transactional emails, notifications, and marketing campaigns.
GCP lacks a native equivalent, leaving developers to rely on third-party integrations for this critical workload. Similarly, GCP does not yet offer a latency-based DNS routing service.
c) Late to the Game: By the time GCP seriously picked up pace against competitors, most large enterprises had already locked in their workloads with AWS or Azure. This “first-mover advantage” gap meant that Google had to fight harder to win enterprise trust.
d) Perception of Enterprise Readiness: For a long time, GCP was perceived as more developer- and startup-focused, especially strong in data analytics (BigQuery, ML, Kubernetes). Enterprises, however, preferred AWS (and later Azure) because of the breadth of enterprise-ready services, mature compliance and certification coverage, and extensive enterprise sales and support.
However, GCP has worked on these concerns and taken many steps to boost customer confidence, such as:
a) Renewed Focus on GCP: Google has significantly increased its investment and focus on Google Cloud Platform.
b) AI & ML Innovation:
Introduction of Tensor Processing Units (TPUs) — custom hardware designed specifically for AI and ML workloads.
Aggressively competing with AWS in artificial intelligence.
Vertex AI platform enables end-to-end ML model training and deployment at a lower cost than AWS SageMaker, while delivering comparable performance.
c) Customer-Centric Investments:
By 2025, most of these earlier concerns have been largely addressed, and GCP is now widely adopted both as a primary cloud infrastructure and in hybrid settings alongside AWS and Azure.
The most common causes aren't usually one big mistake, but a combination of unchecked automation, weak cloud infrastructure governance, and simple waste piling up. These are the ones I’ve most often seen our customers trying to rectify:
a) Unmonitored Autoscaling: This is the #1 runaway trigger. Autoscaling is amazing until a misconfigured policy or a code bug creates an infinite loop, spinning up thousands of preemptible VMs or instances that go unnoticed until the bill arrives. Follow the recommended autoscaling best practices to fully realise its potential.
b) Orphaned and Idle Resources: Unattached persistent disks, unused static IP addresses, stale snapshots, and idle load balancers keep accruing charges long after the workloads they served are gone.
c) Lack of Commitment Discounts: Without Committed Use Discounts (CUDs) or Sustained Use Discounts, you're leaving 30-70% in savings on the table.
d) Over-Provisioning ("Right-Sizing" Failure): Developers often overestimate needs, provisioning 8-CPU machines for workloads that barely use 2. You pay for the entire oversized instance, 24/7.
e) Poor Resource Hierarchy & Tagging: If you can't tell which project, team, or product is generating a cost, you can't hold anyone accountable. A flat structure with no labels or tags makes cost allocation and shutdown protocols impossible.
f) Spike in Data Processing or Egress: A sudden, large data analytics job (e.g., a misconfigured BigQuery query) or a spike in data transferred out of Google Cloud (egress) can add thousands of dollars in a single day.
The root cause of most of these is a lack of cloud cost visibility into the GCP infrastructure and governance. Without budget alerts, cost dashboards per team, or policies to delete temporary resources, these small oversights can add thousands of dollars to your cloud spend.
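To make the visibility point concrete, here is a minimal Python sketch of per-team cost allocation from billing data. The row shape is a simplified stand-in, not the exact BigQuery billing export schema, and the `team` label key is an assumed convention; the takeaway is that any untagged spend surfaces immediately as "unattributed" instead of disappearing into the total.

```python
from collections import defaultdict

def allocate_costs(rows, label_key="team", fallback="unattributed"):
    """Sum cost per value of `label_key`; rows missing it land in `fallback`."""
    totals = defaultdict(float)
    for row in rows:
        owner = row.get("labels", {}).get(label_key, fallback)
        totals[owner] += row["cost"]
    return dict(totals)

# Simplified rows; a real pipeline would read these from the billing export.
rows = [
    {"service": "Compute Engine", "cost": 120.0, "labels": {"team": "search"}},
    {"service": "BigQuery",       "cost": 300.0, "labels": {"team": "data"}},
    {"service": "Cloud Storage",  "cost": 45.0,  "labels": {}},  # untagged!
]
print(allocate_costs(rows))
```

Once every dollar maps to an owner, budget alerts per team become meaningful rather than just one alert on the whole bill.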
AI and automation are instrumental if you’re looking to optimize your GCP setup. Given the sheer number of services and instances in a typical infrastructure, a cloud engineer can't watch the GCP dashboard around the clock.
Primarily, there are three use cases where AI-enabled automation tools have taken over:
CloudKeeper Lens is one of the few tools in the market that empowers users with the ability to comprehensively monitor their GCP infrastructure. And CloudKeeper AZ is a well-rounded solution for reducing GCP spend.
We’ve perfected these offerings after continuous feedback from customers, and they continue to deliver tangible value.
The best approach to ongoing optimization is a two-layered strategy:
By combining architectural best practices with intelligent automation, organizations can move from one-time cost-cutting exercises to a culture of sustained cost efficiency, achieving long-term savings while maintaining performance.
Yes, Google offers several discount plans driven by key factors such as commitment duration and spend on particular services and instance types:
Sustained Use Discounts and Committed Use Discounts are the two main discount programs GCP offers. However, before you enter into these pricing models, there are a few considerations to keep in mind:
a) Committed Use Discounts: In exchange for making either a spend-based or a resource-based commitment, you get up to 55% off most resources such as machine types or GPUs, and up to 70% off memory-optimized machine types. Those are the two variants of Google’s CUDs: spend-based CUDs and resource-based CUDs.
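To see the CUD trade-off in numbers, here is a simplified Python sketch, not Google's exact billing mechanics: committed hours bill at the discounted rate whether you use them or not, and usage beyond the commitment falls back to on-demand pricing. The hourly rate is an arbitrary illustration.

```python
def cud_effective_cost(on_demand_hourly, discount, committed_hours, used_hours):
    """Committed hours bill at the discounted rate whether used or not;
    usage beyond the commitment falls back to on-demand pricing."""
    committed = committed_hours * on_demand_hourly * (1 - discount)
    overflow = max(0, used_hours - committed_hours) * on_demand_hourly
    return committed + overflow

# At a 55% discount you come out ahead once you actually use more than
# ~45% of the committed capacity; below that, on-demand would be cheaper.
on_demand_month = 730 * 0.05                              # 36.50
cud_month = cud_effective_cost(0.05, 0.55, 730, 730)      # 16.425
```

This is exactly why gauging your steady-state workload before committing matters: an unused commitment still bills in full.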
Drawbacks of Committed Use Discounts
b) Sustained Use Discounts: Sustained Use Discounts are applied automatically to your billing account at the end of the billing cycle—the more a particular resource is used during a month, the higher the discount. The maximum discount is up to 30% depending on the machine type.
It is essential to gauge your workload first, and only then opt into these discount plans.
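Sustained Use Discounts blend tiered rates across the month. The tier schedule below follows the classic N1 general-purpose pattern (each successive quarter of the month bills at 80%, 60%, then 40% of the base rate); schedules differ by machine family, so treat this as an illustrative assumption rather than a pricing reference.

```python
# Assumed N1-style tiers: (fraction of the month, fraction of base rate billed).
TIERS = [(0.25, 1.00), (0.25, 0.80), (0.25, 0.60), (0.25, 0.40)]

def sud_multiplier(usage_fraction):
    """Blended billing multiplier for a VM running `usage_fraction` of the month."""
    remaining = min(max(usage_fraction, 0.0), 1.0)
    billed = 0.0
    for width, rate in TIERS:
        step = min(remaining, width)
        billed += step * rate
        remaining -= step
        if remaining <= 0:
            break
    return billed / usage_fraction if usage_fraction else 1.0

# Running the full month blends to a 30% discount; half a month earns only 10%.
```

The blending explains the "maximum 30%" figure: only the last quarter of the month bills at the deepest 60%-off tier, so the month-long average works out to 30% off.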
c) Discounted Pricing Agreements for Startups and Enterprises: Similar to AWS PPAs, Google also enters into private contracts with enterprises with considerable cloud spend. For startups, Google offers credits if there's strong growth potential. However, these agreements are generally negotiated behind closed doors, and the terms—including discount percentages—are not publicly disclosed.
These are the top discount programs and pricing models through which customers can save on their cloud spend, rather than paying several times more for purely on-demand provisioning.
For an organization just starting, I’d recommend a mix of immediate quick wins and a foundational strategy:
Quick Wins in the First 48 Hours:
However, these are just quick wins. For ongoing optimization, you need to make corrections at a foundational level, specifically by sorting out the architecture of your GCP infrastructure.
And that foundation can be established by aligning your GCP setup with Google’s Cloud Architecture Framework, Google Cloud’s counterpart to a well-architected framework. The Cloud Architecture Framework provides a comprehensive set of recommendations along with actionable steps to implement them, with the end goal of optimizing performance, security, reliability, and cost efficiency in Google Cloud.
Reorganizing your Google Cloud is essential for long-term cost reduction. Since it forms the foundation, all other cost optimization gains remain temporary without it. The Architecture Framework review document helps cloud architects, developers, and admins design robust architectures and simplify the administration of Google Cloud resources.
A lower GCP spend is what every organization strives for. However, it’s essential not to overstep and cut costs on resources you actually need. Think of cost optimization as maximizing ROI per dollar spent on cloud, rather than simply “cutting down cloud spend.”
Here are some tried-and-tested strategies to follow for managing your GCP bill:
1. Use Spot VMs for non-critical workloads
Compared to the on-demand pricing of GCP instances, you can save up to 91% with Spot VMs, depending on configurations and region. The caveat, however, is that Google can reclaim these resources with just 30 seconds’ notice. To mitigate this, make sure you save snapshots so tasks can be resumed when the instance becomes available again.
Additionally, creating instance groups of Spot VMs increases your chances of securing the required machines for your workload.
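A minimal checkpointing sketch in Python shows the resume pattern that makes Spot VMs safe for batch work. The checkpoint path and per-item granularity are illustrative choices (in practice you would write to a persistent disk or Cloud Storage, and checkpoint at whatever granularity your job can afford); the point is that a preempted worker restarts from the last saved index instead of from zero.

```python
import json
import os

CHECKPOINT = "/tmp/job_checkpoint.json"  # illustrative; use durable storage in practice

def load_checkpoint():
    """Return the index of the next unprocessed item (0 on a fresh start)."""
    if os.path.exists(CHECKPOINT):
        with open(CHECKPOINT) as f:
            return json.load(f)["next_index"]
    return 0

def save_checkpoint(next_index):
    with open(CHECKPOINT, "w") as f:
        json.dump({"next_index": next_index}, f)

def run(items, process):
    """Process items sequentially, checkpointing after each completed item
    so a preempted run can resume where it left off."""
    start = load_checkpoint()
    for i in range(start, len(items)):
        process(items[i])
        save_checkpoint(i + 1)
```

With this in place, the 30-second preemption notice only ever costs you the single in-flight item.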
2. Rightsize your VMs
Rightsizing is one of the most critical steps in optimizing cloud spend, but it requires careful planning. Rushing this process can result in excessive downsizing, leading to performance bottlenecks and application crashes.
Key considerations when rightsizing VMs include analyzing peak versus average utilization over a representative period, leaving headroom for traffic spikes, and validating changes in a staging environment before applying them to production.
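A back-of-the-envelope version of that sizing logic can be sketched in Python. The machine sizes and the 30% headroom figure are assumptions for illustration, not GCP's recommender algorithm: the idea is to size for observed peak demand plus a safety margin, rather than for whatever was originally provisioned.

```python
MACHINE_SIZES = [2, 4, 8, 16, 32]  # vCPU counts in a hypothetical family

def recommend_vcpus(peak_cpu_util, current_vcpus, headroom=0.3):
    """Smallest size whose capacity covers observed peak demand plus headroom.

    peak_cpu_util: peak utilization of the *current* machine, 0.0-1.0."""
    needed = peak_cpu_util * current_vcpus * (1 + headroom)
    for size in MACHINE_SIZES:
        if size >= needed:
            return size
    return MACHINE_SIZES[-1]

# An 8-vCPU VM peaking at 15% CPU really needs ~1.6 vCPUs with 30% headroom:
print(recommend_vcpus(0.15, 8))
```

Note the deliberate headroom term: this is what prevents the "excessive downsizing" failure mode described above.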
3. Enforce a standardized tagging policy
Establish a clear, organization-wide tagging policy for all GCP resources (e.g., by project, environment, or department). Standardized tagging improves accountability and visibility, making it easier to track resource usage and enforce proactive cost optimization strategies.
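Such a policy only works if it is enforced, so a periodic compliance sweep helps. Below is a hedged Python sketch of a label auditor; the required label keys are an example policy, not a GCP requirement, and the resource dicts stand in for whatever inventory source you use.

```python
REQUIRED_LABELS = {"team", "env", "cost-center"}  # example org policy, adjust to taste

def label_violations(resources):
    """Return {resource_name: sorted missing label keys} for non-compliant resources."""
    report = {}
    for res in resources:
        missing = REQUIRED_LABELS - set(res.get("labels", {}))
        if missing:
            report[res["name"]] = sorted(missing)
    return report

resources = [
    {"name": "vm-web-1",   "labels": {"team": "web", "env": "prod", "cost-center": "cc-42"}},
    {"name": "vm-batch-1", "labels": {"team": "data"}},  # missing env and cost-center
]
print(label_violations(resources))
```

Run something like this on a schedule and route the report to resource owners; untagged resources that nobody claims are strong candidates for shutdown.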
4. Leverage GCP recommendations
Like other cloud providers, GCP regularly publishes recommendations, best practices, and updates on new instance types or services. Staying on top of these ensures you adopt the latest cost-saving measures and configurations.
5. Use Autoscaling: Rather than running instances at maximum capacity all the time, use autoscaling to scale with actual traffic. Combined with GCP's load balancers, this maintains performance while cutting costs, two key pillars of GCP optimization.
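The savings from autoscaling come from not paying for peak capacity around the clock. This toy Python comparison makes that concrete; the demand numbers and flat per-VM price are invented for illustration, and real autoscalers add scale-up lag and minimum-replica floors this model only hints at.

```python
def fleet_cost(hourly_demand, price_per_vm_hour, min_replicas=1):
    """Compare a static fleet sized for peak demand with an autoscaled fleet
    that tracks demand hour by hour (with a floor of `min_replicas`)."""
    static = max(hourly_demand) * price_per_vm_hour * len(hourly_demand)
    scaled = sum(max(d, min_replicas) for d in hourly_demand) * price_per_vm_hour
    return static, scaled

# A bursty day: quiet hours plus a short peak. Static pays for the peak all day.
demand = [2, 2, 2, 2, 10, 10, 4, 4]  # VMs needed per hour (illustrative)
static, scaled = fleet_cost(demand, price_per_vm_hour=1.0)
```

The spikier the traffic, the larger the gap between the two numbers, which is why autoscaling pays off most for bursty workloads.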
By implementing these strategies, you can achieve sustainable cost reductions on your GCP infrastructure while ensuring performance and reliability remain intact.
The biggest mistakes stem from a reactive, overly aggressive mindset that alienates engineers and misses the bigger picture.
Not Tagging Resources from Day One: Launching resources without labels or tags makes it impossible to answer the question, "Who owns this cost?" This single failure cripples accountability.
The core mistake is prioritizing cost-cutting over cost intelligence. The goal isn't just to reduce the bill; it's to understand why the bill is what it is and spend smarter.
You can't manage what you don't measure. These metrics move you from guessing to knowing.
Track these in Google Cloud's Billing Reports and set up Budget Alerts on them to catch anomalies before they become catastrophes.
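On top of fixed budget thresholds, a simple statistical check on the daily spend series catches anomalies that stay under the monthly budget. The sketch below flags days that deviate sharply from a trailing window; the window length and sigma threshold are assumed tuning values, not anything GCP prescribes.

```python
from statistics import mean, stdev

def spend_anomalies(daily_costs, window=7, threshold=3.0):
    """Return indices of days whose cost deviates more than `threshold`
    standard deviations from the trailing `window`-day baseline."""
    flagged = []
    for i in range(window, len(daily_costs)):
        base = daily_costs[i - window:i]
        mu, sigma = mean(base), stdev(base)
        if sigma and abs(daily_costs[i] - mu) / sigma > threshold:
            flagged.append(i)
    return flagged
```

Fed from a daily billing export, a check like this would flag a misconfigured BigQuery job or egress spike on day one rather than at month end.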
Optimizing a GCP infrastructure can be more challenging than doing so with other cloud providers such as AWS, mainly due to the smaller pool of optimization resources and tooling, fewer third-party optimization service providers, and lower overall awareness of the GCP ecosystem.
However, the fundamentals remain the same: right-sizing, visibility, and the use of discount plans. While these fundamentals are consistent across the cloud ecosystem, it is essential to have nuanced, hands-on knowledge of GCP and its ecosystem to get the most out of any optimization effort.
CloudKeeper is your end-to-end solutions partner for maximizing your ROI in GCP. We bring multiple competencies and partner programs, along with 60+ highly skilled and certified practitioners and consultants who will guide you through every step of your Google Cloud journey.
These are the Google Cloud services CloudKeeper offers:
Get in touch with our GCP experts today!
Speak with our advisors to learn how you can take control of your Cloud Cost