Cloud Engineer
Manish is an AWS-focused expert known for optimizing infrastructure performance, controlling costs, and designing secure and reliable cloud solutions.
If you’ve ever scaled an Amazon EKS cluster and suddenly found pods stuck in “Pending” with cryptic error messages, there’s a good chance you’ve run into subnet IP exhaustion. It’s one of those issues that doesn’t show up in small dev clusters, but once your workloads grow, it sneaks up fast and brings everything to a halt.
Let’s break down what’s really happening, why it matters, and how you can keep your clusters from running out of IPs.
Here’s the story. You’re running an Amazon EKS cluster, things are humming along, then a deploy happens, and pods just won’t start.
You check the events:
At first glance, it looks like a Kubernetes scheduling issue. In reality, it’s not Kubernetes failing. It’s the underlying VPC subnets that have run out of usable IPs.
By default, Amazon EKS uses the Amazon VPC CNI plugin. That means every pod gets its own IP address from the VPC subnet its node runs in.
This is great for native VPC networking; pods can talk to anything in the VPC without NAT or overlays. The downside? You’re burning through subnet IPs at the same pace you’re spinning up pods.
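To make that pace concrete, here is a rough back-of-the-envelope sketch in Python. The function name `estimate_ip_demand` and the cluster numbers are illustrative assumptions, not part of any AWS tooling. It counts one IP per node plus one per pod, which is a lower bound, since the CNI also keeps spare IPs warm on each node.

```python
# AWS reserves 5 addresses in every subnet, so a /24 offers 256 - 5 usable IPs.
USABLE_IPS_IN_SLASH_24 = 256 - 5


def estimate_ip_demand(node_count: int, pods_per_node: int) -> int:
    """Lower-bound estimate: one primary IP per node plus one IP per pod."""
    return node_count * (1 + pods_per_node)


# Hypothetical cluster: 10 nodes running 30 pods each.
demand = estimate_ip_demand(node_count=10, pods_per_node=30)
print(f"Need at least {demand} IPs; a /24 offers {USABLE_IPS_IN_SLASH_24}")
print("Subnet exhausted" if demand > USABLE_IPS_IN_SLASH_24 else "Fits")
```

Even this modest example overshoots a single /24, before counting load balancers or other resources that also draw from the same subnet.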
Here’s the kicker:
If you’re already stuck, here are the emergency levers:
Expand your subnets
a) Add larger subnets (/19 or /20 instead of /24).
b) Or attach secondary CIDR blocks to the VPC and create new subnets.
c) This buys you breathing room, but it’s really just a band-aid.
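For a sense of how much room each size gives you, the short sketch below compares usable addresses per prefix length using Python's standard `ipaddress` module. The `10.0.0.0` ranges and the helper name `usable_ips` are placeholders chosen for illustration; the subtraction of 5 reflects the addresses AWS reserves in every subnet.

```python
import ipaddress

AWS_RESERVED_PER_SUBNET = 5  # first four addresses and the last one


def usable_ips(cidr: str) -> int:
    """Addresses in the CIDR block minus the ones AWS reserves."""
    return ipaddress.ip_network(cidr).num_addresses - AWS_RESERVED_PER_SUBNET


# Placeholder ranges; substitute your own VPC CIDRs.
for cidr in ("10.0.0.0/24", "10.0.0.0/20", "10.0.0.0/19"):
    print(f"{cidr}: {usable_ips(cidr)} usable IPs")
```

Moving from a /24 to a /20 is roughly a sixteenfold increase in capacity, which is why undersized subnets tend to be the first thing to revisit.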
Spread pods across multiple subnets
a) Make sure your node groups are using more than one subnet per AZ.
b) This balances IP usage and avoids concentrating pressure on a single small subnet.
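One way to check that spread is to group a node group's subnets by Availability Zone and flag any zone that relies on a single subnet. The sketch below is an assumption-laden illustration: the helper names and the subnet IDs are made up, and the input is expected to be a list of dicts shaped like the `Subnets` entries returned by the EC2 `DescribeSubnets` API (only `SubnetId` and `AvailabilityZone` are used).

```python
from collections import defaultdict


def subnets_per_az(subnets):
    """Group subnet IDs by Availability Zone."""
    grouped = defaultdict(list)
    for subnet in subnets:
        grouped[subnet["AvailabilityZone"]].append(subnet["SubnetId"])
    return dict(grouped)


def azs_with_single_subnet(subnets):
    """Return the AZs where all pod IP pressure lands on one subnet."""
    return sorted(az for az, ids in subnets_per_az(subnets).items() if len(ids) < 2)


# Hypothetical node group subnets.
node_group_subnets = [
    {"SubnetId": "subnet-aaa", "AvailabilityZone": "us-east-1a"},
    {"SubnetId": "subnet-bbb", "AvailabilityZone": "us-east-1a"},
    {"SubnetId": "subnet-ccc", "AvailabilityZone": "us-east-1b"},
]
print(azs_with_single_subnet(node_group_subnets))
```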
If you don’t want to keep playing whack-a-mole with subnet sizes, here are better approaches:
Subnet exhaustion is sneaky, but it leaves clues:
Proactive monitoring beats getting paged when half your pods are stuck in Pending.
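A minimal sketch of such a check might look like the following. It assumes you have already fetched subnet records shaped like the `Subnets` entries from the EC2 `DescribeSubnets` API (for example via boto3's `describe_subnets`), which include `CidrBlock` and `AvailableIpAddressCount`. The function names, the 20% threshold, and the sample data are illustrative choices, not recommendations from AWS.

```python
import ipaddress

AWS_RESERVED_PER_SUBNET = 5  # addresses AWS holds back in every subnet


def free_fraction(subnet) -> float:
    """Share of a subnet's usable addresses that are still available."""
    total = ipaddress.ip_network(subnet["CidrBlock"]).num_addresses
    usable = total - AWS_RESERVED_PER_SUBNET
    return subnet["AvailableIpAddressCount"] / usable


def low_ip_subnets(subnets, threshold=0.2):
    """Return IDs of subnets whose free share is below the threshold."""
    return [s["SubnetId"] for s in subnets if free_fraction(s) < threshold]


# Hypothetical records; real ones would come from DescribeSubnets.
sample = [
    {"SubnetId": "subnet-aaa", "CidrBlock": "10.0.1.0/24", "AvailableIpAddressCount": 20},
    {"SubnetId": "subnet-bbb", "CidrBlock": "10.0.16.0/20", "AvailableIpAddressCount": 3000},
]
print(low_ip_subnets(sample))
```

Run on a schedule and wired to whatever alerting you already use, a check like this surfaces a shrinking subnet well before pods start piling up in Pending.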
Subnet IP exhaustion in Amazon EKS isn’t a bug; it’s just how the VPC CNI works. The problem is that most of us don’t think about subnet sizing until it breaks production.
The good news: once you understand how pods consume IPs, you can design around it. Start with larger CIDRs, enable prefix delegation, and keep monitoring subnet usage. That way, your scaling story is about smooth growth, not a surprise bottleneck.