Director of Engineering - Kubernetes
Experience: 12-15 years | Location: Noida

The Opportunity

We are seeking a Platform Specialist (Director level) to serve as the organization's top technical authority on Kubernetes and the most senior hands-on engineer for CK-Tuner-Kubernetes, our Kubernetes Cost Intelligence platform. This is a deep individual contributor role, at least 60% hands-on engineering, in which you will architect, implement, and technically lead CK-Tuner-Kubernetes as its principal engineer. You will set the technical direction, write production code, and drive architectural decisions. We are not looking for a people manager; we are looking for the strongest Kubernetes systems engineer we can find.

What You'll Own

CK-Tuner-Kubernetes — Kubernetes Cost Intelligence Platform

  • Architect and implement the cost allocation engine — cluster, namespace, deployment, pod, and container granularity across EKS, AKS, and GKE

  • Design and build the real-time data collection pipeline: agent architecture, ClickHouse time-series storage, gRPC streaming between agent and datastore

  • Implement Karpenter integration for node lifecycle management and bin-packing optimization

  • Build custom Kubernetes controllers and operators for cost policy enforcement and automated remediation

  • Design shared cost distribution algorithms — system namespaces, control plane costs, networking overhead, idle capacity attribution

  • Integrate CK-Tuner-Kubernetes with CK-Lens for a unified cloud + container cost view
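To give a flavor of the shared cost distribution problem listed above, here is a minimal Go sketch. It assumes one simple policy, spreading a shared cost pool across namespaces in proportion to each namespace's directly attributed cost; the type `NamespaceCost` and the function `DistributeShared` are hypothetical names for illustration and do not describe the platform's actual algorithm.

```go
package main

import (
	"fmt"
	"sort"
)

// NamespaceCost is a hypothetical record of the cost directly
// attributable to one namespace in a billing window.
type NamespaceCost struct {
	Name   string
	Direct float64 // cost derived from the namespace's own pods
}

// DistributeShared adds a share of sharedCost (for example control plane,
// system namespaces, or idle capacity) to every namespace, weighted by its
// direct cost. If no namespace has any direct cost, the pool is split evenly.
// Proportional weighting is an assumption made for this sketch.
func DistributeShared(namespaces []NamespaceCost, sharedCost float64) map[string]float64 {
	total := 0.0
	for _, ns := range namespaces {
		total += ns.Direct
	}
	out := make(map[string]float64, len(namespaces))
	for _, ns := range namespaces {
		share := 0.0
		switch {
		case total > 0:
			share = sharedCost * ns.Direct / total
		case len(namespaces) > 0:
			share = sharedCost / float64(len(namespaces))
		}
		out[ns.Name] = ns.Direct + share
	}
	return out
}

func main() {
	costs := DistributeShared([]NamespaceCost{
		{Name: "checkout", Direct: 10},
		{Name: "search", Direct: 30},
	}, 20)

	// Print in a stable order, since Go map iteration order is random.
	names := make([]string, 0, len(costs))
	for name := range costs {
		names = append(names, name)
	}
	sort.Strings(names)
	for _, name := range names {
		fmt.Printf("%s: %.2f\n", name, costs[name])
	}
}
```

Other weightings (by requested CPU, by usage, or a fixed even split) are equally plausible; choosing and defending one is part of the design work in this role.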

Container Optimization Engine

  • Design and implement container right-sizing algorithms for CPU and memory requests/limits based on real usage patterns

  • Build node pool optimization logic — instance type selection, scaling policies, bin-packing efficiency scoring

  • Implement Karpenter-based spot and preemptible node policies for fault-tolerant workloads

  • Build the automated right-sizing execution pipeline via CK-Tuner integration
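As an illustration of usage-based right-sizing, the following Go sketch recommends a CPU request from observed usage samples. It assumes a nearest-rank percentile plus a fixed headroom percentage; both the policy and the names `Percentile` and `RecommendRequest` are hypothetical and are not taken from the product.

```go
package main

import (
	"fmt"
	"sort"
)

// Percentile returns the nearest-rank p-th percentile (p in 1..100) of
// samples. It reports false when there are no samples or p is out of range.
func Percentile(samples []int64, p int) (int64, bool) {
	if len(samples) == 0 || p < 1 || p > 100 {
		return 0, false
	}
	// Sort a copy so the caller's slice is left untouched.
	sorted := append([]int64(nil), samples...)
	sort.Slice(sorted, func(i, j int) bool { return sorted[i] < sorted[j] })
	rank := (p*len(sorted) + 99) / 100 // ceil(p/100 * n) in integer math
	return sorted[rank-1], true
}

// RecommendRequest suggests a resource request (for example CPU in
// millicores) as the p-th percentile of usage plus headroomPct percent.
func RecommendRequest(samples []int64, p, headroomPct int) (int64, bool) {
	base, ok := Percentile(samples, p)
	if !ok {
		return 0, false
	}
	return base * int64(100+headroomPct) / 100, true
}

func main() {
	usageMillicores := []int64{120, 150, 110, 400, 130}
	if rec, ok := RecommendRequest(usageMillicores, 95, 15); ok {
		fmt.Printf("recommended CPU request: %dm\n", rec)
	}
}
```

A production engine would also have to account for burst behavior, memory OOM risk, and how long a usage window is representative, which this sketch ignores.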

GPU Container Cost Intelligence

  • Build GPU utilization tracking and idle GPU detection for AI/ML workloads running on Kubernetes

  • Deliver multi-cluster GPU cost comparison across EKS, AKS, and GKE

  • Integrate with the FinOps for AI initiative for GPU pod-level cost attribution

Responsibilities

Technical Leadership

  • Serve as CK-Tuner-Kubernetes's principal architect and most senior hands-on engineer

  • Set architectural standards and code quality bars; mentor engineers through technical pairing and design reviews

  • Drive technical roadmap and architecture decisions in partnership with Product Management

Hands-On Engineering

  • Write production Go code for CK-Tuner-Kubernetes's core systems: agent data collection, metrics processing, cost allocation engine

  • Design and implement custom Kubernetes controllers and operators

  • Build and optimize the ClickHouse time-series data model for cost metrics at scale

  • Implement gRPC streaming with backpressure, circuit breakers, and mTLS between agent and datastore

  • Develop Karpenter-based node optimization policies and consolidation algorithms

  • Performance-tune the metrics pipeline: 10-second scrape intervals, 1-minute rollups, multi-cluster aggregation

Technical Strategy

  • Design the agent data collection layer — hybrid metrics collection via Metrics API, Kubelet Summary, Kubelet Proxy, and optional Prometheus endpoints

  • Architect the ClickHouse time-series schema with materialized views for multi-resolution aggregation (5m, 1h, 1d)

  • Build the delta processing pipeline — in-memory state comparison with ring buffers (discovery 10K, metrics 50K, events 100K)

  • Design cost allocation algorithms for shared resources — control plane, networking, system namespaces, idle capacity

  • Architect multi-cloud Kubernetes support (EKS first; AKS and GKE in Phase 4) with provider-specific pricing API integrations

  • Build integration points with CK-Lens, CK-Tuner, and CK-Intelligence
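The delta processing item above pairs in-memory state comparison with bounded ring buffers. A minimal Go sketch of that idea follows, assuming a generic overwrite-oldest buffer and a map-based comparison; `Ring`, `NewRing`, and `Delta` are hypothetical names, and the capacity used is far smaller than the 10K to 100K sizes mentioned.

```go
package main

import (
	"fmt"
	"sort"
)

// Ring is a fixed-capacity buffer that overwrites its oldest element
// once full, so memory use stays bounded under bursty input.
type Ring[T any] struct {
	buf   []T
	start int // index of the oldest element
	size  int // number of stored elements
}

// NewRing allocates a ring with the given capacity.
func NewRing[T any](capacity int) *Ring[T] {
	return &Ring[T]{buf: make([]T, capacity)}
}

// Push appends v, dropping the oldest element when the ring is full.
func (r *Ring[T]) Push(v T) {
	if len(r.buf) == 0 {
		return
	}
	if r.size < len(r.buf) {
		r.buf[(r.start+r.size)%len(r.buf)] = v
		r.size++
		return
	}
	r.buf[r.start] = v
	r.start = (r.start + 1) % len(r.buf)
}

// Snapshot returns the stored elements from oldest to newest.
func (r *Ring[T]) Snapshot() []T {
	out := make([]T, 0, r.size)
	for i := 0; i < r.size; i++ {
		out = append(out, r.buf[(r.start+i)%len(r.buf)])
	}
	return out
}

// Delta returns the sorted keys that are new or changed in curr relative
// to prev. Removed keys are not reported in this simplified sketch.
func Delta(prev, curr map[string]int64) []string {
	var changed []string
	for key, value := range curr {
		if old, seen := prev[key]; !seen || old != value {
			changed = append(changed, key)
		}
	}
	sort.Strings(changed)
	return changed
}

func main() {
	events := NewRing[string](3)
	prev := map[string]int64{"pod-a": 100, "pod-b": 200}
	curr := map[string]int64{"pod-a": 100, "pod-b": 250, "pod-c": 50}
	for _, key := range Delta(prev, curr) {
		events.Push(key) // only changed entries are queued for shipping
	}
	fmt.Println(events.Snapshot())
}
```

Shipping only the changed keys keeps agent traffic proportional to churn rather than cluster size, at the cost of holding the previous state in memory.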

Technical Landscape You'll Navigate

Kubernetes & Container Orchestration

  • Platforms: EKS (Fargate, managed node groups), AKS, GKE (Autopilot, Standard), on-prem Kubernetes

  • Ecosystem: OpenCost, Karpenter, Helm, Kubernetes Operators, K8s API Server

  • Resource Management: Requests/limits, node autoscaling, pod scheduling, bin-packing, spot/preemptible nodes

  • Kubernetes Internals: Custom controllers, operators, CRDs, admission webhooks, scheduler plugins, informers, leader election, reconciliation loops

Data Engineering

  • ClickHouse (time-series analytics), Apache Pulsar/NATS JetStream (message broker), gRPC bidirectional streaming with backpressure

Cloud Providers

  • AWS: EKS, Fargate, EC2 (GPU instances), S3, CloudWatch, Cost & Usage Reports

  • Azure: AKS, Azure Monitor, Azure Billing APIs

  • GCP: GKE, GKE Autopilot, BigQuery Billing Export



Requirements

Experience

  • 10+ years in systems/platform/infrastructure engineering with deep hands-on Kubernetes production experience (EKS, AKS, or GKE)

  • Track record of personally designing and implementing complex distributed systems — not just overseeing teams that build them

  • Experience building Kubernetes tooling: operators, controllers, CLI tools, or platform products

  • Prior work on cost/resource optimization, observability, or infrastructure intelligence platforms preferred

  • Experience with container orchestration at scale — multi-cluster, multi-cloud preferred

Technical Depth

  • Expert-level: Kubernetes internals (scheduler, controller-manager, kubelet, API server), resource management, pod lifecycle

  • Hands-on: Custom controller/operator development using controller-runtime or client-go

  • Production experience with Karpenter, OpenCost, or equivalent node/cost optimization tools
