Cluster Autoscaler automatically adjusts the number of nodes in your Amazon EKS cluster based on workload demand. When pods fail to schedule due to insufficient resources, the autoscaler adds nodes. When nodes sit underutilized, it removes them to reduce cost. This keeps your cluster right-sized without manual intervention.
This guide walks through setting up Kubernetes Cluster Autoscaler on an EKS cluster with managed node groups. We cover IAM policy creation, OIDC-based role assignment, Helm deployment, scaling tests, multi-node-group configuration, and Karpenter as a modern alternative.
Prerequisites
Before starting, make sure you have the following in place:
- A running EKS cluster (Kubernetes 1.29+) with at least one managed node group
- `kubectl` configured to access the cluster
- `aws` CLI v2 installed and configured with admin or IAM permissions
- `helm` v3 installed
- An OIDC identity provider associated with your EKS cluster
- `eksctl` installed (optional, simplifies OIDC and IAM role creation)
Confirm your cluster is accessible and note the cluster name – you will need it throughout this guide:
kubectl cluster-info
You should see the Kubernetes control plane endpoint for your EKS cluster:
Kubernetes control plane is running at https://ABCDEF1234567890.gr7.us-east-1.eks.amazonaws.com
CoreDNS is running at https://ABCDEF1234567890.gr7.us-east-1.eks.amazonaws.com/api/v1/namespaces/kube-system/services/kube-dns:dns/proxy
Check that you have an OIDC provider associated with your cluster. If the command below returns an OIDC issuer URL, you are set:
aws eks describe-cluster --name my-cluster --query "cluster.identity.oidc.issuer" --output text
If no OIDC provider exists yet, create one with eksctl:
eksctl utils associate-iam-oidc-provider --cluster my-cluster --approve
Step 1: Create IAM Policy for Cluster Autoscaler
The Cluster Autoscaler needs permissions to describe Auto Scaling groups, modify desired capacity, and terminate instances. Create a dedicated IAM policy with the minimum required permissions.
Save the following policy document to a file:
cat > /tmp/cluster-autoscaler-policy.json << 'EOF'
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Action": [
"autoscaling:DescribeAutoScalingGroups",
"autoscaling:DescribeAutoScalingInstances",
"autoscaling:DescribeLaunchConfigurations",
"autoscaling:DescribeScalingActivities",
"autoscaling:DescribeTags",
"autoscaling:SetDesiredCapacity",
"autoscaling:TerminateInstanceInAutoScalingGroup",
"ec2:DescribeLaunchTemplateVersions",
"ec2:DescribeInstanceTypes",
"ec2:DescribeImages",
"ec2:GetInstanceTypesFromInstanceRequirements",
"eks:DescribeNodegroup"
],
"Resource": "*"
}
]
}
EOF
Create the IAM policy using the AWS CLI:
aws iam create-policy \
--policy-name ClusterAutoscalerPolicy \
--policy-document file:///tmp/cluster-autoscaler-policy.json
The output includes the policy ARN. Save it - you need it in the next step:
{
"Policy": {
"PolicyName": "ClusterAutoscalerPolicy",
"PolicyId": "ANPAJ2UCCR6DPCEXAMPLE",
"Arn": "arn:aws:iam::123456789012:policy/ClusterAutoscalerPolicy",
"Path": "/",
"DefaultVersionId": "v1",
"AttachmentCount": 0,
"CreateDate": "2026-03-22T10:00:00Z"
}
}
Export the ARN as a variable for the remaining steps:
export POLICY_ARN="arn:aws:iam::123456789012:policy/ClusterAutoscalerPolicy"
Step 2: Create IAM Role with OIDC for the Autoscaler
EKS uses IAM Roles for Service Accounts (IRSA) to grant pods specific AWS permissions without sharing node-level credentials. This maps a Kubernetes service account to an IAM role through the cluster's OIDC provider.
The simplest approach uses eksctl to create the role and service account in one command:
eksctl create iamserviceaccount \
--cluster=my-cluster \
--namespace=kube-system \
--name=cluster-autoscaler \
--attach-policy-arn=$POLICY_ARN \
--override-existing-serviceaccounts \
--approve
This command creates a CloudFormation stack that provisions the IAM role, attaches the policy, and creates the Kubernetes service account with the proper annotation. Verify it completed successfully:
kubectl get serviceaccount cluster-autoscaler -n kube-system -o yaml
The output should show the eks.amazonaws.com/role-arn annotation pointing to the new IAM role:
apiVersion: v1
kind: ServiceAccount
metadata:
annotations:
eks.amazonaws.com/role-arn: arn:aws:iam::123456789012:role/eksctl-my-cluster-addon-iamserviceaccount-Role1-EXAMPLE
name: cluster-autoscaler
namespace: kube-system
Manual IAM role creation (without eksctl)
If you prefer not to use eksctl, create the trust policy and IAM role manually. First, get your OIDC provider ID:
OIDC_ID=$(aws eks describe-cluster --name my-cluster --query "cluster.identity.oidc.issuer" --output text | sed 's|https://||')
AWS_ACCOUNT_ID=$(aws sts get-caller-identity --query Account --output text)
Create the trust policy document:
cat > /tmp/trust-policy.json << EOF
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Principal": {
"Federated": "arn:aws:iam::${AWS_ACCOUNT_ID}:oidc-provider/${OIDC_ID}"
},
"Action": "sts:AssumeRoleWithWebIdentity",
"Condition": {
"StringEquals": {
"${OIDC_ID}:aud": "sts.amazonaws.com",
"${OIDC_ID}:sub": "system:serviceaccount:kube-system:cluster-autoscaler"
}
}
}
]
}
EOF
Create the IAM role and attach the policy:
aws iam create-role \
--role-name ClusterAutoscalerRole \
--assume-role-policy-document file:///tmp/trust-policy.json
aws iam attach-role-policy \
--role-name ClusterAutoscalerRole \
--policy-arn $POLICY_ARN
Then create the Kubernetes service account with the role annotation:
kubectl create serviceaccount cluster-autoscaler -n kube-system
kubectl annotate serviceaccount cluster-autoscaler \
-n kube-system \
eks.amazonaws.com/role-arn=arn:aws:iam::${AWS_ACCOUNT_ID}:role/ClusterAutoscalerRole
Step 3: Deploy Cluster Autoscaler with Helm
Helm is the recommended way to deploy and manage the Cluster Autoscaler on EKS. Add the autoscaler Helm repository and install the chart.
helm repo add autoscaler https://kubernetes.github.io/autoscaler
helm repo update
Install the Cluster Autoscaler chart. Replace my-cluster with your actual EKS cluster name and set the image tag to match your Kubernetes version:
helm install cluster-autoscaler autoscaler/cluster-autoscaler \
--namespace kube-system \
--set autoDiscovery.clusterName=my-cluster \
--set awsRegion=us-east-1 \
--set rbac.serviceAccount.create=false \
--set rbac.serviceAccount.name=cluster-autoscaler \
--set extraArgs.balance-similar-node-groups=true \
--set extraArgs.skip-nodes-with-system-pods=false
The autoDiscovery.clusterName flag tells the autoscaler to discover Auto Scaling groups tagged with k8s.io/cluster-autoscaler/my-cluster. EKS managed node groups add these tags automatically.
Verify the autoscaler pod is running:
kubectl get pods -n kube-system -l app.kubernetes.io/name=aws-cluster-autoscaler
You should see one pod in Running state:
NAME READY STATUS RESTARTS AGE
cluster-autoscaler-6b5d4c7f9d-x2kmp 1/1 Running 0 45s
Deploy with YAML manifest (alternative)
If you prefer not to use Helm, download the official manifest and customize it. This approach gives full control over the deployment spec:
curl -sLO https://raw.githubusercontent.com/kubernetes/autoscaler/master/cluster-autoscaler/cloudprovider/aws/examples/cluster-autoscaler-autodiscover.yaml
Edit the downloaded file to set your cluster name: find the --node-group-auto-discovery line and replace <YOUR CLUSTER NAME> with your actual cluster name. Also point the Deployment at the service account created in Step 2, and remove the ServiceAccount resource from the manifest since it already exists. Then apply:
kubectl apply -f cluster-autoscaler-autodiscover.yaml
Step 4: Configure Autoscaler Behavior
The default settings work for most clusters, but production environments benefit from tuning. Key parameters control how aggressively the autoscaler scales down and which nodes it protects.
Here are the most important flags and what they do:
| Flag | Default | Description |
|---|---|---|
| `--scale-down-delay-after-add` | 10m | Wait time after a scale-up before considering scale-down |
| `--scale-down-delay-after-delete` | 0s | Wait time after node removal before more scale-down |
| `--scale-down-unneeded-time` | 10m | How long a node must be unneeded before removal |
| `--scale-down-utilization-threshold` | 0.5 | Node utilization below this triggers scale-down consideration |
| `--skip-nodes-with-local-storage` | true | Prevent removing nodes with local PVs |
| `--skip-nodes-with-system-pods` | true | Prevent removing nodes running kube-system pods |
| `--max-graceful-termination-sec` | 600 | Max wait for pod graceful shutdown during scale-down |
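To make the utilization threshold concrete, here is a small sketch of the per-node comparison the autoscaler performs. The numbers are illustrative, not from a real cluster, and only CPU is shown; the autoscaler uses the higher of CPU and memory utilization:

```shell
# Illustrative numbers only: per node, sum pod CPU requests and divide
# by the node's allocatable CPU.
allocatable_cpu_m=4000   # node allocatable CPU in millicores (4 vCPU)
requested_cpu_m=1800     # total CPU requested by pods on the node
threshold="0.5"          # default --scale-down-utilization-threshold

utilization=$(awk -v r="$requested_cpu_m" -v a="$allocatable_cpu_m" \
  'BEGIN { printf "%.2f", r / a }')
candidate=$(awk -v u="$utilization" -v t="$threshold" \
  'BEGIN { print (u < t) ? "yes" : "no" }')
echo "utilization=${utilization} scale-down-candidate=${candidate}"
# prints: utilization=0.45 scale-down-candidate=yes
```

A node below the threshold is only a candidate; it must stay unneeded for the full scale-down-unneeded-time before the autoscaler removes it.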
To apply custom settings, upgrade the Helm release with additional flags:
helm upgrade cluster-autoscaler autoscaler/cluster-autoscaler \
--namespace kube-system \
--set autoDiscovery.clusterName=my-cluster \
--set awsRegion=us-east-1 \
--set rbac.serviceAccount.create=false \
--set rbac.serviceAccount.name=cluster-autoscaler \
--set extraArgs.balance-similar-node-groups=true \
--set extraArgs.skip-nodes-with-system-pods=false \
--set extraArgs.scale-down-delay-after-add=5m \
--set extraArgs.scale-down-unneeded-time=5m \
--set extraArgs.scale-down-utilization-threshold=0.65 \
--set extraArgs.max-graceful-termination-sec=300
For production clusters handling critical workloads, consider setting --scale-down-delay-after-add to 15m or higher to avoid thrashing during deployment rollouts.
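As the flag list grows, a values file is easier to maintain than repeated --set flags. Here is a sketch of an equivalent values.yaml; the keys mirror the chart options used in the commands above:

```yaml
# values.yaml - equivalent to the --set flags above
autoDiscovery:
  clusterName: my-cluster
awsRegion: us-east-1
rbac:
  serviceAccount:
    create: false
    name: cluster-autoscaler
extraArgs:
  balance-similar-node-groups: true
  skip-nodes-with-system-pods: false
  scale-down-delay-after-add: 5m
  scale-down-unneeded-time: 5m
  scale-down-utilization-threshold: 0.65
  max-graceful-termination-sec: 300
```

Apply it with `helm upgrade cluster-autoscaler autoscaler/cluster-autoscaler -n kube-system -f values.yaml`, which also makes the configuration easy to keep in version control.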
Step 5: Test Scaling Up
Deploy a workload that requests more resources than the current nodes can handle. This forces the autoscaler to add nodes. Create a deployment with high CPU requests:
kubectl create deployment scale-test \
--image=nginx \
--replicas=30 \
-- /bin/sh -c "while true; do sleep 3600; done"
kubectl set resources deployment scale-test \
--requests=cpu=500m,memory=256Mi
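A quick capacity check shows why this workload forces a scale-up. The numbers below are hypothetical (three m5.large nodes with roughly 1940m allocatable CPU each after system reservations); adjust them to your node group:

```shell
# Back-of-the-envelope scheduling math with assumed cluster numbers.
replicas=30
request_m=500            # per-pod CPU request from the deployment above
nodes=3                  # assumed current node count
allocatable_m=1940       # approximate allocatable CPU per m5.large node

demand_m=$((replicas * request_m))      # total CPU the pods request
capacity_m=$((nodes * allocatable_m))   # total CPU the cluster can offer
shortfall_m=$((demand_m - capacity_m))
echo "demand=${demand_m}m capacity=${capacity_m}m shortfall=${shortfall_m}m"
# prints: demand=15000m capacity=5820m shortfall=9180m
```

The shortfall is what leaves pods Pending and triggers the autoscaler's scale-up evaluation.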
Watch the pods - some should enter Pending state because the cluster lacks capacity:
kubectl get pods -l app=scale-test --watch
Within 30-60 seconds, the autoscaler detects unschedulable pods and requests new nodes from the Auto Scaling group. Monitor the autoscaler's decisions:
kubectl -n kube-system logs -l app.kubernetes.io/name=aws-cluster-autoscaler --tail=50
You should see log entries indicating scale-up activity:
I0322 10:05:32.123456 Scale-up: setting group eks-my-nodegroup size to 5
I0322 10:05:33.234567 Nodes added: 2
Check the node count to confirm new nodes joined the cluster:
kubectl get nodes
The new nodes appear in NotReady state initially, then transition to Ready within 1-2 minutes as kubelet starts and joins the cluster.
Step 6: Test Scaling Down
Remove the test workload and observe the autoscaler reclaiming unused nodes:
kubectl delete deployment scale-test
After the pods terminate, the freed nodes become underutilized. The autoscaler waits for the scale-down-unneeded-time duration (default 10 minutes) before removing them. Monitor the process:
kubectl -n kube-system logs -l app.kubernetes.io/name=aws-cluster-autoscaler -f | grep -i "scale.down"
You should see log lines showing nodes marked as unneeded, then eventually removed:
I0322 10:20:45.123456 ip-10-0-1-50.ec2.internal is unneeded since 2026-03-22 10:15:45
I0322 10:25:46.234567 Scale-down: removing node ip-10-0-1-50.ec2.internal
Verify the node count decreased:
kubectl get nodes
The cluster should return to its original size, bounded below by your managed node group's minimum.
Step 7: Configure Multiple Node Groups
Production EKS clusters often run multiple node groups - general workloads on cost-effective instances and GPU or memory-intensive workloads on specialized instances. The Cluster Autoscaler handles this through auto-discovery, which detects all node groups tagged for your cluster.
If you already used autoDiscovery.clusterName in Step 3, the autoscaler picks up all managed node groups automatically. Verify by checking the configmap:
kubectl -n kube-system get configmap cluster-autoscaler-status -o yaml
For fine-grained control over individual node groups, set min and max sizes through node group tags or the --nodes flag. Create a second node group with eksctl for compute-intensive workloads:
eksctl create nodegroup \
--cluster my-cluster \
--name compute-ng \
--node-type c5.2xlarge \
--nodes-min 0 \
--nodes-max 10 \
--node-labels workload=compute \
--asg-access
Setting nodes-min=0 allows the autoscaler to scale this group to zero when no compute workloads are running - a good way to control costs for bursty workloads. Note that when scaling up from zero, the autoscaler has no live node to inspect, so if pods select on custom labels or taints, the Auto Scaling group needs `k8s.io/cluster-autoscaler/node-template/label/...` (and `.../taint/...`) tags describing them.
The balance-similar-node-groups flag (set in Step 3) distributes pods evenly across node groups with the same instance type and labels. This improves availability across Availability Zones.
Use node affinity or taints to direct workloads to specific node groups:
kubectl taint nodes -l workload=compute dedicated=compute:NoSchedule
Pods targeting the compute node group need a matching toleration in their spec. This prevents general workloads from consuming specialized capacity. Keep in mind that taints applied with kubectl only affect existing nodes; for nodes the autoscaler adds later, set the taint in the node group configuration so new nodes join with it already applied.
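For example, a pod targeting the compute group would carry a matching toleration and, typically, a node selector. This is a sketch; the label and taint values match the eksctl and taint commands above:

```yaml
# Fragment of a pod template spec for compute-bound workloads
spec:
  nodeSelector:
    workload: compute
  tolerations:
    - key: dedicated
      operator: Equal
      value: compute
      effect: NoSchedule
```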
Step 8: Karpenter as a Modern Alternative
Karpenter is a Kubernetes node provisioner built by AWS that replaces the Cluster Autoscaler for EKS workloads. Instead of scaling existing Auto Scaling groups, Karpenter provisions right-sized EC2 instances directly based on pending pod requirements. This removes the need to pre-configure node groups for every instance type.
Key differences between Cluster Autoscaler and Karpenter:
| Feature | Cluster Autoscaler | Karpenter |
|---|---|---|
| Scaling method | Adjusts ASG desired count | Launches EC2 instances directly |
| Instance selection | Fixed per node group | Dynamic - picks best fit per pod |
| Scale-up speed | 30-60 seconds | Under 30 seconds typically |
| Node group management | Required | Not needed - uses NodePool CRDs |
| Spot instance handling | Per node group config | Built-in consolidation and diversification |
| Scale to zero | Supported | Native |
To install Karpenter on an existing EKS cluster, first set up the required IAM resources. Karpenter needs its own set of permissions to launch and manage EC2 instances. AWS provides a CloudFormation template to bootstrap these:
export KARPENTER_VERSION="1.3.0"
export CLUSTER_NAME="my-cluster"
export AWS_ACCOUNT_ID=$(aws sts get-caller-identity --query Account --output text)
curl -fsSL "https://raw.githubusercontent.com/aws/karpenter-provider-aws/v${KARPENTER_VERSION}/website/content/en/docs/getting-started/getting-started-with-karpenter/cloudformation.yaml" > /tmp/karpenter-cfn.yaml
aws cloudformation deploy \
--stack-name "Karpenter-${CLUSTER_NAME}" \
--template-file /tmp/karpenter-cfn.yaml \
--capabilities CAPABILITY_NAMED_IAM \
--parameter-overrides "ClusterName=${CLUSTER_NAME}"
Install Karpenter with Helm. Logging out of the public ECR registry first clears any stale cached credentials that can break anonymous pulls of the OCI chart:
helm registry logout public.ecr.aws
helm upgrade --install karpenter oci://public.ecr.aws/karpenter/karpenter \
--version "${KARPENTER_VERSION}" \
--namespace kube-system \
--set "settings.clusterName=${CLUSTER_NAME}" \
--set "settings.interruptionQueue=Karpenter-${CLUSTER_NAME}" \
--wait
Create a NodePool and EC2NodeClass to define what instances Karpenter can provision:
cat << 'EOF' | kubectl apply -f -
apiVersion: karpenter.sh/v1
kind: NodePool
metadata:
name: default
spec:
template:
spec:
requirements:
- key: kubernetes.io/arch
operator: In
values: ["amd64"]
- key: karpenter.sh/capacity-type
operator: In
values: ["on-demand", "spot"]
- key: node.kubernetes.io/instance-type
operator: In
values: ["m5.large", "m5.xlarge", "m5.2xlarge", "c5.large", "c5.xlarge"]
nodeClassRef:
group: karpenter.k8s.aws
kind: EC2NodeClass
name: default
limits:
cpu: "100"
memory: 400Gi
disruption:
consolidationPolicy: WhenEmptyOrUnderutilized
consolidateAfter: 1m
---
apiVersion: karpenter.k8s.aws/v1
kind: EC2NodeClass
metadata:
name: default
spec:
amiSelectorTerms:
- alias: al2023@latest
subnetSelectorTerms:
- tags:
karpenter.sh/discovery: my-cluster
securityGroupSelectorTerms:
- tags:
karpenter.sh/discovery: my-cluster
role: "KarpenterNodeRole-my-cluster"
EOF
If you switch to Karpenter, remove the Cluster Autoscaler deployment first to avoid conflicts - both tools should not run simultaneously.
Step 9: Monitor Cluster Autoscaler Logs and Metrics
The Cluster Autoscaler exposes Prometheus metrics and writes detailed logs that help troubleshoot scaling decisions. If you have Kubernetes Metrics Server installed, you can correlate resource utilization with autoscaler activity.
View the autoscaler logs to track scaling events and errors:
kubectl -n kube-system logs -l app.kubernetes.io/name=aws-cluster-autoscaler --tail=100
Common log messages and what they mean:
- Pod not schedulable - a pod cannot be placed on existing nodes, triggering scale-up evaluation
- Scale-up: setting group size - the autoscaler increased the desired count of an Auto Scaling group
- Node is unneeded - a node's utilization dropped below the threshold and the countdown to removal started
- Scale-down: removing node - the autoscaler terminated an underutilized node
- Pod cannot be moved - scale-down blocked because a pod has PDB or local storage preventing eviction
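That last case is often intentional: a PodDisruptionBudget caps how many replicas can be evicted at once, which also governs what the autoscaler may drain during scale-down. A minimal example for a hypothetical deployment labeled app=web:

```yaml
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: web-pdb
spec:
  minAvailable: 2
  selector:
    matchLabels:
      app: web
```

With this in place, scale-down proceeds only as long as at least two web pods stay available.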
The autoscaler also writes its status to a ConfigMap you can inspect:
kubectl -n kube-system get configmap cluster-autoscaler-status -o yaml
This ConfigMap shows the current health of the autoscaler, last scale-up and scale-down events, and any errors. Check it when scaling is not working as expected.
For Prometheus-based monitoring, the autoscaler exposes metrics on port 8085 by default. Key metrics to track:
- `cluster_autoscaler_scaled_up_nodes_total` - total nodes added
- `cluster_autoscaler_scaled_down_nodes_total` - total nodes removed
- `cluster_autoscaler_unschedulable_pods_count` - current pending pods
- `cluster_autoscaler_function_duration_seconds` - autoscaler loop timing
Create a ServiceMonitor to scrape these metrics if you run the Prometheus stack in your cluster:
cat << 'EOF' | kubectl apply -f -
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
name: cluster-autoscaler
namespace: kube-system
spec:
selector:
matchLabels:
app.kubernetes.io/name: aws-cluster-autoscaler
endpoints:
- port: http
interval: 30s
EOF
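Once Prometheus scrapes these metrics, queries like the following sketches surface recent autoscaler activity, for example on a Grafana panel:

```promql
# Nodes added over the last hour
sum(increase(cluster_autoscaler_scaled_up_nodes_total[1h]))

# Pods currently failing to schedule
cluster_autoscaler_unschedulable_pods_count
```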
Conclusion
You now have the Cluster Autoscaler running on your EKS cluster, automatically adjusting node count based on workload demand. The IAM policy and OIDC-based role give the autoscaler secure, scoped access to manage Auto Scaling groups without broad node-level permissions.
For production clusters, tune the scale-down parameters to match your workload patterns, set up Pod Disruption Budgets to protect critical services during scale-down, and monitor autoscaler metrics through Prometheus and Grafana dashboards. If your workloads have diverse instance type requirements or you need faster scaling, evaluate Karpenter as a replacement.