Amazon Elastic Kubernetes Service (EKS) is AWS’s managed Kubernetes platform that handles the control plane, etcd storage, and API server availability for you. Terraform (and its open-source fork OpenTofu) makes it possible to define your entire EKS infrastructure as code – VPC, subnets, IAM roles, node groups, and add-ons – in declarative configuration files that can be version-controlled and repeated across environments.
This guide walks through deploying a production-ready Amazon EKS cluster using Terraform on AWS. We will set up a custom VPC with public and private subnets, deploy EKS with managed node groups, configure IAM roles and OIDC, install essential add-ons (CoreDNS, kube-proxy, VPC CNI), set up the Cluster Autoscaler, deploy a test application, and finally tear everything down cleanly.
Prerequisites
Before starting, ensure you have the following in place:
- An AWS account with billing enabled
- An IAM user or role with permissions for EKS, EC2, VPC, IAM, CloudWatch Logs, and KMS (AdministratorAccess works for testing, but scope down for production)
- AWS CLI v2 installed and configured with credentials
- Terraform v1.6+ or OpenTofu v1.6+ installed on your workstation
- kubectl v1.28+ installed
- git for version control of your Terraform files
- A Linux or macOS workstation (Windows with WSL2 works too)
Verify your AWS CLI is configured correctly by running:
$ aws sts get-caller-identity
You should see output showing your AWS account ID, user ARN, and user ID. If you get an error, run aws configure and provide your Access Key ID, Secret Access Key, and preferred region.
Confirm Terraform is installed:
$ terraform version
Terraform v1.9.x
on linux_amd64
If you prefer OpenTofu, replace terraform with tofu in all commands throughout this guide. The configuration syntax is identical.
Step 1: Create the Project Structure
Start by creating a directory for your EKS Terraform project. We will organize the configuration into separate files for readability.
$ mkdir eks-terraform && cd eks-terraform
$ touch main.tf variables.tf outputs.tf providers.tf terraform.tfvars
This gives us a clean structure where providers.tf handles provider configuration, variables.tf defines input variables, main.tf holds the core resources, outputs.tf defines what gets printed after apply, and terraform.tfvars sets variable values.
Step 2: Configure the Terraform AWS Provider
Open the providers file and add the required provider configuration.
$ vim providers.tf
Add the following content:
terraform {
required_version = ">= 1.6.0"
required_providers {
aws = {
source = "hashicorp/aws"
version = "~> 5.80"
}
kubernetes = {
source = "hashicorp/kubernetes"
version = "~> 2.35"
}
tls = {
source = "hashicorp/tls"
version = "~> 4.0"
}
}
}
provider "aws" {
region = var.aws_region
default_tags {
tags = {
Environment = var.environment
ManagedBy = "terraform"
Project = "eks-cluster"
}
}
}
provider "kubernetes" {
host = module.eks.cluster_endpoint
cluster_ca_certificate = base64decode(module.eks.cluster_certificate_authority_data)
exec {
api_version = "client.authentication.k8s.io/v1beta1"
command = "aws"
args = ["eks", "get-token", "--cluster-name", module.eks.cluster_name]
}
}
The AWS provider constraint ~> 5.80 stays on the 5.x line, allowing any release from 5.80 up to (but not including) 6.0. The Kubernetes provider is configured to authenticate through the AWS CLI, which is the recommended approach for EKS.
Step 3: Define Input Variables
Open the variables file and define all configurable parameters for the deployment.
$ vim variables.tf
Add these variable definitions:
variable "aws_region" {
description = "AWS region for the EKS cluster"
type = string
default = "us-east-1"
}
variable "environment" {
description = "Environment name (dev, staging, production)"
type = string
default = "dev"
}
variable "cluster_name" {
description = "Name of the EKS cluster"
type = string
default = "my-eks-cluster"
}
variable "cluster_version" {
description = "Kubernetes version for the EKS cluster"
type = string
default = "1.31"
}
variable "vpc_cidr" {
description = "CIDR block for the VPC"
type = string
default = "10.0.0.0/16"
}
variable "node_instance_types" {
description = "EC2 instance types for the managed node group"
type = list(string)
default = ["t3.medium"]
}
variable "node_desired_size" {
description = "Desired number of worker nodes"
type = number
default = 2
}
variable "node_min_size" {
description = "Minimum number of worker nodes"
type = number
default = 1
}
variable "node_max_size" {
description = "Maximum number of worker nodes"
type = number
default = 5
}
variable "node_disk_size" {
description = "Disk size in GB for worker nodes"
type = number
default = 50
}
Now set the actual values in the tfvars file:
$ vim terraform.tfvars
Add your preferred settings:
aws_region = "us-east-1"
environment = "dev"
cluster_name = "my-eks-cluster"
cluster_version = "1.31"
vpc_cidr = "10.0.0.0/16"
node_instance_types = ["t3.medium"]
node_desired_size = 2
node_min_size = 1
node_max_size = 5
node_disk_size = 50
Adjust the region, instance type, and node count to match your workload requirements. For production, consider using m5.large or larger instances with at least 3 nodes across multiple availability zones.
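For instance, a production-oriented terraform.tfvars (illustrative values only – size these to your actual workload) could look like:

```hcl
# Example production overrides (illustrative values)
aws_region          = "us-east-1"
environment         = "production"
cluster_name        = "prod-eks-cluster"
cluster_version     = "1.31"
vpc_cidr            = "10.0.0.0/16"
node_instance_types = ["m5.large"]
node_desired_size   = 3
node_min_size       = 3
node_max_size       = 10
node_disk_size      = 100
```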
Step 4: Deploy the VPC with the Terraform AWS VPC Module
EKS requires a VPC with specific subnet tagging for load balancer integration and pod networking. The official terraform-aws-modules/vpc module handles this cleanly. Open the main configuration file.
$ vim main.tf
Start with the data source to get available AZs, then add the VPC module:
# Fetch availability zones in the selected region
data "aws_availability_zones" "available" {
filter {
name = "opt-in-status"
values = ["opt-in-not-required"]
}
}
locals {
azs = slice(data.aws_availability_zones.available.names, 0, 3)
}
# ------------------------------------------------------------------
# VPC - Public and Private subnets across 3 AZs
# ------------------------------------------------------------------
module "vpc" {
source = "terraform-aws-modules/vpc/aws"
version = "~> 5.16"
name = "${var.cluster_name}-vpc"
cidr = var.vpc_cidr
azs = local.azs
private_subnets = [for k, v in local.azs : cidrsubnet(var.vpc_cidr, 4, k)]
public_subnets = [for k, v in local.azs : cidrsubnet(var.vpc_cidr, 8, k + 48)]
intra_subnets = [for k, v in local.azs : cidrsubnet(var.vpc_cidr, 8, k + 52)]
enable_nat_gateway = true
single_nat_gateway = true # Set to false for HA in production
enable_dns_hostnames = true
enable_dns_support = true
# Tags required for EKS subnet auto-discovery
public_subnet_tags = {
"kubernetes.io/role/elb" = 1
"kubernetes.io/cluster/${var.cluster_name}" = "owned"
}
private_subnet_tags = {
"kubernetes.io/role/internal-elb" = 1
"kubernetes.io/cluster/${var.cluster_name}" = "owned"
}
tags = {
Environment = var.environment
}
}
This creates a VPC with three private subnets (for worker nodes), three public subnets (for load balancers and NAT gateway), and three intra subnets (for the EKS control plane ENIs). The subnet tags are required so that the AWS Load Balancer Controller can automatically discover which subnets to place load balancers in. The single_nat_gateway = true setting saves cost in development – flip it to false for production to get one NAT gateway per AZ.
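To make the cidrsubnet() arithmetic concrete, here is a small Python sketch (using the standard ipaddress module, not part of the Terraform run) that reproduces the CIDRs the three expressions above yield for a 10.0.0.0/16 VPC:

```python
import ipaddress

vpc = ipaddress.ip_network("10.0.0.0/16")

# cidrsubnet(var.vpc_cidr, 4, k): /16 + 4 new bits = /20 networks, index k
twenties = list(vpc.subnets(prefixlen_diff=4))
private = [str(twenties[k]) for k in range(3)]

# cidrsubnet(var.vpc_cidr, 8, k + 48): /16 + 8 new bits = /24 networks
twentyfours = list(vpc.subnets(prefixlen_diff=8))
public = [str(twentyfours[k + 48]) for k in range(3)]
intra = [str(twentyfours[k + 52]) for k in range(3)]

print(private)  # ['10.0.0.0/20', '10.0.16.0/20', '10.0.32.0/20']
print(public)   # ['10.0.48.0/24', '10.0.49.0/24', '10.0.50.0/24']
print(intra)    # ['10.0.52.0/24', '10.0.53.0/24', '10.0.54.0/24']
```

The private subnets get large /20 blocks because they host the worker nodes and pods, while the public and intra subnets only need small /24 blocks for load balancers and control plane ENIs.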
Step 5: Deploy the EKS Cluster with Terraform EKS Module
Add the EKS module configuration below the VPC block in main.tf. This module handles the cluster creation, IAM roles, OIDC provider, managed node groups, and add-ons in one block.
# ------------------------------------------------------------------
# EKS Cluster
# ------------------------------------------------------------------
module "eks" {
source = "terraform-aws-modules/eks/aws"
version = "~> 20.31"
cluster_name = var.cluster_name
cluster_version = var.cluster_version
# Networking
vpc_id = module.vpc.vpc_id
subnet_ids = module.vpc.private_subnets
control_plane_subnet_ids = module.vpc.intra_subnets
cluster_endpoint_public_access = true
# Cluster access configuration
enable_cluster_creator_admin_permissions = true
# EKS Add-ons
cluster_addons = {
coredns = {
most_recent = true
configuration_values = jsonencode({
computeType = "ec2"
})
}
kube-proxy = {
most_recent = true
}
vpc-cni = {
most_recent = true
before_compute = true
configuration_values = jsonencode({
env = {
ENABLE_PREFIX_DELEGATION = "true"
WARM_PREFIX_TARGET = "1"
}
})
}
eks-pod-identity-agent = {
most_recent = true
}
}
# Managed Node Group
eks_managed_node_groups = {
default = {
name = "${var.cluster_name}-ng"
instance_types = var.node_instance_types
capacity_type = "ON_DEMAND"
min_size = var.node_min_size
max_size = var.node_max_size
desired_size = var.node_desired_size
disk_size = var.node_disk_size
# Note: the module creates a custom launch template by default, in which
# case disk_size is ignored unless use_custom_launch_template = false
# Use Amazon Linux 2023 AMI
ami_type = "AL2023_x86_64_STANDARD"
labels = {
Environment = var.environment
NodeGroup = "default"
}
tags = {
"k8s.io/cluster-autoscaler/enabled" = "true"
"k8s.io/cluster-autoscaler/${var.cluster_name}" = "owned"
}
}
}
tags = {
Environment = var.environment
}
}
Key decisions in this configuration:
- EKS Add-ons managed by Terraform – CoreDNS, kube-proxy, and VPC CNI are installed as EKS managed add-ons rather than self-managed. This means AWS handles version compatibility and updates.
- VPC CNI prefix delegation – The ENABLE_PREFIX_DELEGATION setting assigns /28 prefixes instead of individual IPs to ENIs, significantly increasing the number of pods each node can run.
- EKS Pod Identity Agent – This is the newer replacement for IRSA (IAM Roles for Service Accounts) and simplifies how pods get AWS IAM permissions.
- Amazon Linux 2023 – The AL2023_x86_64_STANDARD AMI type uses the latest Amazon Linux 2023 optimized for EKS.
- Cluster Autoscaler tags – The node group is tagged so the Cluster Autoscaler can discover and manage it.
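As a rough illustration of why prefix delegation matters, the sketch below estimates pod capacity for a t3.medium (3 ENIs with 6 IPv4 addresses each, per AWS instance networking limits – verify the figures for your instance type). Without prefix delegation, each secondary IP hosts one pod; with it, each secondary slot carries a /28 prefix (16 addresses), and EKS recommends capping max pods at 110 on smaller instances:

```python
def max_pods(enis: int, ips_per_eni: int, prefix_delegation: bool, cap: int = 110) -> int:
    # Each ENI's primary IP is reserved for the node; +2 accounts for
    # host-networking pods that do not consume a VPC IP
    slots = enis * (ips_per_eni - 1)
    if prefix_delegation:
        # Each secondary slot holds a /28 prefix = 16 pod IPs,
        # capped at the recommended per-node pod limit
        return min(slots * 16 + 2, cap)
    return slots + 2

# t3.medium: 3 ENIs x 6 IPv4 addresses per ENI
print(max_pods(3, 6, prefix_delegation=False))  # 17
print(max_pods(3, 6, prefix_delegation=True))   # 110 (capped)
```

This is a back-of-the-envelope model of the published max-pods formula, but it shows the jump from 17 to the 110-pod cap on the same instance.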
Step 6: Define Outputs
Add outputs so you can easily retrieve cluster connection details after deployment.
$ vim outputs.tf
Add the following output definitions:
output "cluster_name" {
description = "EKS cluster name"
value = module.eks.cluster_name
}
output "cluster_endpoint" {
description = "EKS cluster API endpoint"
value = module.eks.cluster_endpoint
}
output "cluster_version" {
description = "EKS cluster Kubernetes version"
value = module.eks.cluster_version
}
output "cluster_arn" {
description = "EKS cluster ARN"
value = module.eks.cluster_arn
}
output "cluster_certificate_authority_data" {
description = "Base64 encoded certificate data for the cluster"
value = module.eks.cluster_certificate_authority_data
sensitive = true
}
output "oidc_provider_arn" {
description = "ARN of the OIDC provider for IRSA"
value = module.eks.oidc_provider_arn
}
output "node_security_group_id" {
description = "Security group ID attached to the EKS nodes"
value = module.eks.node_security_group_id
}
output "configure_kubectl" {
description = "Command to configure kubectl"
value = "aws eks update-kubeconfig --region ${var.aws_region} --name ${module.eks.cluster_name}"
}
Step 7: Initialize and Deploy the EKS Cluster
With all configuration files in place, initialize Terraform to download the required providers and modules.
$ terraform init
You should see output confirming the providers and modules were downloaded:
Initializing the backend...
Initializing modules...
Downloading registry.terraform.io/terraform-aws-modules/eks/aws 20.31.6 for eks...
- eks in .terraform/modules/eks
Downloading registry.terraform.io/terraform-aws-modules/vpc/aws 5.16.0 for vpc...
- vpc in .terraform/modules/vpc
Initializing provider plugins...
- Installing hashicorp/aws v5.80.0...
- Installing hashicorp/kubernetes v2.35.1...
- Installing hashicorp/tls v4.0.6...
Terraform has been successfully initialized!
Run a plan to review what Terraform will create:
$ terraform plan -out=eks.tfplan
The plan will show around 50-70 resources to be created, including the VPC, subnets, route tables, NAT gateway, EKS cluster, node group, IAM roles, and security groups. Review the plan output carefully, paying attention to the instance types, node counts, and subnet CIDRs.
Apply the plan to create all resources:
$ terraform apply eks.tfplan
The deployment takes 12-20 minutes. The EKS cluster control plane creation alone takes about 10 minutes. Once complete, you will see the outputs showing your cluster endpoint, name, and the kubectl configuration command.
Apply complete! Resources: 62 added, 0 changed, 0 destroyed.
Outputs:
cluster_arn = "arn:aws:eks:us-east-1:123456789012:cluster/my-eks-cluster"
cluster_endpoint = "https://ABCDEF1234567890.gr7.us-east-1.eks.amazonaws.com"
cluster_name = "my-eks-cluster"
cluster_version = "1.31"
configure_kubectl = "aws eks update-kubeconfig --region us-east-1 --name my-eks-cluster"
node_security_group_id = "sg-0abc123def456789"
oidc_provider_arn = "arn:aws:iam::123456789012:oidc-provider/oidc.eks.us-east-1.amazonaws.com/id/ABCDEF1234567890"
Step 8: Configure kubectl and Verify the Cluster
Update your local kubeconfig to connect to the new EKS cluster. Use the command from the Terraform output:
$ aws eks update-kubeconfig --region us-east-1 --name my-eks-cluster
Expected output:
Added new context arn:aws:eks:us-east-1:123456789012:cluster/my-eks-cluster to /home/user/.kube/config
Verify you can reach the cluster API and that nodes are ready:
$ kubectl get nodes
You should see your worker nodes in Ready state:
NAME STATUS ROLES AGE VERSION
ip-10-0-1-45.ec2.internal Ready <none> 3m v1.31.2-eks-7f9249a
ip-10-0-2-78.ec2.internal Ready <none> 3m v1.31.2-eks-7f9249a
Check that all EKS add-ons are running in the kube-system namespace:
$ kubectl get pods -n kube-system
You should see pods for CoreDNS, kube-proxy, VPC CNI (aws-node), and EKS Pod Identity Agent all in Running state:
NAME READY STATUS RESTARTS AGE
aws-node-abcde 2/2 Running 0 5m
aws-node-fghij 2/2 Running 0 5m
coredns-5678abcde-k1l2m 1/1 Running 0 8m
coredns-5678abcde-n3o4p 1/1 Running 0 8m
eks-pod-identity-agent-x1 1/1 Running 0 5m
eks-pod-identity-agent-y2 1/1 Running 0 5m
kube-proxy-q5r6s 1/1 Running 0 5m
kube-proxy-t7u8v 1/1 Running 0 5m
Verify the cluster info:
$ kubectl cluster-info
This confirms that the Kubernetes control plane and CoreDNS are running and reachable.
Step 9: Set Up the Cluster Autoscaler
The Cluster Autoscaler automatically adjusts the number of nodes in your cluster based on pod scheduling demand. It scales up when pods are pending due to insufficient resources and scales down when nodes are underused. We already tagged the node group with the required autoscaler tags in Step 5.
First, create an IAM policy for the Cluster Autoscaler. Add this to your main.tf file:
# ------------------------------------------------------------------
# Cluster Autoscaler IAM
# ------------------------------------------------------------------
module "cluster_autoscaler_irsa" {
source = "terraform-aws-modules/iam/aws//modules/iam-role-for-service-accounts-eks"
version = "~> 5.48"
role_name = "${var.cluster_name}-cluster-autoscaler"
attach_cluster_autoscaler_policy = true
cluster_autoscaler_cluster_names = [module.eks.cluster_name]
oidc_providers = {
main = {
provider_arn = module.eks.oidc_provider_arn
namespace_service_accounts = ["kube-system:cluster-autoscaler"]
}
}
tags = {
Environment = var.environment
}
}
Add an output for the autoscaler role ARN in outputs.tf:
output "cluster_autoscaler_role_arn" {
description = "IAM role ARN for the Cluster Autoscaler"
value = module.cluster_autoscaler_irsa.iam_role_arn
}
Apply the changes to create the IAM role:
$ terraform apply -auto-approve
Now deploy the Cluster Autoscaler using kubectl. Create a manifest file:
$ vim cluster-autoscaler.yaml
Add the following Kubernetes manifest (replace AUTOSCALER_ROLE_ARN with the ARN from Terraform output and MY_CLUSTER_NAME with your cluster name):
apiVersion: v1
kind: ServiceAccount
metadata:
name: cluster-autoscaler
namespace: kube-system
annotations:
eks.amazonaws.com/role-arn: AUTOSCALER_ROLE_ARN
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
name: cluster-autoscaler
rules:
- apiGroups: [""]
resources: ["events", "endpoints"]
verbs: ["create", "patch"]
- apiGroups: [""]
resources: ["pods/eviction"]
verbs: ["create"]
- apiGroups: [""]
resources: ["pods/status"]
verbs: ["update"]
- apiGroups: [""]
resources: ["endpoints"]
resourceNames: ["cluster-autoscaler"]
verbs: ["get", "update"]
- apiGroups: [""]
resources: ["nodes"]
verbs: ["watch", "list", "get", "update"]
- apiGroups: [""]
resources: ["namespaces", "pods", "services", "replicationcontrollers", "persistentvolumeclaims", "persistentvolumes"]
verbs: ["watch", "list", "get"]
- apiGroups: ["extensions"]
resources: ["replicasets", "daemonsets"]
verbs: ["watch", "list", "get"]
- apiGroups: ["policy"]
resources: ["poddisruptionbudgets"]
verbs: ["watch", "list"]
- apiGroups: ["apps"]
resources: ["statefulsets", "replicasets", "daemonsets"]
verbs: ["watch", "list", "get"]
- apiGroups: ["storage.k8s.io"]
resources: ["storageclasses", "csinodes", "csidrivers", "csistoragecapacities"]
verbs: ["watch", "list", "get"]
- apiGroups: ["batch", "extensions"]
resources: ["jobs"]
verbs: ["get", "list", "watch", "patch"]
- apiGroups: ["coordination.k8s.io"]
resources: ["leases"]
verbs: ["create"]
- apiGroups: ["coordination.k8s.io"]
resourceNames: ["cluster-autoscaler"]
resources: ["leases"]
verbs: ["get", "update"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
name: cluster-autoscaler
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: ClusterRole
name: cluster-autoscaler
subjects:
- kind: ServiceAccount
name: cluster-autoscaler
namespace: kube-system
---
apiVersion: apps/v1
kind: Deployment
metadata:
name: cluster-autoscaler
namespace: kube-system
labels:
app: cluster-autoscaler
spec:
replicas: 1
selector:
matchLabels:
app: cluster-autoscaler
template:
metadata:
labels:
app: cluster-autoscaler
spec:
serviceAccountName: cluster-autoscaler
priorityClassName: system-cluster-critical
securityContext:
runAsNonRoot: true
runAsUser: 65534
fsGroup: 65534
seccompProfile:
type: RuntimeDefault
containers:
- name: cluster-autoscaler
image: registry.k8s.io/autoscaling/cluster-autoscaler:v1.31.0
command:
- ./cluster-autoscaler
- --v=4
- --stderrthreshold=info
- --cloud-provider=aws
- --skip-nodes-with-local-storage=false
- --expander=least-waste
- --node-group-auto-discovery=asg:tag=k8s.io/cluster-autoscaler/enabled,k8s.io/cluster-autoscaler/MY_CLUSTER_NAME
- --balance-similar-node-groups
- --skip-nodes-with-system-pods=false
resources:
limits:
cpu: 100m
memory: 600Mi
requests:
cpu: 100m
memory: 600Mi
Apply the autoscaler manifest:
$ kubectl apply -f cluster-autoscaler.yaml
Verify the autoscaler pod is running:
$ kubectl get pods -n kube-system -l app=cluster-autoscaler
NAME READY STATUS RESTARTS AGE
cluster-autoscaler-6b4f5c8d9f-xk2mn 1/1 Running 0 30s
Check the logs to confirm it discovered your Auto Scaling Group:
$ kubectl logs -n kube-system -l app=cluster-autoscaler --tail=20
You should see log lines showing the autoscaler found your node group ASG and is monitoring it for scaling events.
Step 10: Deploy a Test Application
Deploy a sample nginx application to verify the cluster is fully operational with networking, DNS resolution, and load balancing.
$ vim test-app.yaml
Add the following deployment and service definition:
apiVersion: apps/v1
kind: Deployment
metadata:
name: nginx-test
namespace: default
labels:
app: nginx-test
spec:
replicas: 3
selector:
matchLabels:
app: nginx-test
template:
metadata:
labels:
app: nginx-test
spec:
containers:
- name: nginx
image: nginx:stable-alpine
ports:
- containerPort: 80
resources:
requests:
cpu: 50m
memory: 64Mi
limits:
cpu: 100m
memory: 128Mi
---
apiVersion: v1
kind: Service
metadata:
name: nginx-test
namespace: default
spec:
type: LoadBalancer
selector:
app: nginx-test
ports:
- protocol: TCP
port: 80
targetPort: 80
Apply the test application:
$ kubectl apply -f test-app.yaml
Wait for the pods to start and the load balancer to provision:
$ kubectl get pods -l app=nginx-test
NAME READY STATUS RESTARTS AGE
nginx-test-7d5b8f6c9f-abc12 1/1 Running 0 45s
nginx-test-7d5b8f6c9f-def34 1/1 Running 0 45s
nginx-test-7d5b8f6c9f-ghi56 1/1 Running 0 45s
Check the service to get the load balancer URL:
$ kubectl get svc nginx-test
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
nginx-test LoadBalancer 172.20.45.123 a1b2c3d4e5f6-1234567890.us-east-1.elb.amazonaws.com 80:31234/TCP 2m
The EXTERNAL-IP column shows the AWS Classic Load Balancer DNS name. It can take 2-3 minutes for the load balancer to become active. Test it with curl:
$ curl -s http://a1b2c3d4e5f6-1234567890.us-east-1.elb.amazonaws.com | head -5
You should see the default nginx welcome page HTML, confirming that networking, DNS, and load balancing are all working end to end.
Clean up the test application once verified:
$ kubectl delete -f test-app.yaml
Understanding the IAM Roles Created by the EKS Module
The EKS Terraform module creates several IAM roles automatically. It helps to understand what each one does:
- Cluster IAM Role – Allows the EKS service to manage AWS resources on your behalf. It has the AmazonEKSClusterPolicy attached.
- Node Group IAM Role – Assigned to EC2 instances in the managed node group. It has AmazonEKSWorkerNodePolicy, AmazonEKS_CNI_Policy, and AmazonEC2ContainerRegistryReadOnly attached.
- OIDC Provider – An OpenID Connect identity provider that enables Kubernetes service accounts to assume IAM roles (IRSA). This is how the Cluster Autoscaler gets its AWS permissions without using access keys.
You can view the created roles in the AWS IAM console or CLI:
$ aws iam list-roles --query "Roles[?contains(RoleName, 'my-eks-cluster')].[RoleName,Arn]" --output table
Working with EKS Add-ons
The three core EKS add-ons we configured in Step 5 are essential for cluster operation:
CoreDNS provides DNS resolution inside the cluster. Every pod uses CoreDNS to resolve service names like my-service.default.svc.cluster.local. The EKS managed add-on keeps CoreDNS updated and compatible with your cluster version.
kube-proxy maintains network rules on each node that enable Service-based networking. It handles routing traffic from a Service’s ClusterIP to the backing pods.
VPC CNI (aws-node) is the networking plugin that assigns real VPC IP addresses to pods. With prefix delegation enabled, each node can support many more pods because it assigns /28 CIDR blocks instead of individual IPs.
Check the installed add-on versions at any time:
$ aws eks describe-addon --cluster-name my-eks-cluster --addon-name vpc-cni --query "addon.addonVersion" --output text
$ aws eks describe-addon --cluster-name my-eks-cluster --addon-name coredns --query "addon.addonVersion" --output text
$ aws eks describe-addon --cluster-name my-eks-cluster --addon-name kube-proxy --query "addon.addonVersion" --output text
To update add-ons, simply change the Terraform configuration (or keep most_recent = true) and run terraform apply.
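If you prefer pinning over most_recent = true, set an explicit addon_version in the cluster_addons map (the version string below is an illustrative example – list valid versions with aws eks describe-addon-versions --addon-name vpc-cni):

```hcl
vpc-cni = {
  # Example only: pin a specific version instead of most_recent = true
  addon_version               = "v1.19.0-eksbuild.1"
  resolve_conflicts_on_update = "OVERWRITE"
}
```

Pinned versions make upgrades an explicit, reviewable change in version control rather than an automatic side effect of the next apply.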
Adding a Spot Instance Node Group (Optional)
For cost savings on non-critical workloads, you can add a Spot instance node group alongside the On-Demand group. Add this block inside the eks_managed_node_groups map in main.tf:
spot = {
name = "${var.cluster_name}-spot-ng"
instance_types = ["t3.medium", "t3.large", "t3a.medium", "t3a.large"]
capacity_type = "SPOT"
min_size = 0
max_size = 10
desired_size = 2
disk_size = 50
ami_type = "AL2023_x86_64_STANDARD"
labels = {
Environment = var.environment
NodeGroup = "spot"
CapacityType = "spot"
}
taints = [{
key = "spot"
value = "true"
effect = "NO_SCHEDULE"
}]
tags = {
"k8s.io/cluster-autoscaler/enabled" = "true"
"k8s.io/cluster-autoscaler/${var.cluster_name}" = "owned"
}
}
The taint on the Spot node group prevents regular pods from being scheduled there unless they have a matching toleration. This way, only workloads that explicitly opt in to Spot instances will run on cheaper, interruptible nodes.
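For example, a pod that opts in to the Spot nodes would carry a matching toleration and, optionally, a nodeSelector against the NodeGroup label we set on the group:

```yaml
# Pod spec fragment: opt in to the tainted Spot node group
spec:
  tolerations:
    - key: "spot"
      operator: "Equal"
      value: "true"
      effect: "NoSchedule"
  nodeSelector:
    NodeGroup: spot
```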
Enabling Cluster Logging
EKS can send control plane logs to CloudWatch Logs. This is useful for debugging authentication issues, API audit trails, and scheduler decisions. Add this parameter to the module "eks" block:
cluster_enabled_log_types = ["api", "audit", "authenticator", "controllerManager", "scheduler"]
After applying, logs will appear in CloudWatch under the log group /aws/eks/my-eks-cluster/cluster. Be aware that control plane logging adds cost, so enable only the log types you need in production.
Using a Remote Backend for Terraform State
For team environments, store Terraform state in a remote backend instead of the local filesystem. S3 with DynamoDB locking is the most common approach for AWS. Add this inside the terraform {} block in providers.tf:
backend "s3" {
bucket = "my-terraform-state-bucket"
key = "eks/terraform.tfstate"
region = "us-east-1"
dynamodb_table = "terraform-locks"
encrypt = true
}
Create the S3 bucket and DynamoDB table before initializing Terraform with this backend. The DynamoDB table prevents concurrent state modifications that could corrupt your infrastructure state.
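One way to bootstrap those two resources is a small, separate Terraform configuration applied once with local state (the bucket and table names below are examples and must match your backend block):

```hcl
resource "aws_s3_bucket" "tf_state" {
  bucket = "my-terraform-state-bucket"
}

resource "aws_s3_bucket_versioning" "tf_state" {
  bucket = aws_s3_bucket.tf_state.id
  versioning_configuration {
    status = "Enabled"
  }
}

resource "aws_dynamodb_table" "tf_locks" {
  name         = "terraform-locks"
  billing_mode = "PAY_PER_REQUEST"
  hash_key     = "LockID" # Terraform's S3 backend expects this exact key name

  attribute {
    name = "LockID"
    type = "S"
  }
}
```

Versioning on the state bucket lets you recover earlier state files if an apply goes wrong.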
Cleanup – Destroy the EKS Cluster
When you no longer need the cluster, destroy all resources to stop incurring AWS charges. First, make sure you have deleted any Kubernetes resources that created AWS infrastructure (like LoadBalancer services or EBS-backed PersistentVolumes), as Terraform does not know about those.
$ kubectl get svc --all-namespaces --field-selector spec.type=LoadBalancer
$ kubectl delete svc <service-name> -n <namespace>
$ kubectl delete pvc --all
Wait a minute for AWS to delete the associated load balancers and EBS volumes, then run:
$ terraform destroy
Terraform will show the list of resources to be destroyed. Type yes to confirm. The destroy process takes about 10-15 minutes. After completion, verify in the AWS console that the VPC, EKS cluster, and all associated resources have been removed.
$ aws eks list-clusters --region us-east-1
{
"clusters": []
}
If the destroy gets stuck (usually on ENI or security group deletion), wait a few minutes and retry. EKS sometimes takes time to fully release network interfaces.
Troubleshooting Common Issues
Here are the most frequent issues when deploying EKS with Terraform and how to resolve them:
Nodes not joining the cluster – Check that the node group IAM role has the required policies attached. Run aws eks describe-nodegroup to check the node group status and any health issues.
kubectl connection refused – Make sure your kubeconfig is updated with aws eks update-kubeconfig and that the IAM user running kubectl is the same one that created the cluster (or has been granted access).
Pods stuck in Pending – Check if there are enough nodes and resources. Run kubectl describe pod <pod-name> to see scheduling events. This is where the Cluster Autoscaler helps by adding nodes automatically.
Terraform destroy fails on VPC – This usually means there are still ENIs or load balancers attached to the VPC subnets. Delete any remaining Kubernetes services of type LoadBalancer, wait for the ELBs to be removed, then retry the destroy.
CoreDNS pods in CrashLoopBackOff – This often happens when the VPC CNI is not ready yet. The before_compute = true setting on the VPC CNI add-on ensures it is installed before nodes join, which prevents this race condition.
Conclusion
You now have a fully functional Amazon EKS cluster deployed and managed through Terraform. The setup includes a multi-AZ VPC with proper subnet tagging, managed node groups running Amazon Linux 2023, EKS-managed add-ons for networking and DNS, IAM roles with least-privilege access through OIDC/IRSA, and the Cluster Autoscaler for dynamic node scaling.
For production deployments, consider enabling cluster logging, switching to multiple NAT gateways for high availability, adding node group encryption with KMS, implementing network policies with Calico, and setting up monitoring with Prometheus and Grafana. Store your Terraform state in S3 with DynamoDB locking, and keep all configuration in version control.
Related Guides
- How To Install Terraform on Linux Systems
- How To Install and Use AWS CLI on Linux
- Deploy VM Instances on Hetzner Cloud with Terraform
- How To Store Terraform State in Consul KV Store
- Create AWS IAM Users and Groups with AWS CLI