When to use eksctl, AWS CDK, AWS CLI, CloudFormation, or Terraform to set up and manage your EKS clusters on AWS?

luis arcega
20 min read · Jan 4, 2021

Everyone will agree that working with EKS on AWS can get complicated at times, and it gets even more complicated when you have to maintain multiple EKS clusters or customers, do repetitive tasks, and keep a certain level of standards across your clusters/customers. After getting involved in the AWS container space, especially with EKS, speaking with multiple MSPs (Managed Service Providers), and doing a bunch of workshops / hands-ons, I noticed that I couldn't find a simple document covering all the tools we have at our disposal to configure / deploy EKS clusters on AWS. So, after doing some research, I decided to create this comparison to explain to my clients when and which tools they can use to configure / create EKS clusters, and I hope this comparison helps YOU as well.

Remember, this comparison covers only maintaining EKS clusters as AWS infrastructure, not managing containers on Kubernetes.

Diving in and comparing the tools

The first thing we are going to compare is how we can create an EKS cluster with each tool. Here, I'll tell you the prerequisites (if there are any) before executing the commands and some considerations when executing them. I won't explain the parameters in detail because that is not the point; what I want to show here is how easy or difficult a tool can be depending on the goal you want to achieve. That said, let's get started.

Creating an EKS cluster

With “EKSCTL”

This tool is really simple and elegant, and the best thing is that you don't need to be an expert on AWS. To create a cluster, you just need to execute the following command in your terminal:

$ eksctl create cluster \
--name <my-cluster> \
--version <1.17> \
--region <us-west-2> \
--nodegroup-name <linux-nodes> \
--nodes <3> \
--nodes-min <1> \
--nodes-max <4> \
--with-oidc \
--ssh-access \
--ssh-public-key <name-of-ec2-keypair> \
--managed
# Or simply:
$ eksctl create cluster --config-file=<path>

That simple command will create:

  • A VPC with at least 4 subnets across 2 Availability Zones, along with ACLs and SGs, IAM roles, and instance profiles
  • The EKS cluster itself with a default Managed Node Group (EC2 instances), and it joins the EC2 instances to the EKS cluster

Once the cluster has been created, eksctl will write the kubeconfig file locally on the machine where you ran the command, so you can connect to the cluster.
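A quick way to confirm that the generated kubeconfig works is to list the nodes and the default service (plain kubectl, nothing EKS-specific):

# Verify connectivity to the new cluster
kubectl get nodes
kubectl get svc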

If you have multiple EKS clusters and you need to switch between them, I would recommend using "kubectx". With kubectx, you can switch between Kubernetes contexts very easily as follows:

kubectx <EKS-cluster-context>   # switch to context <EKS-cluster-context>

With “AWS CLI”

Before using the CLI, bear in mind that you really need to know how the AWS core services work, and that there are prerequisites you need to have in place before you can create the EKS cluster using the AWS CLI.

Prerequisites:

  • The AWS CLI installed and configured with the right credentials and region
  • A VPC with at least two subnets in different Availability Zones, plus a security group for the cluster
  • An IAM role that Amazon EKS can assume (the cluster service role), whose ARN you pass as --role-arn
  • kubectl installed so you can talk to the cluster afterwards
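For example, the cluster service role from the list above can also be created with the CLI; a minimal sketch, assuming a hypothetical role name myEKSClusterRole and trust-policy file name:

# Hypothetical trust policy allowing EKS to assume the role
cat > eks-cluster-trust-policy.json <<'EOF'
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": { "Service": "eks.amazonaws.com" },
      "Action": "sts:AssumeRole"
    }
  ]
}
EOF

# Create the role and attach the AWS managed AmazonEKSClusterPolicy
aws iam create-role \
--role-name myEKSClusterRole \
--assume-role-policy-document file://eks-cluster-trust-policy.json
aws iam attach-role-policy \
--role-name myEKSClusterRole \
--policy-arn arn:aws:iam::aws:policy/AmazonEKSClusterPolicy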

After you have all the prerequisites, you can now create the EKS cluster control plane:

$ aws eks create-cluster \
--region <region-code> \
--name <my-cluster> \
--kubernetes-version <1.17> \
--role-arn <IAM-arn> \
--resources-vpc-config subnetIds=<subnet-id>,<subnet-id2>,securityGroupIds=<sg-id>

Note that the previous command won't create the node group/worker nodes (EC2 instances); it only creates the cluster control plane, so you then need to create your node group (EC2 instances) and join the EC2 instances to the cluster. To create the nodes, you can create a managed node group as follows:

aws eks create-nodegroup --cluster-name <value> --nodegroup-name <value> --subnets <value> --node-role <value> --region <region>

Now, to get access to the cluster and start working with kubectl, you'll need to download the configuration as follows:

aws eks --region <region-code> update-kubeconfig --name <cluster_name>

With “CloudFormation”

Working with CloudFormation is pretty similar to the CLI with regard to the prerequisites you need to have; the only difference is that you declare everything in a configuration file (a CloudFormation template). Once you have the prerequisites, you can use the next CloudFormation snippet to create the cluster control plane.

CloudFormation yaml snippet:

Resources:
  myCluster:
    Type: 'AWS::EKS::Cluster'
    Properties:
      Name: prod
      Version: '1.14'
      RoleArn: >-
        arn:aws:iam::012345678910:role/eks-service-role-AWSServiceRoleForAmazonEKS-EXAMPLEBQ4PI
      ResourcesVpcConfig:
        SecurityGroupIds:
          - sg-6979fe18
        SubnetIds:
          - subnet-6782e71e
          - subnet-e7e761ac

You will also need to create the node groups, and for that you need this snippet within your CF template.

Resources:
  EKSNodegroup:
    Type: 'AWS::EKS::Nodegroup'
    Properties:
      ClusterName: prod
      NodeRole: 'arn:aws:iam::012345678910:role/eksInstanceRole'
      ScalingConfig:
        MinSize: 3
        DesiredSize: 5
        MaxSize: 7
      Labels:
        Key1: Value1
        Key2: Value2
      Subnets:
        - subnet-6782e71e
        - subnet-e7e761ac

Let's not re-invent the wheel: there is a CF template that will help you create the VPC, the managed node groups, and all the resources necessary for the EKS cluster to work properly; of course, you will need to modify it to fit your needs.
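If you go that route, deploying the downloaded template is a regular stack operation; a quick sketch, with illustrative stack and file names:

# Stack name and template file name are illustrative; point them at the template you downloaded
aws cloudformation deploy \
--stack-name my-eks-prerequisites \
--template-file amazon-eks-sample.yaml \
--capabilities CAPABILITY_IAM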

With “AWS CDK”

AWS CDK will help you with best practices and also reduce the complexity compared with CF. For this blog I'll be using the CDK for Python, but remember that the CDK works with multiple languages, so you can choose the one that suits you best.

You can create your VPC with a default configuration (implicitly created using ec2.Vpc) or let CDK create one for you during the deployment of the EKS cluster, and for that you will need to use the following code:

cluster = eks.Cluster(self, 'hello-eks',
    version=eks.KubernetesVersion.V1_17)
# Or, with explicit default capacity:
cluster = eks.Cluster(self, 'HelloEKS',
    version=eks.KubernetesVersion.V1_17,
    default_capacity=5,
    default_capacity_instance=ec2.InstanceType.of(ec2.InstanceClass.M5, ec2.InstanceSize.SMALL))

That code will create the VPC, SGs, subnets, IAM roles, service accounts, and a bunch of configurations that are needed for the cluster, much like the eksctl tool does.
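From there, provisioning is the standard CDK workflow; a quick sketch, assuming the app is already written and your account/region can be bootstrapped:

# One-time per account/region
cdk bootstrap
# Preview the CloudFormation changes, then deploy the stack
cdk diff
cdk deploy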

After you have created the cluster, in order to interact with it through kubectl, you can use the aws eks update-kubeconfig AWS CLI command to configure your local kubeconfig. The CDK EKS module will define a CloudFormation output in your stack which contains the command to run, and it will also print it out to your terminal as follows:

Outputs:
ClusterConfigCommand43AAE40F = aws eks update-kubeconfig --name cluster-xxxxx --role-arn arn:aws:iam::112233445566:role/yyyyy

Grab that output and execute the aws eks update-kubeconfig ... command in your terminal to create or update a local kubeconfig context as follows:

$ aws eks update-kubeconfig --name cluster-xxxxx --role-arn arn:aws:iam::<accountId>:role/<role-name>
# Output
# Added new context arn:aws:eks:rrrrr:112233445566:cluster/cluster-xxxxx to /home/boom/.kube/config

With “TERRAFORM”

Terraform works similarly to CloudFormation templates in the sense that you need to create multiple resources in a declarative way before you can create the EKS cluster.

Prerequisites:

  1. Install Terraform, of course; to do that, just follow this installation guide.
  2. Create a VPC; for that you can use this vpc.tf to provision it. This tf template will create the subnets and availability zones using the AWS VPC module.
  3. Create the necessary SGs with security-groups.tf, to be used by the EKS cluster.

Now, let's create the EKS cluster with Terraform. Once you have your VPC and security groups created, we can declaratively create the EKS cluster as follows:

terraform snippet:

module "eks" {
  source          = "terraform-aws-modules/eks/aws"
  cluster_name    = local.cluster_name
  cluster_version = "1.17"

  # VPC configuration
  vpc_id  = module.vpc.vpc_id
  subnets = module.vpc.private_subnets

  tags = {
    Environment = "training"
    GithubRepo  = "terraform-aws-eks"
    GithubOrg   = "terraform-aws-modules"
  }

  worker_groups = [
    {
      name                          = "worker-group-1"
      instance_type                 = "t2.small"
      additional_userdata           = "echo foo bar"
      asg_desired_capacity          = 2
      additional_security_group_ids = [aws_security_group.worker_group_mgmt_one.id]
    },
    {
      name                          = "worker-group-2"
      instance_type                 = "t2.medium"
      additional_userdata           = "echo foo bar"
      additional_security_group_ids = [aws_security_group.worker_group_mgmt_two.id]
      asg_desired_capacity          = 1
    },
  ]
}

Now let’s dissect the components of the module a bit:

  1. First, a module is a container for multiple resources that are used together. In this case our module was declared as "eks", which is how we are going to reference/identify it within Terraform.
  2. The source argument is mandatory for all modules.
  3. The cluster_name variable is where you specify the EKS cluster name.
  4. The cluster_version is the Kubernetes version you want to use for your EKS cluster.
  5. The tags are the normal tags for the EKS cluster in AWS.
  6. The vpc_id is where we set the reference to the VPC that was created before the EKS cluster could be created.

Something important to mention here is that Terraform allows you to create/manage dependencies between resources; even if you don't declare them within the file, Terraform will infer the dependencies between the resources on your behalf. That said, the VPC needs to be declared and then you reference it from the eks module via the VPC ID (the same applies for the subnets), and Terraform will infer that it first needs to create the VPC and then the EKS cluster.

  7. The worker_groups section is where we declare the node groups that are going to be created and joined to the EKS cluster.
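With the VPC and the eks module declared, applying the configuration is the usual Terraform workflow; a quick sketch:

# Download providers/modules, preview the plan, then create the VPC and the cluster
terraform init
terraform plan
terraform apply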

To get the output information related to the EKS cluster you can use terraform outputs like this:

output "cluster_id" {
  description = "EKS cluster ID."
  value       = module.eks.cluster_id
}

output "cluster_endpoint" {
  description = "Endpoint for EKS control plane."
  value       = module.eks.cluster_endpoint
}

output "cluster_security_group_id" {
  description = "Security group ids attached to the cluster control plane."
  value       = module.eks.cluster_security_group_id
}

output "kubectl_config" {
  description = "kubectl config as generated by the module."
  value       = module.eks.kubeconfig
}

output "config_map_aws_auth" {
  description = "A kubernetes configuration to authenticate to this EKS cluster."
  value       = module.eks.config_map_aws_auth
}

output "region" {
  description = "AWS region"
  value       = var.region
}

output "cluster_name" {
  description = "Kubernetes Cluster Name"
  value       = local.cluster_name
}

If you have noticed, the kubectl_config output does not include the certificates that you need to get access to the EKS cluster via kubectl. So, the way to generate the Kubernetes config file is to use the AWS CLI or eksctl in conjunction with the Terraform outputs, like this:

$ eksctl utils write-kubeconfig --name $(terraform output cluster_name | jq -r .) --region $(terraform output region | jq -r .)
# or
$ aws eks --region $(terraform output region | jq -r .) update-kubeconfig --name $(terraform output cluster_name | jq -r .)

Note: The region isn’t necessary if you deployed the EKS cluster within the same region you have configured for your AWS CLI.

Updating an EKS cluster

Now, let's compare the tools doing something where a lot of people struggle, and which in my opinion is one of the most tedious and routine operations we need to do on Kubernetes: updating the cluster. Even though EKS is managed by AWS, it shares the responsibility for updating the cluster with end users. I say it is a shared responsibility because AWS will publish the control plane update and make it available for you, but you are responsible for actually applying the update to the control plane and to the EC2 instances in the EKS cluster.

Now, for the people that don't know the manual process to update a Kubernetes cluster, here are the steps in general:

  • Upgrade the control plane components, in this order:
    1. etcd (all instances)
    2. kube-apiserver (all control plane hosts)
    3. kube-controller-manager
    4. kube-scheduler
    5. cloud controller manager, if you use one
  • Upgrade the nodes in your cluster: for each node in your cluster, drain that node and then either replace it with a new node that uses the newer kubelet version, or upgrade the kubelet on that node and bring the node back into service
  • Upgrade clients such as kubectl
  • Adjust manifests and other resources based on the API changes that accompany the new Kubernetes version. Upgrading to a new Kubernetes version can provide new APIs. You can use the kubectl convert command to convert manifests between different API versions. For example: kubectl convert -f pod.yaml --output-version v1

Now the EKS process steps:

To be honest, the steps are pretty similar, with the exception that AWS will help with some of the parts. That said, AWS has some requirements and considerations that you must keep in mind:

  • To upgrade the cluster, Amazon EKS requires 2 to 3 free IP addresses from the subnets that were provided when you created the cluster. If these subnets don't have available IP addresses, the upgrade can fail (a quick way to check is sketched right after this list).
  • Additionally, if any of the subnets or security groups that were provided during cluster creation have been deleted, the cluster upgrade process can fail.
  • Even though Amazon EKS runs a highly available control plane, you might experience minor service interruptions during an update. For example, if you attempt to connect to an API server just before or just after it’s terminated and replaced by a new API server running the new version of Kubernetes, you might experience API call errors or connectivity issues. If this happens, retry your API operations until they succeed.
  • Amazon EKS doesn’t modify any of your Kubernetes add-ons when you update a cluster. After updating your cluster, AWS recommends that you update the add-ons listed below according to the Kubernetes version that you’re updating to (For more information regarding the add-ons version, please check out this link).
    • Amazon VPC CNI plug-in
    • DNS (CoreDNS)
    • KubeProxy
    • Any other add-ons
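A quick way to verify the free-IP requirement from the first bullet is to ask EC2 how many addresses are left in the cluster subnets; a sketch using the subnet IDs from the earlier snippets as placeholders:

# Subnet IDs are placeholders; use the ones attached to your cluster
aws ec2 describe-subnets \
--subnet-ids subnet-6782e71e subnet-e7e761ac \
--query 'Subnets[].{ID:SubnetId,FreeIPs:AvailableIpAddressCount}' \
--output table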

Upgrading the Control Plane

First some caveats:

  • AWS recommends that you update your nodes to your cluster's current (pre-update) Kubernetes minor version prior to your cluster update. Your nodes must not run a newer Kubernetes version than your control plane. For example, if your control plane is running version 1.17 and your nodes are running version 1.15, update your nodes to version 1.16 or 1.17 (recommended) before you update your cluster's Kubernetes version to 1.18. For more information, see Self-managed node updates.
  • Because Amazon EKS runs a highly available control plane, you can update only one minor version at a time. See Kubernetes Version and Version Skew Support Policy for the rationale behind this requirement. Therefore, if your current version is 1.16 and you want to upgrade to 1.18, then you must first upgrade your cluster to 1.17 and then upgrade it from 1.17 to 1.18. If you try to update directly from 1.16 to 1.18, then the update version command throws an error.

Pre-Verifications:

  1. The pod security policy admission controller is enabled on Amazon EKS clusters running Kubernetes version 1.13 or later. You can check it with kubectl get psp eks.privileged; if you receive an error, see "To install or restore the default pod security policy" before proceeding.
  2. If you originally deployed your cluster on Kubernetes 1.17 or earlier, then you may need to remove a deprecated term from your CoreDNS manifest.
# Check to see if your CoreDNS manifest has the line.
$ kubectl get configmap coredns -n kube-system -o yaml | grep upstream
# If no output is returned, your manifest doesn't have the line and you can skip to the next step to update your cluster.
# If output is returned, then you need to remove the line as follows:
$ kubectl edit configmap coredns -n kube-system -o yaml

Now let’s finally upgrade the EKS control plane with the tools.

Using “EKSCTL”

First some caveats:

  • It is really important to know that eksctl can only upgrade clusters that were created using the tool.
  • This procedure requires eksctl version 0.35.0 or later.
  • Upgrading a cluster from 1.16 to 1.17 will fail if any of your AWS Fargate pods have a kubelet minor version earlier than 1.16. Before upgrading your cluster from 1.16 to 1.17, you need to recycle your Fargate pods so that their kubelet is 1.16 before attempting to upgrade the cluster to 1.17.

Now, after all those considerations here is the command to upgrade the EKS control plane:

eksctl upgrade cluster --name <dev> --approve

This process will take several minutes to complete.

Using “AWS CLI”

Believe it or not, the command is as straightforward as eksctl's. Here is the command:

aws eks --region <region-code> update-cluster-version --name <my-cluster> --kubernetes-version <1.18>

Here you just need to set the version you want to upgrade to. Then, monitor the status of your cluster update with the following command, using the cluster name and update ID that the previous command returned. Your update is complete when the status appears as Successful.

aws eks --region <region-code> describe-update --name <my-cluster> --update-id <ProcessId>

Using “CloudFormation”

For CloudFormation you just need to update the property “Version” in the CF snippet with the EKS version number you want to upgrade to:

CF json template snippet:

{
  "Type" : "AWS::EKS::Cluster",
  "Properties" : {
    "EncryptionConfig" : [ EncryptionConfig, ... ],
    "KubernetesNetworkConfig" : KubernetesNetworkConfig,
    "Name" : String,
    "ResourcesVpcConfig" : ResourcesVpcConfig,
    "RoleArn" : String,
    "Version" : String
  }
}
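Once the Version property is bumped in your template, rolling it out is just a regular stack update; a sketch with an illustrative stack name and template file name:

# Stack name and template file are illustrative
aws cloudformation update-stack \
--stack-name my-eks-cluster \
--template-body file://eks-cluster.yaml \
--capabilities CAPABILITY_IAM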

Using “AWS CDK”

This one is also pretty straightforward: you only need to update your code with the version you want to apply. For example, if you created your cluster with version 1.17 as follows:

cluster = cdk_eks.Cluster(self, 'hello-eks',
    version=cdk_eks.KubernetesVersion.V1_17)

You just need to change the version and apply the changes:

cluster = cdk_eks.Cluster(self, 'hello-eks',
    version=cdk_eks.KubernetesVersion.V1_18)  # this is where you need to change the version

The class cdk_eks.KubernetesVersion maintains a list of the EKS versions you can use.
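As with the initial deployment, a quick sketch of previewing the pending control-plane change and applying it with the standard CDK commands:

# Show the version change CloudFormation will apply, then roll it out
cdk diff
cdk deploy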

Using “Terraform”

For Terraform, the case is similar to the AWS CDK: you just need to update the cluster_version property in your Terraform file.

module "eks" {
  source          = "terraform-aws-modules/eks/aws"
  cluster_name    = local.cluster_name
  cluster_version = "1.18" # Here is where you need to set the version you want to upgrade to
  subnets         = module.vpc.private_subnets
  # ...
}

Update the KubeProxy to the right version:

These steps apply for all the tools.

First, retrieve your current kube-proxy image:

kubectl get daemonset kube-proxy --namespace kube-system -o=jsonpath='{$.spec.template.spec.containers[:1].image}'

Now we can update kube-proxy to the recommended version by taking the output from the previous step and replacing the version tag with your cluster's recommended kube-proxy version:

kubectl set image daemonset.apps/kube-proxy -n kube-system kube-proxy={accountId}.dkr.ecr.{region}.amazonaws.com/eks/kube-proxy:v{1.18.9}-eksbuild.1

Your account Id and region should be changed to the ones you are using and the version must match the table from this link.

eksctl has released a pretty easy way to do it, in case you don't want to do it via kubectl:

eksctl utils update-kube-proxy

Update the CoreDNS:

To check if your cluster is already running CoreDNS, use the following command.

kubectl get pod -n kube-system -l k8s-app=kube-dns

If the output shows coredns in the pod names, you're already running CoreDNS in your cluster. If not, see "Installing or upgrading CoreDNS" to install CoreDNS on your cluster, update it to the recommended version, and then return here.

Now to update it:

  • Retrieve your current coredns image with:
kubectl get deployment coredns --namespace kube-system -o=jsonpath='{$.spec.template.spec.containers[:1].image}'
  • Update coredns to the recommended version by taking the output from the previous step and replacing {1.7.0} (including the braces) with your cluster's recommended coredns version:
kubectl set image --namespace kube-system deployment.apps/coredns coredns={accountId}.dkr.ecr.{region}.amazonaws.com/eks/coredns:v{1.7.0}-eksbuild.1

As with the KubeProxy add-on, eksctl has integrated the update of this add-on into the tool:

eksctl utils update-coredns

Update the VPC CNI plug-in:

Check the version of your cluster’s Amazon VPC CNI Plugin for Kubernetes. Use the following command to print your cluster’s CNI version.

kubectl describe daemonset aws-node --namespace kube-system | grep Image | cut -d "/" -f 2

If your CNI version is earlier than 1.7.5, then use the appropriate command below to update your CNI version to the latest recommended version:

  • Download the manifest file:
curl -o aws-k8s-cni.yaml https://raw.githubusercontent.com/aws/amazon-vpc-cni-k8s/v1.7.5/config/v1.7/aws-k8s-cni.yaml
  • Replace <region-code> in the following command with the Region that your cluster is in, then run the modified command to replace the Region code in the file (currently us-west-2):
sed -i -e 's/us-west-2/<region-code>/' aws-k8s-cni.yaml
  • Apply the modified manifest file to your cluster.
kubectl apply -f aws-k8s-cni.yaml

Updating the Node Groups:

Now, the final part, which is upgrading the node groups, needs to be split into two sections per tool: managed and unmanaged node groups.

EKSCTL | Updating managed node groups:

(Optional) If you’re using the Kubernetes Cluster Autoscaler, scale the deployment down to zero replicas to avoid conflicting scaling actions.

kubectl scale deployments/cluster-autoscaler --replicas=0 -n kube-system

Upgrade a managed node group to the latest AMI release of the same Kubernetes version that’s currently deployed on the nodes with the following command.

eksctl upgrade nodegroup --name=<node-group-name> --cluster=<cluster-name>
  • If you're upgrading a node group that's deployed with a launch template to a new launch template version, add --launch-template-version=<version> to the preceding command. The launch template must meet the requirements described in Launch template support.
  • If the launch template includes a custom AMI, the AMI must meet the requirements in Using a custom AMI.

EKSCTL | Updating unmanaged node groups:

The easiest way to update an unmanaged node group is to create a new one and then terminate the old one. Creating a new one is really easy; you just need to execute the following command:

eksctl create nodegroup --cluster=<clusterName>

After the node group is created, you need to delete the previous (old) one as follows:

eksctl delete nodegroup --cluster=<clusterName> --name=<oldNodeGroupName>

The command will drain all pods from the node group before the instances are deleted, that way all the pods will be recreated on the new node group.

AWS-CLI | Updating managed node groups:

Keep in mind the following considerations:

  • You cannot roll back a node group to an earlier Kubernetes version or AMI version.
  • When a node in a managed node group is terminated due to a scaling action or update, the pods in that node are drained first. Amazon EKS attempts to drain the nodes gracefully and will fail if it is unable to do so. You can force the update if Amazon EKS is unable to drain the nodes as a result of a pod disruption budget issue.

To update the group just execute the following command:

aws eks update-nodegroup-version --cluster-name <value> --nodegroup-name <value>

That command updates the node group to the latest AMI release of the Kubernetes version it is currently running. To upgrade to the latest AMI of your cluster's current Kubernetes version instead, specify it in the request by adding the --kubernetes-version <value> parameter, as follows:

aws eks update-nodegroup-version --cluster-name <value> --nodegroup-name <value> --kubernetes-version <value>
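If Amazon EKS cannot drain the nodes because of a pod disruption budget (as mentioned in the considerations above), the same command accepts a --force flag; note that forcing skips the graceful-drain guarantee:

# Force the node group update when pods cannot be drained gracefully
aws eks update-nodegroup-version --cluster-name <value> --nodegroup-name <value> --force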

For more information, please check the official documentation here.

AWS-CLI | Updating unmanaged node groups:

Unfortunately, the AWS CLI does not support doing this for unmanaged node groups.

AWS-CDK | Updating managed node groups:

For managed node groups, you just need to set the version you want using the release_version parameter, which is the AMI version of the Amazon EKS-optimized AMI to use with your node group (for example, 1.14.7-YYYYMMDD). By default, the latest available AMI version for the node group's current Kubernetes version is used.

For example:

cluster.add_nodegroup_capacity(id='custom-node-group',
    nodegroup_name='custom-node-group2',
    instance_type=aws_ec2.InstanceType('t3.medium'),
    min_size=2,
    disk_size=100,
    release_version='1.14.7-YYYYMMDD'
)

After you apply the change, the node group update will automatically roll out, tainting and draining the nodes as they are replaced.

AWS-CDK | Updating unmanaged node groups:

Unfortunately, the CDK does not support unmanaged node groups.

CloudFormation | Updating managed node groups:

For managed node groups, we just need to update the Version property as follows:

json snippet:

{
  "Type" : "AWS::EKS::Nodegroup",
  "Properties" : {
    "CapacityType" : String,
    "ClusterName" : String,
    ...
    "Version" : String
  }
}

By default, the Kubernetes version of the cluster is used, and this is the only accepted specified value. If you specify launchTemplate, and your launch template uses a custom AMI, then don't specify version, or the node group deployment will fail.

CloudFormation | Updating unmanaged node groups:

On the other hand, when working with unmanaged node groups, the solution is pretty similar to eksctl's, but here AWS provides us with a CF template which helps us create a new unmanaged node group; then we can taint and drain the old node group.

To create the new node group, you can use the following template: https://s3.us-west-2.amazonaws.com/amazon-eks/cloudformation/2020-10-29/amazon-eks-nodegroup.yaml. For more info regarding the official CF template, please check this link. Once the node group has been created, taint the nodes:

kubectl taint nodes <node_name> key=value:NoSchedule

And drain them:

kubectl drain <node_name> --ignore-daemonsets --delete-local-data

After your old nodes have finished draining, revoke the security group ingress rules you authorized earlier, delete the AWS CloudFormation stack to terminate the instances, and edit the aws-auth ConfigMap to remove the old node instance role from RBAC:

kubectl edit configmap -n kube-system aws-auth

apiVersion: v1
data:
  mapRoles: |
    - rolearn: arn:aws:iam::111122223333:role/nodes-1-18-NodeInstanceRole-W70725MZQFF8
      username: system:node:{{EC2PrivateDNSName}}
      groups:
        - system:bootstrappers
        - system:nodes
    - rolearn: <arn:aws:iam::111122223333:role/nodes-1-17-NodeInstanceRole-U11V27W93CX5>
      username: system:node:{{EC2PrivateDNSName}}
      groups:
        - system:bootstrappers
        - system:nodes

Save and close the file to apply the updated configmap

Terraform | Updating managed node groups:

Working with Terraform is a bit more complicated because we need to accomplish multiple steps, like we do for CloudFormation. The first step is to use Terraform to search for the newest AMI for your new version of Kubernetes and then create an encrypted copy for your worker nodes. This can be accomplished with the following Terraform config:

data "aws_ami" "eks_worker_base_1_13" {
  filter {
    name   = "name"
    values = ["amazon-eks-node-1.13*"]
  }
  most_recent = true

  # Owner ID of AWS EKS team
  owners = ["602401143452"]
}

resource "aws_ami_copy" "eks_worker_1_13" {
  name              = "${data.aws_ami.eks_worker_base_1_13.name}-encrypted"
  description       = "Encrypted version of EKS worker AMI"
  source_ami_id     = "${data.aws_ami.eks_worker_base_1_13.id}"
  source_ami_region = "us-east-1"
  encrypted         = true

  tags = {
    Name = "${data.aws_ami.eks_worker_base_1_13.name}-encrypted"
  }
}

To upgrade our worker nodes to the new version, we will create a new worker group of nodes at the new version and then move our pods over to them. The first step is to add a new configuration block to your worker_groups configuration in Terraform; the field we want to change is ami_id, so it references our newly encrypted and copied AMI.

worker_groups = [
  {
    name                          = "worker-group-1"
    instance_type                 = "t2.small"
    additional_userdata           = "echo foo bar"
    asg_desired_capacity          = 2
    additional_security_group_ids = [aws_security_group.worker_group_mgmt_one.id]
  },
  {
    name                          = "worker-group-2"
    instance_type                 = "t2.medium"
    additional_userdata           = "echo foo bar"
    asg_desired_capacity          = 2
    additional_security_group_ids = [aws_security_group.worker_group_mgmt_one.id]
    ami_id                        = "${aws_ami_copy.eks_worker_1_13.id}"
  },
]

Since Terraform cannot automatically drain the nodes when it terminates them, we need to do it manually so we can delete the old node group.

The first step is to use the name of your nodes returned from kubectl get nodes to run kubectl taint nodes on each old node to prevent new pods from being scheduled on them:

kubectl taint nodes ip-10-0-200-78.ec2.internal key=value:NoSchedule
...
kubectl taint nodes ip-10-0-202-81.ec2.internal key=value:NoSchedule

Now we will drain the old nodes and force the pods to move to new nodes. I recommend doing this one node at a time to ensure that everything goes smoothly, especially in a production cluster:

kubectl drain ip-10-0-200-78.ec2.internal --ignore-daemonsets --delete-local-data
...
kubectl drain ip-10-0-202-81.ec2.internal --ignore-daemonsets --delete-local-data

You can check on the progress in between drain calls to make sure that pods are being scheduled onto the new nodes successfully by using kubectl get pods -o wide. If you know you have a sensitive workload, you can individually terminate pods to get them scheduled on the new nodes instead of using kubectl drain.

Once you have confirmed that all non-DaemonSet pods are running on the new nodes, we can terminate your old worker group. Since the eks module uses an ordered array of worker group config objects in the worker_groups key, you cannot just delete the old config. Terraform will see this change and assume that the order must have changed and try to recreate the AutoScaling groups. Instead, we should recognize that this will not be the last time we will do an upgrade, and that empty AutoScaling groups are free. So we will keep the old worker group configuration and just run 0 capacity in it like so:

worker_groups = [
  {
    name                          = "worker-group-1"
    instance_type                 = "t2.small"
    additional_userdata           = "echo foo bar"
    asg_desired_capacity          = 0
    additional_security_group_ids = [aws_security_group.worker_group_mgmt_one.id]
  },
  {
    name                          = "worker-group-2"
    instance_type                 = "t2.medium"
    additional_userdata           = "echo foo bar"
    asg_desired_capacity          = 2
    additional_security_group_ids = [aws_security_group.worker_group_mgmt_one.id]
    ami_id                        = "${aws_ami_copy.eks_worker_1_13.id}"
  },
]

Now the next time we upgrade, we can put the new config in this worker group and easily spin up workers without worrying about the terraform state. Finally, if you scaled down your cluster-autoscaler, you can revert those changes so that auto scaling works properly again.
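For example, reversing the earlier scale-down is a one-liner (the replica count shown assumes your deployment originally ran a single replica; use whatever it had before):

# Scale the Cluster Autoscaler back up after the upgrade
kubectl scale deployments/cluster-autoscaler --replicas=1 -n kube-system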

Wrapping up!

To finalize, I would like to say that every tool can help you achieve different things, and there is not one tool that can do everything you need. But, in my humble opinion, if you are just starting in AWS, or need an easy way to test or manage a small number of EKS clusters, the best option is to use eksctl. If your plan is to reproduce (templatize) your infrastructure multiple times and add your EKS lifecycle to a CI/CD pipeline, you can use either CF, CDK, or Terraform; but if you also intend to manage Kubernetes clusters in multiple clouds, I would suggest using Terraform, since that way you can maintain the same pipeline for all your clouds with almost the same workflow structure, and everything can be managed within the same tool.

AND TO FINALIZE THIS DOCUMENT, I WOULD LIKE TO REMIND YOU THAT ALL OPINIONS ARE MY OWN.
