Which instance type to choose for an Amazon Elastic Kubernetes Service (EKS) workload: On-Demand vs. Spot Instances
To read the complete blog: https://www.101daysofdevops.com/courses/100-days-of-aws/lessons/day-48/
To view the complete course: https://lnkd.in/gjeGAPd2
You can contact me via https://lnkd.in/dePjvNDw
The simple answer to this question is: if your application can handle interruptions, choose Spot Instances; otherwise, go with On-Demand. That still holds true. But I ran a small experiment, spinning up 100 pods on Elastic Kubernetes Service (EKS), first on On-Demand and then on Spot, and the result was somewhat surprising.
Why Spot Instances?
Spot Instances are 70-90% cheaper than On-Demand instances.
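To put that discount range in perspective, here is a minimal sketch of the arithmetic. The hourly price and node count are hypothetical placeholders, not figures from my experiment, so plug in real numbers from the current pricing page.

```python
def spot_monthly_cost(on_demand_hourly, nodes, discount, hours=730):
    """Estimated monthly cost for `nodes` instances at a given Spot discount.

    discount is a fraction, e.g. 0.7 means 70% cheaper than On-Demand.
    """
    if not 0 <= discount <= 1:
        raise ValueError("discount must be between 0 and 1")
    return on_demand_hourly * (1 - discount) * nodes * hours


# Hypothetical numbers: $0.10/hour On-Demand, 10 worker nodes.
on_demand = spot_monthly_cost(0.10, 10, 0.0)
best_case = spot_monthly_cost(0.10, 10, 0.9)
worst_case = spot_monthly_cost(0.10, 10, 0.7)
print(f"On-Demand: ${on_demand:.2f}/month")
print(f"Spot:      ${best_case:.2f} to ${worst_case:.2f}/month")
```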
Why not choose Spot Instances?
If AWS needs the capacity back, it gives you a 2-minute interruption notice and then terminates the instance.
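If an application wants to react to that notice from inside the instance, the notice is also exposed through the instance metadata path spot/instance-action, which returns 404 until an interruption is scheduled. Below is a minimal sketch of parsing its JSON body, assuming the documented "action" and "time" fields. The function name and the sample payload are made up for illustration, and actually fetching the body (with a session token under IMDSv2) is left out.

```python
import json
from datetime import datetime, timezone


def seconds_until_interruption(body, now=None):
    """Return (action, seconds_left) from a spot/instance-action JSON body."""
    notice = json.loads(body)
    # The time field is UTC, in the form 2022-11-25T16:43:56Z.
    deadline = datetime.strptime(notice["time"], "%Y-%m-%dT%H:%M:%SZ")
    deadline = deadline.replace(tzinfo=timezone.utc)
    now = now or datetime.now(timezone.utc)
    return notice["action"], (deadline - now).total_seconds()


# Made-up payload, read two minutes before the deadline.
sample = '{"action": "terminate", "time": "2022-11-25T16:43:56Z"}'
now = datetime(2022, 11, 25, 16, 41, 56, tzinfo=timezone.utc)
action, left = seconds_until_interruption(sample, now)
print(action, left)
```

Anything that needs more than those two minutes to drain or checkpoint is probably a poor fit for Spot.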
The experiment: I spun up 100 pods on Elastic Kubernetes Service (EKS), first on On-Demand and then on Spot.
The test I ran is pretty basic: a sample size of 100 pods and a simple Kubernetes manifest, which may not be enough to draw firm conclusions. The point I am trying to make is that I got the Spot capacity pretty quickly. In the future, I will try with a larger number of pods, but I need to be mindful of the cost.
So, at 70-90% lower cost, I can run the same number of pods. But the question remains: what if my workload gets interrupted? I found a handy tool called eventbridge-cli, which lets you capture Spot interruption notices: https://github.com/spezam/eventbridge-cli
> eventbridge-cli -j -e '{"source":["aws.ec2"], "detail-type": ["EC2 Spot Instance Interruption Warning"]}'
2022/11/25 16:41:55 creating eventBridge client for bus [default]
2022/11/25 16:41:55 creating temporary rule on bus [default]: {"source":["aws.ec2"], "detail-type": ["EC2 Spot Instance Interruption Warning"]}
2022/11/25 16:41:56 created temporary rule on bus [default] with arn: arn:aws:events:us-west-2:XXXXXXXXXX:rule/eventbridge-cli-6a0a268d-a94f-4ce4-9ffd-b3e1bf84c80f
2022/11/25 16:41:56 created temporary SQS queue with URL: https://sqs.us-west-2.amazonaws.com/XXXXXXXXXX/eventbridge-cli-6a0a268d-a94f-4ce4-9ffd-b3e1bf84c80f
2022/11/25 16:41:56 linked EventBus --> SQS...
2022/11/25 16:41:56 polling queue https://sqs.us-west-2.amazonaws.com/XXXXXXXXXX/eventbridge-cli-6a0a268d-a94f-4ce4-9ffd-b3e1bf84c80f ...
2022/11/25 16:41:56 press ctr+c to stop
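The pattern passed with -e matches on two top-level fields of the event. Here is a simplified sketch of that matching logic. It handles only exact-value lists on top-level keys, while real EventBridge patterns support much more. The sample event follows the documented shape of the interruption warning, with a made-up instance ID.

```python
import json

PATTERN = json.loads(
    '{"source":["aws.ec2"], '
    '"detail-type": ["EC2 Spot Instance Interruption Warning"]}'
)


def matches(pattern, event):
    """True if every pattern key is present and its value is in the allowed list."""
    return all(event.get(key) in allowed for key, allowed in pattern.items())


# Made-up event in the documented shape of a Spot interruption warning.
event = {
    "source": "aws.ec2",
    "detail-type": "EC2 Spot Instance Interruption Warning",
    "region": "us-west-2",
    "detail": {
        "instance-id": "i-0123456789abcdef0",
        "instance-action": "terminate",
    },
}

if matches(PATTERN, event):
    print("interrupting:", event["detail"]["instance-id"])
```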
Also, in the case of Kubernetes, if you want to determine which of your worker nodes are Spot vs. On-Demand, run the command below:
> kubectl get nodes -L node.kubernetes.io/instance-type,kubernetes.io/arch,eks.amazonaws.com/capacityType,karpenter.sh/capacity-type
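The last two columns are the ones that show the purchase option: EKS managed node groups set eks.amazonaws.com/capacityType (SPOT or ON_DEMAND), and Karpenter sets karpenter.sh/capacity-type (spot or on-demand). Here is a small sketch that normalizes the two spellings, in case you want to script the check. The function name and the sample label sets are made up.

```python
def capacity_type(labels):
    """Classify a node as 'spot', 'on-demand', or 'unknown' from its labels."""
    for key in ("eks.amazonaws.com/capacityType", "karpenter.sh/capacity-type"):
        value = labels.get(key)
        if value:
            # Normalize SPOT / ON_DEMAND / spot / on-demand to one spelling.
            return value.lower().replace("_", "-")
    return "unknown"


print(capacity_type({"eks.amazonaws.com/capacityType": "SPOT"}))
print(capacity_type({"karpenter.sh/capacity-type": "on-demand"}))
print(capacity_type({"kubernetes.io/arch": "amd64"}))
```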
If you want to know which instance types are recommended for given resource criteria, such as vCPUs and memory, you can use the ec2-instance-selector tool: https://github.com/aws/amazon-ec2-instance-selector
> ./ec2-instance-selector --base-instance-type m5.large -z us-west-2
m1.large
m3.large
m4.large
m5.large
m5a.large
m5ad.large
m5d.large
m5dn.large
m5n.large
m5zn.large
m6a.large
m6i.large
m6id.large
t2.large
t3.large
t3a.large
So there are still some open questions:
1. Does bidding still exist in the Spot Instance world?
2. Does the concept of being outbid still exist, i.e., if someone bids 1 cent more than my current bid, will AWS interrupt my instance?
3. What are the chances of getting an old vs. a new instance type? I believe old instance types are more readily available than new ones.
There is no right or wrong answer to these questions. The reason I am asking is that I have been running eventbridge-cli for close to a week now to catch Spot interruptions, and none of my Spot instances got interrupted (maybe I am lucky). Still, can someone share their experience?