100 Days of DevOps, Day 18: Add monitoring to EC2 instances using Terraform (CloudWatch and SNS)
Check the updated 101 Days of DevOps Course
Course Registration link: https://www.101daysofdevops.com/register/
Course Link: https://www.101daysofdevops.com/courses/101-days-of-devops/
YouTube link: https://www.youtube.com/user/laprashant/videos
Welcome to Day 18 of 100 Days of DevOps. Let’s continue our journey. So far we have discussed the fundamentals of Terraform and built a VPC and an EC2 instance using Terraform. Today, let’s add the monitoring piece to it.
- Monitoring can be added in two ways. The first is to create separate CloudWatch and SNS modules and then call them from our EC2 module
- The second is to put the CloudWatch and SNS Terraform code directly in the EC2 Terraform module. I prefer this approach because we want every EC2 instance to come up with monitoring enabled
- First, let’s start with the SNS topic, which is required to send out a notification via email or SMS when an event occurs.
* Here I am creating an SNS topic resource
* Give your SNS topic a name
* After that, I am using a default policy
NOTE: With SNS, someone needs to confirm the email subscription, which is why I am using a local-exec provisioner with Terraform.
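The SNS steps above can be sketched roughly as follows. This is a minimal sketch, not necessarily the exact code behind this post: the resource label `alarm`, the topic name `alarms-topic`, and the variable `alarms_email` are hypothetical, and reading the "default policy" as a `delivery_policy` (with values modeled on the example in the Terraform AWS provider documentation) is an assumption. It also assumes Terraform 0.12 or later and that the AWS CLI is installed and configured on the machine running Terraform.

```hcl
# Hypothetical variable: the address that should receive alarm emails.
variable "alarms_email" {
  description = "Email address to subscribe to the alarm topic"
  type        = string
}

resource "aws_sns_topic" "alarm" {
  # Give the SNS topic a name (placeholder value).
  name = "alarms-topic"

  # Assumption: the "default policy" is an HTTP delivery retry policy.
  delivery_policy = jsonencode({
    http = {
      defaultHealthyRetryPolicy = {
        minDelayTarget     = 20
        maxDelayTarget     = 20
        numRetries         = 3
        numMaxDelayRetries = 0
        numNoDelayRetries  = 0
        numMinDelayRetries = 0
        backoffFunction    = "linear"
      }
      disableSubscriptionOverrides = false
      defaultThrottlePolicy = {
        maxReceivesPerSecond = 1
      }
    }
  })

  # Create the email subscription with the AWS CLI once the topic exists.
  # This runs only when the topic is first created.
  provisioner "local-exec" {
    command = "aws sns subscribe --topic-arn ${self.arn} --protocol email --notification-endpoint ${var.alarms_email}"
  }
}
```

The recipient still has to click the confirmation link in the email that SNS sends before any notifications are delivered.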
Now let’s take a look at the Terraform code for CloudWatch.
This code is divided into two parts:
- Set up a CPU usage alarm using Terraform
* alarm_name: Set up an alarm name
* comparison_operator: This field is self-explanatory. Supported operators: GreaterThanOrEqualToThreshold, GreaterThanThreshold, LessThanThreshold, LessThanOrEqualToThreshold.
* evaluation_periods: The number of periods over which data is compared to the specified threshold (I set it to 2 just for demo purposes, but it depends completely on your requirements)
* metric_name: Please check the link below for the list of services that publish CloudWatch metrics
* namespace: The namespace for the alarm's associated metric (check the second column of the link below)
* period: The period in seconds (I am using 120 seconds, i.e. 2 minutes, but again it depends completely on your requirements)
* statistic: The statistic to apply to the alarm's associated metric. Supported values: SampleCount, Average, Sum, Minimum, Maximum
* threshold: The value against which the specified statistic is compared (I set it to 80, i.e. the alarm fires when CPU utilization goes above 80%)
* alarm_actions: The list of actions to execute when this alarm transitions into an ALARM state from any other state. Please note, each action is specified as an Amazon Resource Name (ARN)
* dimensions: The dimensions for the alarm's associated metric. Again, check the link below for supported dimensions
AWS Services That Publish CloudWatch Metrics
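Putting the fields above together, a CPU alarm might look roughly like the sketch below. The resource label, the alarm name, and the two variables `instance_id` and `alarm_topic_arn` are hypothetical names used so the sketch stands on its own. When the code lives directly in the EC2 module, these would instead be references to the instance and SNS topic resources defined there.

```hcl
# Hypothetical inputs, declared here so the sketch stands alone.
# Declare them only once if you combine this with other code in the same module.
variable "instance_id" {
  description = "ID of the EC2 instance to monitor"
  type        = string
}

variable "alarm_topic_arn" {
  description = "ARN of the SNS topic that receives alarm notifications"
  type        = string
}

resource "aws_cloudwatch_metric_alarm" "cpu_utilization" {
  alarm_name          = "high-cpu-utilization-alarm" # placeholder name
  comparison_operator = "GreaterThanThreshold"
  evaluation_periods  = 2   # two consecutive periods must breach the threshold
  metric_name         = "CPUUtilization"
  namespace           = "AWS/EC2"
  period              = 120 # seconds
  statistic           = "Average"
  threshold           = 80  # percent CPU utilization
  alarm_description   = "Average CPU utilization above 80%"

  # Each action is an ARN; here, the SNS topic to notify.
  alarm_actions = [var.alarm_topic_arn]

  # Restrict the metric to a single instance.
  dimensions = {
    InstanceId = var.instance_id
  }
}
```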
- So what this code does is send an email via an SNS notification when CPU utilization is more than 80%.
- The other part detects system and instance status check failures and sends an email via an SNS notification
* Most of this code is similar; the main difference is that metric_name here is StatusCheckFailed
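A status check alarm along these lines might look like the sketch below. As before, the resource label, alarm name, and variables are hypothetical. The `Maximum` statistic and the threshold of 1 are my assumptions, not settings taken from this post: StatusCheckFailed reports 0 when the checks pass and 1 when either the system or the instance check fails, so an 80% style threshold would not apply here.

```hcl
# Hypothetical inputs, declared here so the sketch stands alone.
# Declare them only once if you combine this with other code in the same module.
variable "instance_id" {
  description = "ID of the EC2 instance to monitor"
  type        = string
}

variable "alarm_topic_arn" {
  description = "ARN of the SNS topic that receives alarm notifications"
  type        = string
}

resource "aws_cloudwatch_metric_alarm" "instance_health" {
  alarm_name          = "instance-health-check-alarm" # placeholder name
  comparison_operator = "GreaterThanOrEqualToThreshold"
  evaluation_periods  = 2
  metric_name         = "StatusCheckFailed"
  namespace           = "AWS/EC2"
  period              = 120 # seconds
  statistic           = "Maximum" # assumption: any failure in the period counts
  threshold           = 1         # assumption: 1 means a status check failed
  alarm_description   = "System or instance status check failed"

  alarm_actions = [var.alarm_topic_arn]

  dimensions = {
    InstanceId = var.instance_id
  }
}
```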
The final EC2 code with CloudWatch monitoring and the SNS topic enabled looks like this:
GitHub Link
I look forward to you joining this journey and spending a minimum of an hour every day for the next 100 days on DevOps work, and posting your progress using any of the mediums below.
- Twitter: @100daysofdevops OR @lakhera2015
- Facebook: https://www.facebook.com/groups/795382630808645/
- Medium: https://medium.com/@devopslearning
- Slack: https://devops-myworld.slack.com/messages/CF41EFG49/
- GitHub Link: https://github.com/100daysofdevops