Why don't EC2 CloudWatch metrics show memory and disk utilization by default?
To answer that, you need to understand the difference between hypervisor-level and OS-level metrics.
Virtualization and the Hypervisor:
To understand the distinction, you first need to grasp the basics of virtualization. Virtualization allows multiple virtual machines (VMs), including EC2 instances in the case of AWS, to run on a single physical machine. The component responsible for this is called the hypervisor.
The hypervisor sits between the hardware and the operating systems of the VMs. It allocates resources like CPU, memory, and I/O to the VMs, ensuring they don't interfere with each other.
Hypervisor-Level Metrics vs. OS-Level Metrics:
Hypervisor-Level Metrics: These are metrics that the hypervisor can observe directly without needing to interact with the guest operating systems (the OSes running inside the VMs). Examples:
1. CPU utilization: The hypervisor manages CPU allocation and can measure how much CPU time is allocated to each VM.
2. Disk I/O: Similarly, the hypervisor manages disk reads/writes, so it can measure these metrics directly.
3. Network: Network packets go through the hypervisor, allowing it to measure network activity.
OS-Level Metrics: These metrics relate to what's happening inside the virtual machine's operating system. The hypervisor doesn't have direct access to these metrics because they are encapsulated within the VM. Examples:
1. Memory Usage: While the hypervisor allocates a certain amount of memory to a VM, it doesn't see how that memory is used inside the VM. For instance, the hypervisor doesn't know which processes within the VM are consuming the most memory or if there's free memory available.
2. Disk Space: The hypervisor can observe how much data is read/written, but it doesn't know the available disk space or how it's partitioned within the VM's file system.
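To make the distinction concrete, here is a minimal sketch of how these two values can be read from inside the guest OS, which is exactly the view the hypervisor lacks. It assumes a Linux instance that exposes `/proc/meminfo` with `MemTotal` and `MemAvailable` fields; the function names are hypothetical and chosen only for illustration.

```python
import shutil


def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of integer values (mostly kB)."""
    values = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        if not sep:
            continue  # skip lines that are not "Key: value" pairs
        fields = rest.split()
        if fields and fields[0].isdigit():
            values[key.strip()] = int(fields[0])
    return values


def read_meminfo(path="/proc/meminfo"):
    """Read and parse the kernel's memory report (Linux only, an assumption)."""
    with open(path) as f:
        return parse_meminfo(f.read())


def memory_used_percent(meminfo):
    """Used memory as a percentage, derived from MemTotal and MemAvailable."""
    total = meminfo["MemTotal"]
    available = meminfo["MemAvailable"]
    return 100.0 * (total - available) / total


def disk_used_percent(path="/"):
    """Used space on the file system containing `path`, as a percentage."""
    usage = shutil.disk_usage(path)
    return 100.0 * usage.used / usage.total
```

On a Linux instance, `memory_used_percent(read_meminfo())` and `disk_used_percent("/")` return the two numbers discussed above. Both come from the guest kernel and file system, so they are only visible to software running inside the VM.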
How to get memory and disk space metrics into CloudWatch:
While AWS doesn't provide these metrics by default, it lets you push custom metrics to CloudWatch. The CloudWatch agent can collect additional system-level metrics, including memory and disk space usage.
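As a rough sketch of what pushing custom metrics looks like without the agent, the snippet below builds a metric payload and sends it through the CloudWatch `PutMetricData` API using boto3. The namespace `Custom/System`, the metric names, and the helper functions are assumptions made up for this example, and `publish` needs boto3 installed plus credentials that allow `cloudwatch:PutMetricData`.

```python
from datetime import datetime, timezone


def build_metric_data(instance_id, mem_used_pct, disk_used_pct, timestamp=None):
    """Build a MetricData list for PutMetricData (metric names are hypothetical)."""
    timestamp = timestamp or datetime.now(timezone.utc)
    dimensions = [{"Name": "InstanceId", "Value": instance_id}]
    return [
        {
            "MetricName": "MemoryUsedPercent",
            "Dimensions": dimensions,
            "Timestamp": timestamp,
            "Value": float(mem_used_pct),
            "Unit": "Percent",
        },
        {
            "MetricName": "DiskUsedPercent",
            "Dimensions": dimensions,
            "Timestamp": timestamp,
            "Value": float(disk_used_pct),
            "Unit": "Percent",
        },
    ]


def publish(metric_data, namespace="Custom/System"):
    """Send the metrics to CloudWatch; the namespace is an assumed example value."""
    import boto3  # third-party; imported here so the builder works without it

    cloudwatch = boto3.client("cloudwatch")
    cloudwatch.put_metric_data(Namespace=namespace, MetricData=metric_data)
```

A cron job or small daemon on the instance could call `publish(build_metric_data(instance_id, mem_pct, disk_pct))` once a minute. In practice the CloudWatch agent handles this loop for you, which is usually the simpler choice.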
When you deploy the CloudWatch agent on an EC2 instance, the agent runs as a background service. It gathers the specified system-level metrics at defined intervals and sends them to CloudWatch using the PutMetricData API call. The agent is configurable, allowing you to specify which metrics to collect, the granularity of the data, and how frequently to send it to CloudWatch.
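To illustrate what "configurable" means here, the sketch below generates a small agent configuration that collects memory and disk usage percentages. The key names (`metrics_collected`, `mem_used_percent`, `used_percent`, and so on) follow my understanding of the agent's JSON configuration schema, so treat them as assumptions and check them against the AWS documentation before use. The helper name and its defaults are hypothetical.

```python
import json


def build_agent_config(interval_seconds=60, mount_points=("/",)):
    """Return a CloudWatch agent config dict for memory and disk usage metrics.

    Key names are assumed from the agent's documented JSON schema.
    """
    return {
        # How often the agent samples metrics, in seconds
        "agent": {"metrics_collection_interval": interval_seconds},
        "metrics": {
            # Tag every metric with the instance it came from
            "append_dimensions": {"InstanceId": "${aws:InstanceId}"},
            "metrics_collected": {
                "mem": {"measurement": ["mem_used_percent"]},
                "disk": {
                    "measurement": ["used_percent"],
                    "resources": list(mount_points),
                },
            },
        },
    }


if __name__ == "__main__":
    # Print the JSON you would save as the agent's configuration file
    print(json.dumps(build_agent_config(), indent=2))
```

Generating the file from code like this is optional; the same JSON can be written by hand. The point is that the collected metrics, the mount points, and the collection interval are all plain settings in one document.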
If you're interested in a more in-depth explanation of these topics, please check out my new book "Cracking the DevOps Interview".
To learn more about AWS, check out my book "AWS for System Administrators".