Optimizing Amazon CloudWatch for peak efficiency and cost savings
By: Ernesto Buenrostro | October 12, 2023 | AWS
Monitoring your web infrastructure is key for several reasons. It enables you to gain insights into the performance and health of your resources, identify potential issues, and make informed decisions for optimization. AWS offers a comprehensive monitoring solution through Amazon CloudWatch, which provides built-in metrics and the flexibility to create custom metrics. However, the free monitoring options might not always meet your needs, and upgrading to the next level of configuration might be overkill and drive up your hosting costs. Auditing and customizing your CloudWatch setup is a great way to get the metrics you want and save money along the way.
CloudWatch provides two categories of monitoring: basic monitoring and detailed monitoring. Many AWS services offer basic monitoring by default, including a default set of metrics published to CloudWatch at no additional cost. When you start using these AWS services, basic monitoring is automatically enabled. You can refer to the list of services that offer basic monitoring in the AWS documentation.
If you need additional metrics or more frequent data, detailed monitoring is available for an additional charge. It gives you access to metrics at a higher resolution, with more frequent data points. You can enable detailed monitoring for specific AWS resources to gain deeper insights into their performance.
However, if you only want a specific subset of that extra data, such as disk space usage, the default settings can cause sticker shock when you see your bill or review AWS Cost Explorer. It's important to set up CloudWatch efficiently to avoid unexpected costs. I'll take you through some key points to consider when configuring CloudWatch and provide an example of how to customize it.
Built-in metrics and custom metrics
AWS provides a wide range of built-in metrics for various services, including EC2 instances, EBS volumes, and RDS DB instances. These metrics give you valuable insights into the health and performance of your resources. Additionally, CloudWatch allows you to create custom metrics, enabling you to monitor specific aspects of your infrastructure and applications that are important to your business.
AWS also provides a CloudWatch agent, which offers additional custom metrics that can be configured. These metrics monitor different aspects of your EC2 instances.
Customizing CloudWatch agent metrics
The CloudWatch agent is AWS-provided software that collects metrics from your EC2 instances and on-premises servers. By default, it tracks a predefined set of metrics. However, you should customize the agent configuration to push only the metrics you actually need. This prevents unnecessary CloudWatch costs for unused metrics while still letting you track the metrics that are most important to you.
Configuration options
When setting up CloudWatch, you have various configuration options to choose from. It's crucial to carefully consider these options to avoid unnecessary costs. Even the most basic configurations can result in unexpected expenses that accumulate quickly. Take the time to evaluate which metrics are essential for your monitoring needs.
Disk space usage
One key metric that is often overlooked is disk space usage. Monitoring disk space is crucial for ensuring the availability and performance of your systems. But the default disk space usage configuration tracks all your partitions. This can result in a dozen or more metrics per server, which adds up quickly in terms of cost. You are likely interested in only one or two partitions, such as your main operating system partition and a separate partition for your application. By configuring CloudWatch to track only those partitions, you can avoid unnecessary costs. A well-optimized configuration might cost only a few dollars per month instead of hundreds.
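To make the arithmetic concrete, here is a rough sketch of how the number of disk metrics, and therefore the bill, scales with the number of partitions tracked. The price of $0.30 per custom metric per month is an assumption used only for illustration (check the CloudWatch pricing page for your Region and usage tier), and the fleet size, partition counts, and function name are hypothetical.

```python
# Rough cost sketch. ASSUMED_PRICE_PER_METRIC is an illustrative assumption,
# not a quoted AWS price; check current CloudWatch pricing for your Region.
ASSUMED_PRICE_PER_METRIC = 0.30  # USD per custom metric per month (assumption)


def estimate_disk_metric_cost(servers, partitions_per_server,
                              measurements_per_partition,
                              price_per_metric=ASSUMED_PRICE_PER_METRIC):
    """Return (metric_count, estimated_monthly_cost_usd).

    Each partition/measurement pair on each server is counted as one
    custom metric.
    """
    metric_count = servers * partitions_per_server * measurements_per_partition
    return metric_count, round(metric_count * price_per_metric, 2)


if __name__ == "__main__":
    # Hypothetical fleet of 50 servers: a wildcard config tracking 12
    # partitions with 2 measurements each, versus 2 partitions with 1.
    for label, partitions, measurements in [("wildcard", 12, 2), ("trimmed", 2, 1)]:
        count, cost = estimate_disk_metric_cost(50, partitions, measurements)
        print(f"{label}: {count} metrics, about ${cost:.2f} per month")
```

Under these assumptions, the wildcard setup produces 1,200 metrics and the trimmed setup produces 100, so the estimated bill differs by a factor of twelve. Your own numbers will depend on how many partitions and dimensions each server actually reports.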
How to configure your CloudWatch agent
Here is an example of how to configure the CloudWatch agent (in this example I am assuming you have the CloudWatch agent installed on an EC2 instance; see "Installing the CloudWatch agent").
The CloudWatch agent comes with predefined metric sets.
Run the CloudWatch agent configuration wizard to create the configuration file.
1. To create the CloudWatch agent configuration file, run the configuration wizard as follows.
sudo /opt/aws/amazon-cloudwatch-agent/bin/amazon-cloudwatch-agent-config-wizard
2. Answer the questions to customize the configuration file for your server.
3. The wizard saves the configuration file, config.json, in /opt/aws/amazon-cloudwatch-agent/bin/.
4. Whenever you update the configuration file, reload it so the agent picks up the changes.
sudo /opt/aws/amazon-cloudwatch-agent/bin/amazon-cloudwatch-agent-ctl -a fetch-config -m ec2 -c file:/opt/aws/amazon-cloudwatch-agent/bin/config.json -s
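Because the configuration file is edited by hand, it can help to confirm that it still parses as JSON before reloading the agent. The following is a minimal sketch of such a pre-flight check. The helper names `load_agent_config` and `build_reload_command` are hypothetical; the command it assembles is the same fetch-config command shown in step 4.

```python
import json
from pathlib import Path

# Path to the agent control script, taken from the command in step 4.
AGENT_CTL = "/opt/aws/amazon-cloudwatch-agent/bin/amazon-cloudwatch-agent-ctl"


def load_agent_config(path):
    """Parse the config file, raising ValueError if it is not usable JSON."""
    try:
        config = json.loads(Path(path).read_text())
    except json.JSONDecodeError as exc:
        raise ValueError(f"{path} is not valid JSON: {exc}") from exc
    if not isinstance(config, dict):
        raise ValueError(f"{path} must contain a JSON object at the top level")
    return config


def build_reload_command(path):
    """Return the fetch-config command from step 4 as an argument list."""
    return ["sudo", AGENT_CTL, "-a", "fetch-config", "-m", "ec2",
            "-c", f"file:{path}", "-s"]
```

A typical use would be to call `load_agent_config(path)` first and, only if it succeeds, pass `build_reload_command(path)` to `subprocess.run(..., check=True)`. This check only catches syntax problems; it does not validate the agent's own schema.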
After the configuration file is created, we should update it to include only the metrics we want to use.
If we don't update it, we will end up with a configuration file like the following one. This configuration file will result in an excessive number of metrics being sent to CloudWatch, leading to a substantial increase in costs.
{
  "agent": {
    "metrics_collection_interval": 60,
    "run_as_user": "cwagent"
  },
  "metrics": {
    "append_dimensions": {
      "AutoScalingGroupName": "${aws:AutoScalingGroupName}",
      "ImageId": "${aws:ImageId}",
      "InstanceId": "${aws:InstanceId}",
      "InstanceType": "${aws:InstanceType}"
    },
    "metrics_collected": {
      "cpu": {
        "measurement": [
          "cpu_usage_idle",
          "cpu_usage_iowait",
          "cpu_usage_user",
          "cpu_usage_system"
        ],
        "metrics_collection_interval": 60,
        "totalcpu": false
      },
      "disk": {
        "measurement": [
          "used_percent",
          "inodes_free"
        ],
        "metrics_collection_interval": 60,
        "resources": [
          "*"
        ]
      },
      "diskio": {
        "measurement": [
          "io_time"
        ],
        "metrics_collection_interval": 60,
        "resources": [
          "*"
        ]
      },
      "mem": {
        "measurement": [
          "mem_used_percent"
        ],
        "metrics_collection_interval": 60
      },
      "swap": {
        "measurement": [
          "swap_used_percent"
        ],
        "metrics_collection_interval": 60
      }
    }
  }
}
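One quick way to spot where the extra metrics come from is to look for sections whose "resources" list contains the "*" wildcard, as the disk and diskio sections do in the file above. The following is a small audit sketch; the function name `find_wildcard_sections` and the inline sample are illustrative, and in practice you would load your own config.json with `json.load`.

```python
def find_wildcard_sections(config):
    """Return the names of metrics_collected sections that use the "*" wildcard."""
    collected = config.get("metrics", {}).get("metrics_collected", {})
    return sorted(
        name for name, section in collected.items()
        if "*" in section.get("resources", [])
    )


if __name__ == "__main__":
    # Abbreviated copy of the wizard-generated configuration, for illustration.
    sample = {
        "metrics": {
            "metrics_collected": {
                "disk": {"measurement": ["used_percent", "inodes_free"],
                         "resources": ["*"]},
                "diskio": {"measurement": ["io_time"], "resources": ["*"]},
                "mem": {"measurement": ["mem_used_percent"]},
            }
        }
    }
    for name in find_wildcard_sections(sample):
        print(f"'{name}' collects metrics for every resource on the host")
```

Any section this reports is a candidate for replacing "*" with an explicit list of the resources you care about.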
Here is a basic configuration file in which the Amazon CloudWatch agent pushes only the used disk space percentage for two partitions. For details on the available settings, see the AWS documentation on manually creating or editing the CloudWatch agent configuration file.
{
  "agent": {
    "metrics_collection_interval": 60,
    "run_as_user": "cwagent"
  },
  "metrics": {
    "aggregation_dimensions": [
      [
        "InstanceId"
      ]
    ],
    "append_dimensions": {
      "AutoScalingGroupName": "${aws:AutoScalingGroupName}",
      "ImageId": "${aws:ImageId}",
      "InstanceId": "${aws:InstanceId}",
      "InstanceType": "${aws:InstanceType}"
    },
    "metrics_collected": {
      "disk": {
        "measurement": [
          "used_percent"
        ],
        "metrics_collection_interval": 60,
        "resources": [
          "/",
          "/Data"
        ]
      }
    }
  }
}
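If you manage many servers, editing each file by hand gets tedious. As a rough sketch, the same trimming can be scripted: the hypothetical helper `trim_to_disk_usage` keeps the "agent" and dimension settings as they are and replaces "metrics_collected" with a single disk section limited to the mount points you name. It is one possible approach rather than an AWS-provided tool, and it does not add settings such as "aggregation_dimensions" for you.

```python
import copy
import json


def trim_to_disk_usage(config, mount_points):
    """Return a copy of config that collects only disk used_percent
    for the given mount points. The input config is left unchanged."""
    trimmed = copy.deepcopy(config)
    metrics = trimmed.setdefault("metrics", {})
    old_disk = metrics.get("metrics_collected", {}).get("disk", {})

    new_disk = {
        "measurement": ["used_percent"],
        "resources": list(mount_points),
    }
    # Preserve the existing collection interval if one was set.
    if "metrics_collection_interval" in old_disk:
        new_disk["metrics_collection_interval"] = old_disk["metrics_collection_interval"]

    # Drop every other section (cpu, diskio, mem, swap, ...).
    metrics["metrics_collected"] = {"disk": new_disk}
    return trimmed


if __name__ == "__main__":
    # Abbreviated wizard-style configuration, for illustration only.
    wizard_config = {
        "agent": {"metrics_collection_interval": 60, "run_as_user": "cwagent"},
        "metrics": {
            "metrics_collected": {
                "disk": {"measurement": ["used_percent", "inodes_free"],
                         "metrics_collection_interval": 60,
                         "resources": ["*"]},
                "mem": {"measurement": ["mem_used_percent"]},
            }
        },
    }
    print(json.dumps(trim_to_disk_usage(wizard_config, ["/", "/Data"]), indent=2))
```

The mount points "/" and "/Data" mirror the example configuration; substitute the partitions that matter on your own servers, then reload the agent so it picks up the new file.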
Start saving money by managing your CloudWatch metrics
Setting up CloudWatch metrics efficiently is crucial for optimizing costs and monitoring the metrics that matter most to your applications. By leveraging basic and detailed monitoring, and customizing the CloudWatch agent configuration, you can ensure that you are collecting the right metrics while avoiding unnecessary expenses.