We love metrics but hate manual processes. When we adopted Datadog’s built-in AWS integration, we couldn’t wait to get AWS CloudWatch metrics into Datadog, but first we needed to automate the numerous manual steps required to set it up. The integration is quite powerful: once enabled, it automatically synchronizes the specified CloudWatch metrics into a Datadog account. Basically, anything available within CloudWatch can easily be made available in Datadog, alongside all of our other metrics and dashboards.
Despite the integration’s power and convenience, its setup process is actually quite involved. As outlined in Datadog’s documentation, there are 18 manual steps required, including:
- finding the right AWS account ID
- creating the right IAM policy
- copying and pasting the right AWS resource ID into the Datadog UI
If you have more than a few AWS accounts like we do, you may prefer to automate this! In our case, that means using Terraform.
In this blog post, we would like to share how Scribd uses Terraform to automate our Datadog and AWS integration across the organization.
Enable Datadog’s built-in AWS integration
To address this problem, we built the terraform-aws-datadog module. With only a couple of lines of HCL code, Terraform will perform all the steps necessary to set up the Datadog integration for a specific AWS account, following Scribd’s best practices:
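# Look up the ID of the AWS account Terraform is currently authenticated against.
data "aws_caller_identity" "current" {}

module "datadog" {
  source          = "git::https://github.com/scribd/terraform-aws-datadog.git?ref=master"
  aws_account_id  = data.aws_caller_identity.current.account_id
  datadog_api_key = var.datadog_api_key
  env             = "prod"
  namespace       = "team_foo"
}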
From an AWS account maintainer’s point of view, using the module is a convenient way to inherit centralized best practices. For module maintainers, any change to the Datadog integration module can be released using the standard Terraform module release process.
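To give a sense of what gets automated, here is a minimal sketch of the core of the manual setup expressed in Terraform. It is not the module's actual source: the resource names, the role name, the datadog_app_key variable, and the abbreviated permission list are all assumptions made for illustration, and datadog_integration_aws is an older Datadog provider resource, so check the provider documentation for the current equivalent.

```hcl
terraform {
  required_providers {
    aws     = { source = "hashicorp/aws" }
    datadog = { source = "DataDog/datadog" }
  }
}

variable "datadog_api_key" {
  type      = string
  sensitive = true
}

# Assumption: the Datadog provider also needs an application key.
variable "datadog_app_key" {
  type      = string
  sensitive = true
}

provider "datadog" {
  api_key = var.datadog_api_key
  app_key = var.datadog_app_key
}

data "aws_caller_identity" "current" {}

locals {
  role_name = "DatadogAWSIntegrationRole" # illustrative name
}

# Register the AWS account with Datadog; Datadog hands back an external ID.
resource "datadog_integration_aws" "this" {
  account_id = data.aws_caller_identity.current.account_id
  role_name  = local.role_name
}

# Trust policy: let Datadog's AWS account assume the role, but only when it
# presents the external ID generated above.
data "aws_iam_policy_document" "assume" {
  statement {
    actions = ["sts:AssumeRole"]
    principals {
      type = "AWS"
      # Datadog's AWS account as listed in its setup docs; verify for your Datadog site.
      identifiers = ["arn:aws:iam::464622532012:root"]
    }
    condition {
      test     = "StringEquals"
      variable = "sts:ExternalId"
      values   = [datadog_integration_aws.this.external_id]
    }
  }
}

resource "aws_iam_role" "datadog" {
  name               = local.role_name
  assume_role_policy = data.aws_iam_policy_document.assume.json
}

# Abbreviated read-only permissions; the full list in Datadog's docs is much longer.
data "aws_iam_policy_document" "read" {
  statement {
    actions   = ["cloudwatch:Get*", "cloudwatch:List*", "tag:GetResources"]
    resources = ["*"]
  }
}

resource "aws_iam_role_policy" "datadog" {
  name   = "datadog-integration"
  role   = aws_iam_role.datadog.id
  policy = data.aws_iam_policy_document.read.json
}
```

Letting Terraform thread the external ID from Datadog into the IAM trust policy is what removes the copy-and-paste steps between the two consoles.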
CloudWatch log synchronization
Initially, the module only set up the base integration. As adoption increased, various teams added more features to the module. One of these features is automated setup of log ingestion for CloudWatch.
Like setting up the official AWS integration app, the instructions for log synchronization are a bit overwhelming.
However, using the terraform-aws-datadog module, we can enable the feature with a single parameter:
module "datadog" {
  source                = "git::https://github.com/scribd/terraform-aws-datadog.git?ref=master"
  datadog_api_key       = var.datadog_api_key
  env                   = "prod"
  namespace             = "project_foo"
  cloudwatch_log_groups = ["cloudwatch_log_group_1", "cloudwatch_log_group_2"]
}
That’s it! Terraform will automatically create the Datadog serverless function and the triggers for the specified log groups, forwarding all of their CloudWatch logs into Datadog. After running terraform apply, you should be able to see logs showing up in Datadog within minutes.
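For the curious, the triggers boil down to one CloudWatch Logs subscription filter per log group, plus permission for CloudWatch Logs to invoke the forwarding Lambda. The sketch below is an assumption about the general shape, not the module's real implementation: the forwarder_arn variable, the resource names, and the filter name are hypothetical, and the Lambda itself is taken to exist already.

```hcl
# Hypothetical input: ARN of an already-deployed log-forwarding Lambda.
variable "forwarder_arn" {
  type = string
}

variable "cloudwatch_log_groups" {
  type    = list(string)
  default = ["cloudwatch_log_group_1", "cloudwatch_log_group_2"]
}

# Allow CloudWatch Logs to invoke the forwarder, one statement per log group.
resource "aws_lambda_permission" "logs" {
  for_each = toset(var.cloudwatch_log_groups)

  # Statement IDs only allow letters, digits, hyphens and underscores,
  # so other characters in the log group name are replaced.
  statement_id  = "allow-${replace(each.value, "/[^a-zA-Z0-9_-]/", "-")}"
  action        = "lambda:InvokeFunction"
  function_name = var.forwarder_arn
  principal     = "logs.amazonaws.com"
}

# Stream every event from each log group to the forwarder.
resource "aws_cloudwatch_log_subscription_filter" "datadog" {
  for_each = toset(var.cloudwatch_log_groups)

  name            = "datadog-forwarder" # illustrative name
  log_group_name  = each.value
  filter_pattern  = "" # an empty pattern matches all log events
  destination_arn = var.forwarder_arn

  # The invoke permission must exist before the filter can be created.
  depends_on = [aws_lambda_permission.logs]
}
```

Using for_each over the list means adding a log group later only creates the two new resources for that group and leaves the existing subscriptions untouched.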
Future work
With both metrics and logs synchronized into Datadog, we are able to leverage Datadog as the central hub for all things monitoring. We are planning to bring more features to the module as we migrate Scribd’s infrastructure into AWS.
Metrics ingested through the official AWS integration are delayed by a couple of minutes, which makes them a poor signal for monitoring critical systems. There is an opportunity to enable real-time metrics synchronization by automating Datadog agent setup.
The datadog-serverless-functions repo contains two other Lambda-based AWS augmentations that we may add as features of the module: vpc_flow_log_monitoring and rds_enhanced_monitoring.
Stay apprised of future releases by watching our release page.
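If you would rather adopt new releases deliberately than track master, one option is to pin the module source to a tag. The tag below is a hypothetical placeholder, not a real release name, so substitute one from the release page:

```hcl
module "datadog" {
  # "v1.0.0" is a placeholder tag used for illustration only.
  source          = "git::https://github.com/scribd/terraform-aws-datadog.git?ref=v1.0.0"
  datadog_api_key = var.datadog_api_key
  env             = "prod"
  namespace       = "team_foo"
}
```

After changing the ref, re-run terraform init so Terraform fetches the pinned version of the module.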
Special shout out to Taylor McClure and Hamilton Hord for starting the project, as well as Sai Kiran Burle, Kamran Farhadi and Eugene Pimenov for improvements and bug fixes.