An ECS service with an ALB target group, suitable for routing to from an ALB.
This repo consolidates the following repos:
- https://github.com/mergermarket/terraform-acuris-ecs-service
- https://github.com/mergermarket/terraform-acuris-load-balanced-ecs-service-no-target-group
- https://github.com/mergermarket/terraform-acuris-task-definition-with-task-role
- https://github.com/mergermarket/terraform-acuris-ecs-container-definition
The "ecs-service-no-target-group" module used a series of "if-then" statements to determine which "type" of ECS service to create. Because those conditions required some values to be known before the module ran, it was impossible to create the ECS service and the corresponding target group at the same time. To get around that, this module uses a new variable named service_type that can be one of the following values:
- service
- service_multiple_load_balancers
- service_no_load_balancer
- service_for_awsvpc_no_loadbalancer
I've only ever used the first of these, so I'm unsure what the others are for, but they are included for completeness.
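As a rough sketch of how service_type fits into a module call, the block below creates a target group and the service in the same configuration. The module source path, the resource names, `var.vpc_id`, and all values are placeholders assumed for illustration; only the input names come from this README.

```hcl
# Hypothetical target group created alongside the service.
resource "aws_lb_target_group" "example" {
  name     = "example-service"
  port     = 8080
  protocol = "HTTP"
  vpc_id   = var.vpc_id # assumed to be declared by the calling module
}

module "ecs_service" {
  source = "../path/to/this/module" # placeholder, substitute the real source

  # "service" is the only type the author reports having used.
  service_type     = "service"
  target_group_arn = aws_lb_target_group.example.arn

  # Required inputs, with made-up values.
  env     = "dev"
  release = { component = "example-service" } # other keys may be needed
  cpu     = "256"
  memory  = "512"
  port    = "8080"
}
```

Because the service type is stated explicitly, the target group ARN does not need to be known before the module is evaluated.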
This repo now allows logging through Firelens/Fluent-bit into Datadog.
You can now override the log_configuration variable and pass an optional firelens_configuration variable that configures the sidecar and the Fluent-bit process. The Firehose delivery stream must already have been set up outside of this module.
log_configuration = {
  logDriver = "awsfirelens"
  options = {
    Name            = "firehose"
    region          = module.platform_config.config["region"]
    delivery_stream = "DatadogFirehoseStream"
  }
}
To enable the firelens sidecar, you MUST provide a firelens_configuration variable. If you do not provide that variable, the logs will flow to CloudWatch and Datadog as usual AS LONG AS you don't override the log_configuration. If you override the log_configuration in the above fashion but do not provide a firelens_configuration, your services will break.
The sidecar is named log_router_${var.release["component"]}${var.name_suffix}
and is sourced from public.ecr.aws/aws-observability/aws-for-fluent-bit:stable
The sidecar gets its firelens configuration directly from the variable. You could specify something other than "fluentbit" for the type, but this module won't understand what to do with it and you'll likely end up with a broken service. The options shown below are the only ones available, and you will probably want them because the default Fluent-bit config doesn't do much. This module does not create the S3 object; the calling module should do that.
The S3 object you pass for the config-file-value should be a valid Fluent-bit configuration snippet that will be imported into the Fluent-bit configuration.
The S3 bucket that object is sourced from should have a name starting with 'firelens' so that the permissions are applied properly to the ECS role.
firelens_configuration = {
  type = "fluentbit"
  options = {
    enable-ecs-log-metadata = "true"
    config-file-type        = "s3"
    config-file-value       = aws_s3_object.fluentbit_config.arn
  }
}
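Since the two variables have to be supplied together, a module call might look like the sketch below. The module source and the required input values are placeholders; `module.platform_config` and `aws_s3_object.fluentbit_config` are assumed to be defined by the calling module, as in the snippets above.

```hcl
module "ecs_service" {
  source = "../path/to/this/module" # placeholder, substitute the real source

  # Required inputs, with made-up values.
  env     = "dev"
  release = { component = "example-service" } # other keys may be needed
  cpu     = "256"
  memory  = "512"
  port    = "8080"

  # Route container logs through the firelens sidecar to Firehose.
  log_configuration = {
    logDriver = "awsfirelens"
    options = {
      Name            = "firehose"
      region          = module.platform_config.config["region"]
      delivery_stream = "DatadogFirehoseStream"
    }
  }

  # Without this block the sidecar is not created and the
  # log_configuration override above would break the service.
  firelens_configuration = {
    type = "fluentbit"
    options = {
      enable-ecs-log-metadata = "true"
      config-file-type        = "s3"
      config-file-value       = aws_s3_object.fluentbit_config.arn
    }
  }
}
```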
The default Fluent-bit config looks like this:
[INPUT]
    Name forward
    Mem_Buf_Limit 25MB
    unix_path /var/run/fluent.sock
[INPUT]
    Name forward
    Listen 0.0.0.0
    Port 24224
[INPUT]
    Name tcp
    Tag firelens-healthcheck
    Listen 127.0.0.1
    Port 8877
[FILTER]
    Name record_modifier
    Match *
    Record ec2_instance_id i-0d1a7bebd0e42bc04
    Record ecs_cluster or1-test
    Record ecs_task_arn arn:aws:ecs:us-west-2:254076036999:task/or1-test/a87638ce0fa0408ba98d11d70dbc66b8
    Record ecs_task_definition or1-test-cdflow-log-testing:37
[OUTPUT]
    Name null
    Match firelens-healthcheck
[OUTPUT]
    Name firehose
    Match cdflow-log-testing-firelens*
    delivery_stream DatadogFirehoseStream
    region us-west-2
These are all either defaults or items set up by the ECS task definition.
When you use an external configuration file, this gets added to the config:
@INCLUDE /fluent-bit/etc/external.conf
The contents of that file can be defined with a simple HEREDOC variable such as:
locals {
  fluentbit_config = <<-EOF
    [FILTER]
        name multiline
        match *
        multiline.key_content log
        multiline.parser go
  EOF
}
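Because this module does not create the S3 object, the calling module has to upload the snippet itself. A minimal sketch follows, assuming the `local.fluentbit_config` value defined above; the bucket name and object key are hypothetical, and the bucket name only needs to start with "firelens".

```hcl
# Bucket name is a placeholder; the "firelens" prefix is what matters
# so that the module's ECS role permissions apply to it.
resource "aws_s3_bucket" "firelens_config" {
  bucket = "firelens-example-config"
}

# Upload the heredoc snippet as the external Fluent-bit config file.
resource "aws_s3_object" "fluentbit_config" {
  bucket  = aws_s3_bucket.firelens_config.id
  key     = "fluentbit/external.conf" # hypothetical key
  content = local.fluentbit_config
}
```

The object's ARN can then be passed as config-file-value in firelens_configuration.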
Filtering out health checks is useful for services behind a load balancer. The load balancer periodically pings the service, and each ping generates a web access log entry. If you are logging those entries, the volume adds up quickly (about one per second).
You can prevent them from leaving the sidecar with this config:
[FILTER]
    Name grep
    Match *
    Exclude log ELB-HealthChecker/2.0
This tells Fluent-bit to use the grep filter (https://docs.fluentbit.io/manual/data-pipeline/filters/grep) and evaluate every entry that comes through. If the "log" field contains "ELB-HealthChecker/2.0", the entry is silently discarded.
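If you want both the health-check filter and the multiline filter, one option is to put them in the same heredoc. This is only a sketch: the local name matches the earlier example, and the order of the two filters is an assumption rather than something this module prescribes.

```hcl
locals {
  # Both filters end up in the external config file that the sidecar includes.
  fluentbit_config = <<-EOF
    [FILTER]
        Name grep
        Match *
        Exclude log ELB-HealthChecker/2.0
    [FILTER]
        name multiline
        match *
        multiline.key_content log
        multiline.parser go
  EOF
}
```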
You can find more about configuring Fluent-bit here: https://docs.fluentbit.io/manual/administration/configuring-fluent-bit/classic-mode/configuration-file
In order to standardize our use of these logging services, I've created the following repo/module: https://github.com/ION-Analytics/terraform-iona-log-config You're welcome to use this, but it may be tailored too specifically to Backstop's needs.
Name | Version |
---|---|
terraform | >= 1.5.7 |
Name | Version |
---|---|
aws | n/a |
Name | Source | Version |
---|---|---|
ecs_update_monitor | mergermarket/ecs-update-monitor/acuris | 2.3.5 |
Name | Description | Type | Default | Required |
---|---|---|---|---|
add_datadog_feed | Flag to control adding subscription filter to CW loggroup | bool | true | no |
allow_overnight_scaledown | Allow service to be scaled down | bool | true | no |
application_environment | Environment specific parameters passed to the container | map(string) | {} | no |
application_secrets | A list of application specific secret names that can be found in aws secrets manager | list(string) | [] | no |
assume_role_policy | A valid IAM policy for assuming roles - optional | string | "" | no |
common_application_environment | Environment parameters passed to the container for all environments | map(string) | {} | no |
container_labels | Additional docker labels to apply to the container. | map(string) | {} | no |
container_mountpoint | Map containing 'sourceVolume', 'containerPath' and 'readOnly' (optional) to map a volume into a container. | map(string) | {} | no |
container_port_mappings | JSON document containing an array of port mappings for the container definition - if set, port is ignored (optional). | string | "" | no |
cpu | CPU unit reservation for the container | string | n/a | yes |
deployment_maximum_percent | The maximumPercent parameter represents an upper limit on the number of your service's tasks that are allowed in the RUNNING or PENDING state during a deployment, as a percentage of the desiredCount (rounded down to the nearest integer). | string | "200" | no |
deployment_minimum_healthy_percent | The minimumHealthyPercent represents a lower limit on the number of your service's tasks that must remain in the RUNNING state during a deployment, as a percentage of the desiredCount (rounded up to the nearest integer). | string | "100" | no |
deployment_timeout | Timeout to wait for the deployment to be finished [seconds]. | number | 600 | no |
desired_count | The number of instances of the task definition to place and keep running. | string | "3" | no |
ecs_cluster | The ECS cluster | string | "default" | no |
env | Environment name | any | n/a | yes |
extra_hosts | List of objects containing 'hostname' and 'ipAddress' used to add extra /etc/hosts to the container. | list(object({'hostname': string, 'ipAddress': string})) | [] | no |
health_check_grace_period_seconds | Seconds to ignore failing load balancer health checks on newly instantiated tasks to prevent premature shutdown, up to 2147483647. Default 0. | string | "0" | no |
image_id | ECR image_id for the ecs container | string | "" | no |
is_test | For testing only. Stops the call to AWS for sts | bool | false | no |
log_subscription_arn | To enable logging to a kinesis stream | string | "" | no |
memory | The memory reservation for the container in megabytes | string | n/a | yes |
multiple_target_group_arns | Multiple target group ARNs to allow connection to multiple load balancers | list(any) | [] | no |
name_suffix | Set a suffix that will be applied to the name in order that a component can have multiple services per environment | string | "" | no |
network_configuration_security_groups | needed for network_mode awsvpc | list(any) | [] | no |
network_configuration_subnets | needed for network_mode awsvpc | list(any) | [] | no |
network_mode | The Docker networking mode to use for the containers in the task | string | "bridge" | no |
nofile_soft_ulimit | The soft ulimit for the number of files in container | string | "4096" | no |
overnight_scaledown_end_hour | When to bring service back to full strength (Hour in UTC) | string | "06" | no |
overnight_scaledown_min_count | Minimum task count overnight | string | "0" | no |
overnight_scaledown_start_hour | From when a service can be scaled down (Hour in UTC) | string | "22" | no |
pack_and_distinct | Enable distinct instance and task binpacking for better cluster utilisation. Enter 'true' for clusters with auto scaling groups. Enter 'false' for clusters with no ASG and instance counts less than or equal to desired tasks | string | "false" | no |
platform_config | Platform configuration | map(string) | {} | no |
platform_secrets | A list of common secret names for "the platform" that can be found in secrets manager | list(string) | [] | no |
custom_secrets | A list of secret names that can be referenced by multiple services | list(string) | [] | no |
port | The port that container will be running on | string | n/a | yes |
privileged | Gives the container privileged access to the host | bool | false | no |
release | Metadata about the release | map(string) | n/a | yes |
scaling_metrics | A list of maps defining the scaling of the service's tasks - for more info see below | list(any) | [] | no |
secrets | Secret credentials fetched using credstash | map(string) | {} | no |
stop_timeout | The duration in seconds to wait before the container is forcefully killed. Default 30s, max 120s. | string | "none" | no |
target_group_arn | The ALB target group for the service. | string | "" | no |
task_role_policy | IAM policy document to apply to the tasks via a task role | string | "{\n \"Version\": \"2012-10-17\",\n \"Statement\": [\n {\n \"Action\": \"sts:GetCallerIdentity\",\n \"Effect\": \"Allow\",\n \"Resource\": \"*\"\n }\n ]\n}\n" | no |
taskdef_volume | Map containing 'name' and 'host_path' used to add a volume mapping to the taskdef. | map(string) | {} | no |
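To illustrate how a few of the optional inputs combine, the sketch below overrides the overnight scaledown window and passes environment variables to the container. The module source and every value are placeholders chosen for illustration; only the input names and their string or bool types are taken from the table.

```hcl
module "ecs_service" {
  source = "../path/to/this/module" # placeholder, substitute the real source

  # Required inputs, with made-up values.
  env     = "dev"
  release = { component = "example-service" } # other keys may be needed
  cpu     = "256"
  memory  = "512"
  port    = "8080"

  # Optional overrides.
  desired_count           = "2"
  application_environment = { LOG_LEVEL = "debug" }

  # Scale down to one task between 20:00 and 07:00 UTC.
  allow_overnight_scaledown      = true
  overnight_scaledown_start_hour = "20"
  overnight_scaledown_end_hour   = "07"
  overnight_scaledown_min_count  = "1"
}
```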
Name | Description |
---|---|
full_service_name | n/a |
stderr_name | n/a |
stdout_name | n/a |
task_role_arn | n/a |
task_role_name | n/a |
taskdef_arn | n/a |
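The outputs can be consumed from the calling module in the usual way. As a small sketch, assuming the module was instantiated as `module.ecs_service` (a hypothetical name), an extra managed policy can be attached to the task role and the task definition ARN re-exported:

```hcl
# Attach an AWS managed policy to the task role created by the module.
resource "aws_iam_role_policy_attachment" "s3_read" {
  role       = module.ecs_service.task_role_name
  policy_arn = "arn:aws:iam::aws:policy/AmazonS3ReadOnlyAccess"
}

# Re-export the task definition ARN for other configurations to use.
output "service_taskdef_arn" {
  value = module.ecs_service.taskdef_arn
}
```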
Set the scaling_metrics variable to a list of maps. Each map defines a separate scaling policy.
Param | Description |
---|---|
name | (Required) Must be unique |
metric | (Required) Name of the metric to use for scaling - see below for allowed values |
target_value | (Required) Value of the above metric that scaling will maintain |
disable_scale_in | (Optional) Whether scale in by the target tracking policy is disabled. If the value is true, scale in is disabled and the target tracking policy won't remove capacity from the scalable resource. |
scale_in_cooldown | (Optional) Amount of time, in seconds, after a scale in activity completes before another scale in activity can start |
scale_out_cooldown | (Optional) Amount of time, in seconds, after a scale out activity completes before another scale out activity can start. |
Allowed values for metric:
- ECSServiceAverageCPUUtilization
- ECSServiceAverageMemoryUtilization
- ALBRequestCountPerTarget
scaling_metrics = [
  {
    name               = "cpu"
    metric             = "ECSServiceAverageCPUUtilization"
    target_value       = 10
    disable_scale_in   = false
    scale_in_cooldown  = 180
    scale_out_cooldown = 90
  },
  {
    name               = "memory"
    metric             = "ECSServiceAverageMemoryUtilization"
    target_value       = 10
    disable_scale_in   = false
    scale_in_cooldown  = 180
    scale_out_cooldown = 90
  }
]
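The third allowed metric follows the same shape. The values below are illustrative only, and whether the module needs any additional inputs (such as a target group) for this metric is not stated in this README, so treat it as an untested sketch.

```hcl
# Goes inside the module block, like the example above.
scaling_metrics = [
  {
    name               = "requests"
    metric             = "ALBRequestCountPerTarget"
    target_value       = 500 # requests per target, made-up value
    disable_scale_in   = false
    scale_in_cooldown  = 300
    scale_out_cooldown = 60
  }
]
```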