# AWS-costs.md

AWS Costs
### Trusted Advisor
Use the [Trusted Advisor](https://console.aws.amazon.com/trustedadvisor/home?#/dashboard) to identify instances that you can potentially downgrade to a smaller instance size or terminate. Trusted Advisor is a native AWS resource available to you when your account has Enterprise support. It gives recommendations for cost-saving opportunities, as well as availability, security, and fault-tolerance recommendations. Even simple tunings, such as to CPU usage and provisioned IOPS, can add up to significant savings.
On the Trusted Advisor dashboard, click on **Low Utilization Amazon EC2 Instances** and sort the low-utilization instances table by the highest **Estimated Monthly Savings**.
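The same checks are also exposed through the AWS Support API. As a sketch (the category value and query shape are assumptions worth verifying):

```shell
# List Trusted Advisor cost-optimisation checks (requires Business or
# Enterprise support; the Support API only exists in us-east-1).
aws support describe-trusted-advisor-checks \
  --language en \
  --region us-east-1 \
  --query "checks[?category=='cost_optimizing'].[id,name]" \
  --output table
```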
### Billing & Cost Management
You can use [Bills](https://console.aws.amazon.com/billing/home?region=eu-west-1#/bill) and [Cost Explorer](https://console.aws.amazon.com/billing/home?region=eu-west-1#/bill) to understand the breakdown of your AWS usage and possibly identify services you didn’t know you were using.
### Unattached Volumes
Volumes that are available but not in use cost the same as attached volumes. You can find them in the [EC2 console](https://eu-west-1.console.aws.amazon.com/ec2/v2/home?region=eu-west-1#Volumes:state=available;sort=size) under the Volumes section by filtering by state (available).
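The same filter is available from the CLI, for example:

```shell
# List volumes in the "available" state (i.e. not attached to any instance)
aws ec2 describe-volumes \
  --filters Name=status,Values=available \
  --query 'Volumes[].[VolumeId,Size,CreateTime]' \
  --output table
```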
### Unused AMIs
Unused AMIs cost money. You can clean them up using the [AMI cleanup tool](https://github.com/guardian/deploy-tools-platform/tree/master/cleanup).
### Unattached EIPs
Unattached Elastic IP addresses cost money. You can find them using Trusted Advisor, or by looking at your bills, as they are free while attached (in use).
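A quick way to list them from the CLI (the JMESPath query is a sketch):

```shell
# Elastic IPs with no association are unattached and therefore billed
aws ec2 describe-addresses \
  --query 'Addresses[?AssociationId==`null`].[PublicIp,AllocationId]' \
  --output table
```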
### DynamoDB
You should frequently review the reserved capacity of all your DynamoDB tables to make sure it's not over-committed.
The easiest way to do this is to select the Metrics tab, check the provisioned vs. consumed write and read capacity graphs, and then use the Capacity tab to adjust the provisioned capacity accordingly.
Make sure the table capacity can handle traffic spikes. Use the time range on the graphs to see the past weeks' usage.
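The same comparison can be made from the CLI (table name and time range are illustrative):

```shell
# Provisioned capacity for a table
aws dynamodb describe-table --table-name my-table \
  --query 'Table.ProvisionedThroughput.[ReadCapacityUnits,WriteCapacityUnits]'

# Consumed read capacity over a past week, summed per hour
aws cloudwatch get-metric-statistics \
  --namespace AWS/DynamoDB \
  --metric-name ConsumedReadCapacityUnits \
  --dimensions Name=TableName,Value=my-table \
  --start-time 2024-01-01T00:00:00Z \
  --end-time 2024-01-08T00:00:00Z \
  --period 3600 \
  --statistics Sum
```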
[…]

Lower storage price, higher access price. Interesting for backups, for instance.
Lower storage price, reduced redundancy. Interesting for reproducible data or non-critical data such as logs.
* Glacier
[…]

Another useful feature to manage your buckets is the possibility to set lifecycle rules.
S3’s multipart upload feature accelerates the uploading of large objects by allowing you to split them up into logical parts that can be uploaded in parallel. However, if you initiate a multipart upload but never finish it, the in-progress upload occupies some storage space and will incur storage charges.
These uploads are not visible when you list the contents of a bucket through the console or the standard API; you have to use a separate command to see them.
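For example, with the AWS CLI (bucket and key names are illustrative):

```shell
# List in-progress multipart uploads; these never appear in a normal listing
aws s3api list-multipart-uploads --bucket my-example-bucket

# Abort one of them by key and upload id, freeing the stored parts
aws s3api abort-multipart-upload \
  --bucket my-example-bucket \
  --key path/to/large-object \
  --upload-id "<UploadId from the listing above>"
```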
There are two ways to solve this now and prevent it from happening in the future:
* a [script](https://gist.github.com/mchv/9dccbd9245287b26e34ab78bad43ea6c) that can list them with size and potentially delete existing ones (based on the [AWS API](http://docs.aws.amazon.com/cli/latest/reference/s3api/list-parts.html?highlight=list%20parts))
* [Add a lifecycle rule](https://aws.amazon.com/blogs/aws/s3-lifecycle-management-update-support-for-multipart-uploads-and-delete-markers/) to each bucket to automatically delete incomplete multipart uploads after a few days ([official AWS doc](http://docs.aws.amazon.com/AmazonS3/latest/dev/mpuoverview.html#mpu-abort-incomplete-mpu-lifecycle-config))
An example of how to cloud-form the lifecycle rule:
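A minimal sketch of such a rule (the bucket's logical name and the retention period are illustrative):

```yaml
Bucket:
  Type: AWS::S3::Bucket
  Properties:
    LifecycleConfiguration:
      Rules:
        # Abort and clean up multipart uploads left incomplete for 7 days
        - Id: DeleteIncompleteMultipartUploads
          Status: Enabled
          AbortIncompleteMultipartUpload:
            DaysAfterInitiation: 7
```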
[…]

You can see savings of over `50%` on reserved instances vs. on-demand instances.
[More info on reserving instances](https://aws.amazon.com/ec2/purchasing-options/reserved-instances/getting-started/).
Reservations are tied to a particular AWS region and a particular instance type.
Therefore, after making a reservation you are committing to run that particular region/instance-type combination until the reservation period finishes, or you will wipe out all the financial benefits.
# AWS-lambda-metrics.md

Metrics for Lambdas
* AWS Embedded Metrics are an ideal solution for generating metrics for Lambda functions that will track historical data.
* They are a method for capturing Cloudwatch metrics as part of a logging request.
* This is good because it avoids the financial and performance cost of making a putMetricData() request.
* It also makes it easier to find the point at which the metric is updated in both the logs and in the code itself.
* This does not work at all for our EC2 apps as their logs do not pass through Cloudwatch.
* [This pull request](https://github.com/guardian/mobile-n10n/pull/696) gives a working example of how to embed metrics in your logging request.
* [This document](https://docs.google.com/document/d/1cL_t5NhO8J9Bwiu4rghoGh8i_um_sXDyKuq4COhdLEc/edit?usp=sharing) gives a good summary of why AWS embedded metrics are so useful.
* Full details can be found in the [AWS Documentation](https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/CloudWatch_Embedded_Metric_Format_Specification.html), but here are the highlights:
* To use AWS Embedded metrics, logs must be in JSON format.
* A metric is embedded in a JSON logging request by adding a root node named “_aws” to the start of the log request.
* The metric details are defined within this "_aws" node.
* The following code snippet shows a logging request updating a single metric:
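For reference, a log line in the embedded metric format looks roughly like this per the EMF specification (the namespace, dimension and metric names are illustrative):

```json
{
  "_aws": {
    "Timestamp": 1680000000000,
    "CloudWatchMetrics": [
      {
        "Namespace": "my-application",
        "Dimensions": [["Stage"]],
        "Metrics": [{ "Name": "MessagesProcessed", "Unit": "Count" }]
      }
    ]
  },
  "Stage": "PROD",
  "MessagesProcessed": 1
}
```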
# AWS.md

VPC
* To follow best practice for VPCs, ensure you have a single CDK-generated VPC in your account that is used to house your applications. You can find the docs for it [here](https://github.com/guardian/cdk/blob/main/src/constructs/vpc/vpc.ts#L32-L59).
* While generally discouraged, in some exceptional cases, such as security-sensitive services, you may want to use the construct to generate further VPCs in order to isolate specific applications. It is worth discussing with DevX Security and InfoSec if you think you have a service that requires this.
* Avoid using the default VPC - The default VPC is designed to get you up and running quickly, but with many negative tradeoffs:
- It lacks the proper security and auditing controls.
- Network Access Control Lists (NACLs) are unrestricted.
  - The default VPC does not enable flow logs. Flow logs allow users to track network flows in the VPC for auditing and troubleshooting purposes.
- No tagging
  - The default VPC enables the assignment of public addresses in public subnets by default. This is a security issue, as a small mistake in setup could then allow the instance to be reachable from the Internet.
* The account should be allocated a block of our IP address space to support peering. Often you may not know you need peering up front, so it is better to plan for it regardless. See [here](https://docs.aws.amazon.com/vpc/latest/peering/vpc-peering-basics.html) for more info on AWS peering rules.
* If it is likely that AWS resources will need to communicate with our on-prem infrastructure, then contact the networking team to request a CIDR allocation for the VPC.
* Ensure you have added the correct [Gateway Endpoints](https://docs.aws.amazon.com/vpc/latest/privatelink/vpce-gateway.html) for the AWS services being accessed from your private subnets to avoid incurring unnecessary networking costs.
* Security of the VPC and security groups must be considered. See [here](https://github.com/guardian/security-recommendations/blob/main/recommendations/aws.md#vpc--security-groups) for details.
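As a sketch of the Gateway Endpoint point above, an S3 gateway endpoint can be cloud-formed like this (logical names are illustrative):

```yaml
S3GatewayEndpoint:
  Type: AWS::EC2::VPCEndpoint
  Properties:
    VpcEndpointType: Gateway
    VpcId: !Ref Vpc
    ServiceName: !Sub com.amazonaws.${AWS::Region}.s3
    # Traffic from subnets using these route tables reaches S3 via the
    # endpoint rather than a NAT gateway, avoiding data-processing charges.
    RouteTableIds:
      - !Ref PrivateRouteTable
```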
[…]

…and the function does one or more of the following:
This started happening after a change in how the event loop works between NodeJS 8 and 10. The method AWS uses to freeze the lambda runtime after it has not been invoked for a while may not work correctly in the cases above.
The workaround is to wrap your root handler in a setTimeout:
[…]

Your lambda will get triggered multiple times if you trigger it synchronously using…
#### Details
[`--cli-read-timeout`](https://docs.aws.amazon.com/cli/latest/userguide/cli-configure-options.html#:~:text=cli%2Dread%2Dtimeout) is a general CLI param that applies to all subcommands and determines how long it will wait for data to be read from a socket. It seems to default to 60 seconds.
In the case of a synchronously executed long-running lambda, this timeout can be exceeded. The first lambda invocation "fails" (though not in a way that is visible in any lambda metrics or logs), and the CLI will abort the request and retry. The first lambda invocation hasn't really failed though - it will continue to run, possibly successfully - but the CLI client that initiated it has stopped waiting for a response.
Setting `--cli-read-timeout` to `0` removes the timeout and makes the socket read wait indefinitely, meaning the CLI command will block until the lambda completes or times out.
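For example (function name and payload are illustrative):

```shell
# Block until the lambda completes, however long it takes
aws lambda invoke \
  --function-name my-long-running-lambda \
  --cli-read-timeout 0 \
  --payload '{}' \
  response.json
```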
# README.md

This repository documents [principles](#principles), standards and guidelines…
- Publish services' **availability SLA** to **communicate expectations** to users or dependent systems and identify areas of improvement.
<!-- alex ignore simple -->

- Design for **simplicity** and **single responsibility**. Great designs model complex problems as simple discrete components. Monitoring, self-healing and graceful degradation are simpler when responsibilities are not conflated.
- **Design for failures**. All things break, so the behaviour of a system when any of its components, collaborators or hosting infrastructure fail or respond slowly must be a key part of its design.
# cdn.md

Fastly is highly programmable through its VCL configuration language, but VCL can…
A lot can be achieved with minimal Fastly configuration, and careful use of cache-control, surrogate-control and surrogate-key headers served by your application. This has the advantage that most of the caching logic is co-located with the rest of your application.
If this is insufficient, the next step is making use of [VCL Snippets](https://docs.fastly.com/en/guides/using-regular-vcl-snippets), which can be edited in the Fastly console and provide a useful way of adding a little extra functionality. You can try out snippets of Fastly VCL functionality with https://fiddle.fastly.dev/ .
If you find that your VCL snippets are becoming large, you should consider switching to [custom VCL](https://docs.fastly.com/en/guides/uploading-custom-vcl), which should be versioned in GitHub, tested in CI and deployed using riff-raff, as in…
# domain-names.md
## Which DNS provider should I use?
NS1 is our preferred supplier for DNS hosting. We pay for their dedicated DNS service, which is independent of their shared platform. This means that even if their shared platform experiences a DDoS attack, our DNS will still be available. You can cloudform DNS records in NS1 using the Guardian::DNS::RecordSet custom resource ([CDK](https://guardian.github.io/cdk/classes/constructs_dns.GuCname.html) / [Cloudformation](https://github.com/guardian/cfn-private-resource-types/tree/main/dns/guardian-dns-record-set-type/docs) docs).
### Avoid Route53
In the past, teams have delegated subdomains to Route53, but this approach is no longer recommended. It is now easier to manage DNS records in NS1 as infrastructure-as-code, so the main benefit of Route53 is eroded. Delegating to Route53 introduces an additional point of failure, since NS1 is authoritative for all of our key domain names. It also makes it harder for engineers and future tooling to reason about a domain.
### Exceptions where Route53 might be a good answer
# github.md

Bear in mind:
* The best visibility for most repositories is `Public`, rather than `Internal` or `Private`.
[Developing in the Open](https://www.theguardian.com/info/developer-blog/2014/nov/28/developing-in-the-open) makes better software!
* Make sure you grant an appropriately focussed [GitHub team](https://github.com/orgs/guardian/teams) full [`Admin` access to the repo](https://docs.github.com/en/repositories/managing-your-repositorys-settings-and-features/managing-repository-settings/managing-teams-and-people-with-access-to-your-repository#filtering-the-list-of-teams-and-people) - this should be the dev team that will be owning this project, not a huge team with hundreds of members!
We're no longer using https://repo-genesis.herokuapp.com/, as there are many different aspects to setting a GitHub repo up in the best possible way, and repo-genesis only enforced a couple of them, and only at the point of creation. DevX have plans to enable a new repo-monitoring…