
Commit d56c8af

Merge pull request #157 from guardian/nt/alex-fixes
Alex fixes
2 parents: c412682 + 197cdcf

18 files changed: +38 -36 lines

.github/workflows/inclusion.yml (+1 -1)

@@ -26,4 +26,4 @@ jobs:
       uses: actions/setup-node@v4

     - name: Run inclusion
-      run: npx alex -q *.md || echo "Catch warnings and exit 0" # Once all warnings have been resolved, remove the second statement
+      run: npx alex -q *.md

AWS-costs.md (+9 -9)
@@ -3,24 +3,24 @@ AWS Costs

 ### Trusted Advisor

-Use the [Trusted Advisor](https://console.aws.amazon.com/trustedadvisor/home?#/dashboard) to identify instances that you can potentially downgrade to a smaller instance size or terminate. Trusted Advisor is a native AWS resource available to you when your account has Enterprise support. It gives recommendations for cost savings opportunities and also provides availability, security, and fault tolerance recommendations. Even simple tunings in CPU usage and provisioned IOPS can add up to significant savings.
+Use the [Trusted Advisor](https://console.aws.amazon.com/trustedadvisor/home?#/dashboard) to identify instances that you can potentially downgrade to a smaller instance size or terminate. Trusted Advisor is a native AWS resource available to you when your account has Enterprise support. It gives recommendations for cost savings opportunities and also provides availability, security, and fault tolerance recommendations. Even the simplest tunings, such as to CPU usage and provisioned IOPS can add up to significant savings.

 On the TA dashboard, click on **Low Utilization Amazon EC2 Instances** and sort the low utilisation instances table by the highest **Estimated Monthly Savings**.

 ### Billing & Cost management
 You can use the [Bills](https://console.aws.amazon.com/billing/home?region=eu-west-1#/bill) and [Cost explorer](https://console.aws.amazon.com/billing/home?region=eu-west-1#/bill) to understand the breakdown of your AWS usage and possible identify services you didn’t know you were using it.

 ### Unattached Volumes
-Volumes available but not in used costs the same price. You can easily find them in the [EC2 console](https://eu-west-1.console.aws.amazon.com/ec2/v2/home?region=eu-west-1#Volumes:state=available;sort=size) under Volumes section by filtering by state (available).
+Volumes available but not in used costs the same price. You can find them in the [EC2 console](https://eu-west-1.console.aws.amazon.com/ec2/v2/home?region=eu-west-1#Volumes:state=available;sort=size) under Volumes section by filtering by state (available).

 ### Unused AMIs
-Unused AMIs cost money. You can easily clean them up using the [AMI cleanup tool](https://github.com/guardian/deploy-tools-platform/tree/master/cleanup)
+Unused AMIs cost money. You can clean them up using the [AMI cleanup tool](https://github.com/guardian/deploy-tools-platform/tree/master/cleanup)

 ### Unattached EIPs
-Unattached Elastic IP addresses costs money. You can easily find them using the trust advisor, or looking at your bills as they are free if they are attached (so in use).
+Unattached Elastic IP addresses costs money. You can find them using the trust advisor, or looking at your bills as they are free if they are attached (so in use).

 ### DynamoDB
-It’s very easy to overcommit the reserved capacity on this service. You should frequently review the reserved capacity of all your dynamodb tables.
+You should frequently review the reserved capacity of all your dynamodb tables to make sure it's not over-committed.
 The easiest way to do this is to select the Metric tab and check the Provisioned vs. Consumed write and read capacity graphs and use the Capacity tab to adjust the Provisioned capacity accordingly.
 Make sure the table capacity can handle traffic spikes. Use the time range on the graphs to see the past weeks usage.

@@ -38,7 +38,7 @@ Lower storage price, higher access price. Interesting for backups for instance.

 * [Reduce Redundancy Storage](https://aws.amazon.com/s3/reduced-redundancy/)

-Lower storage price, reduced redundancy. Interesting for easy reproducible data or non critical data such as logs for instance.
+Lower storage price, reduced redundancy. Interesting for reproducible data or non-critical data such as logs.

 * Glacier

@@ -51,9 +51,9 @@ Another useful feature to manage your buckets is the possibility to set [lifecyc
 S3’s multipart upload feature accelerates the uploading of large objects by allowing you to split them up into logical parts that can be uploaded in parallel. However if you initiate a multipart upload but never finish it, the in-progress upload occupies some storage space and will incur storage charges.
 And the thing is these uploads are not visible when you list the contents of a bucket through the console or the standard api (you have to use a special command)

-There is 2 easy ways to solve this now and prevent it to happen in the future:
+There are two ways to solve this now and prevent it from happening in the future:

-* a [simple script](https://gist.github.com/mchv/9dccbd9245287b26e34ab78bad43ea6c) that can list them with size and potentially delete existing (based on [AWS API](http://docs.aws.amazon.com/cli/latest/reference/s3api/list-parts.html?highlight=list%20parts))
+* a [script](https://gist.github.com/mchv/9dccbd9245287b26e34ab78bad43ea6c) that can list them with size and potentially delete existing (based on [AWS API](http://docs.aws.amazon.com/cli/latest/reference/s3api/list-parts.html?highlight=list%20parts))
 * [Add a lifecycle rule](https://aws.amazon.com/blogs/aws/s3-lifecycle-management-update-support-for-multipart-uploads-and-delete-markers/) to each bucket to delete automatically incomplete multipart uploads after a few days ([official AWS doc](http://docs.aws.amazon.com/AmazonS3/latest/dev/mpuoverview.html#mpu-abort-incomplete-mpu-lifecycle-config))

 An example of how to cloud-form the lifecycle rule:
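The repo's own cloud-formed example is not shown in this hunk. As a hedged sketch of what such a rule looks like (the resource name `UploadsBucket` and the 7-day window are illustrative, not taken from the commit), the CloudFormation `AbortIncompleteMultipartUpload` lifecycle setting is:

```json
{
  "Resources": {
    "UploadsBucket": {
      "Type": "AWS::S3::Bucket",
      "Properties": {
        "LifecycleConfiguration": {
          "Rules": [
            {
              "Id": "AbortIncompleteMultipartUploads",
              "Status": "Enabled",
              "AbortIncompleteMultipartUpload": {
                "DaysAfterInitiation": 7
              }
            }
          ]
        }
      }
    }
  }
}
```

With this rule in place, any multipart upload still incomplete seven days after initiation is aborted and its parts are deleted, so the hidden storage charge stops accruing.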
@@ -81,7 +81,7 @@ You can see savings of over `50%` on reserved instances vs. on-demand instances.
 [More info on reserving instances](https://aws.amazon.com/ec2/purchasing-options/reserved-instances/getting-started/).

 Reservations are set to a particular AWS region and to a particular instances type.
-Therefore after making a reservation you are committing to run that particular region/instances combination until the reservation period finishes or you will swipe off all the financial benefits.
+Therefore, after making a reservation you are committing to run that particular region/instances combination until the reservation period finishes, or you will swipe off all the financial benefits.

 ### Spot Instances

AWS-lambda-metrics.md (+2 -2)
@@ -3,15 +3,15 @@ Metrics for Lambdas
 * AWS Embedded Metrics are an ideal solution for generating metrics for Lambda functions that will track historical data.
 * They are a method for capturing Cloudwatch metrics as part of a logging request.
 * This is good because it avoids the financial and performance cost of making a putMetricData() request.
-* It also makes it easy to find the point at which the metric is updated in both the logs and in the code itself.
+* It also makes it easier to find the point at which the metric is updated in both the logs and in the code itself.
 * This does not work at all for our EC2 apps as their logs do not pass through Cloudwatch.
 * [This pull request](https://github.com/guardian/mobile-n10n/pull/696) gives a working example of how to embed metrics in your logging request
 * [This document](https://docs.google.com/document/d/1cL_t5NhO8J9Bwiu4rghoGh8i_um_sXDyKuq4COhdLEc/edit?usp=sharing) gives a good summary of why AWS embedded metrics are so useful
 * Full details can be found in the [AWS Documentation](https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/CloudWatch_Embedded_Metric_Format_Specification.html), but here are the highlights:
 * To use AWS Embedded metrics, logs must be in JSON format.
 * A metric is embedded in a JSON logging request by adding a root node named “_aws” to the start of the log request.
 * The metric details are defined within this "_aws" node.
-* The following code snippet shows a simple logging request updating a single metric:
+* The following code snippet shows a logging request updating a single metric:

 ```json
 {"_aws": {

AWS.md (+4 -4)
@@ -41,14 +41,14 @@ VPC

 * To follow best practice for VPCs, ensure you have a single CDK-generated VPC in your account that is used to house your applications. You can find the docs for it [here](https://github.com/guardian/cdk/blob/main/src/constructs/vpc/vpc.ts#L32-L59).
 * While generally discouraged, in some exceptional cases, such as security-sensitive services, you may want to use the construct to generate further VPCs in order to isolate specific applications. It is worth discussing with DevX Security and InfoSec if you think you have a service that requires this.
-* Avoid using the default VPC - The default VPC is designed to make it easy to get up and running but with many negative tradeoffs:
+* Avoid using the default VPC - The default VPC is designed to get you up and running quickly, but with many negative tradeoffs:
   - It lacks the proper security and auditing controls.
   - Network Access Control Lists (NACLs) are unrestricted.
   - The default VPC does not enable flow logs. Flow logs allow users to track network flows in the VPC for auditing and troubleshooting purposes
   - No tagging
   - The default VPC enables the assignment of public addresses in public subnets by default. This is a security issue as a small mistake in setup could
     then allow the instance to be reachable by the Internet.
-* The account should be allocated a block of our IP address space to support peering. Often you may not know you need peering up front, so better to plan for it just in case. See [here](https://docs.aws.amazon.com/vpc/latest/peering/vpc-peering-basics.html) for more info on AWS peering rules.
+* The account should be allocated a block of our IP address space to support peering. Often you may not know you need peering up front, so better to plan for it regardless. See [here](https://docs.aws.amazon.com/vpc/latest/peering/vpc-peering-basics.html) for more info on AWS peering rules.
 * If it is likely that AWS resources will need to communicate with our on-prem infrastructure, then contact the networking team to request a CIDR allocation for the VPC.
 * Ensure you have added the correct [Gateway Endpoints](https://docs.aws.amazon.com/vpc/latest/privatelink/vpce-gateway.html) for the AWS services being accessed from your private subnets to avoid incurring unnecessary networking costs.
 * Security of the VPC and security groups must be considered. See [here](https://github.com/guardian/security-recommendations/blob/main/recommendations/aws.md#vpc--security-groups) for details.
@@ -116,7 +116,7 @@ and the the function does one or more of the following:

 This started happening after a change in how the event loop works between NodeJS 8 and 10. The method AWS uses to freeze the lambda runtime after it has not been invoked for a while may not work correctly in the cases above.

-The workaround is simple (if a little silly). Wrap your root handler in a setTimeout:
+The workaround is to wrap your root handler in a setTimeout:

 ```javascript
 exports.handler = function (event, context, callback) {
@@ -145,7 +145,7 @@ Your lambda will get triggered multiple times you trigger it synchronously using
 #### Details
 [`--cli-read-timeout`](https://docs.aws.amazon.com/cli/latest/userguide/cli-configure-options.html#:~:text=cli%2Dread%2Dtimeout) is a general CLI param that applies to all subcommands and determines how long it will wait for data to be read from a socket. It seems to default to 60 seconds.

-In the case of a synchronously executed long-running lambda, this timeout can be exceeded. The first lambda invocation "fails" (though not in a way that is visible in any lambda metrics or logs), and the CLI will abort the request and retry. The first lambda invocation hasn't really failed though - it will continue to run, possibly successfully - it's just that the CLI client that initiated it has stopped waiting for a response.
+In the case of a synchronously executed long-running lambda, this timeout can be exceeded. The first lambda invocation "fails" (though not in a way that is visible in any lambda metrics or logs), and the CLI will abort the request and retry. The first lambda invocation hasn't really failed though - it will continue to run, possibly successfully - but the CLI client that initiated it has stopped waiting for a response.

 Setting `--cli-read-timeout` to `0` removes the timeout and make the socket read wait indefinitely, meaning the CLI command will block until the lambda completes or times out.
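For illustration, the flag is passed like any other global CLI option (the function name and payload below are made-up placeholders, not values from the document):

```shell
# Disable the 60s socket read timeout so the CLI waits for the synchronous
# invocation to finish rather than silently retrying.
# "my-long-running-lambda" and the payload are hypothetical examples.
aws lambda invoke \
  --cli-read-timeout 0 \
  --function-name my-long-running-lambda \
  --payload '{"job": "nightly-report"}' \
  response.json
```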

README.md (+2 -1)
@@ -22,7 +22,8 @@ This repository document [principles](#principles), standards and [guidelines](#

 - Publish services **availability SLA** to **communicate expectations** to users or dependent systems and identify area of improvements.

-- Design for **simplicity** and **single responsibility**. Great designs model complex problems as simple discrete components. monitoring, self-healing and graceful degradation are simpler when responsibilities are not conflated.
+<!-- alex ignore simple -->
+- Design for **simplicity** and **single responsibility**. Great designs model complex problems as simple discrete components. Monitoring, self-healing and graceful degradation are simpler when responsibilities are not conflated.

 - **Design for failures**. All things break, so the behaviour of a system when any of its components, collaborators or hosting infrastructure fail or respond slowly must be a key part of its design.

RFCs.md (+1 -1)
@@ -16,7 +16,7 @@ When you are ready, provide a clear description of:
 2. Why the status quo does not address the problem
 3. A proposed solution

-This will help readers to more easily understand your rationale.
+This will help readers to better understand your rationale.

 Comments on Google docs or GitHub discussions are good ways of collecting people's thoughts alongside your original proposal.

cdn.md (+1 -1)
@@ -25,7 +25,7 @@ Fastly is highly programmable through its VCL configuration language, but VCL ca

 A lot can be achieved with minimal Fastly configuration, and careful use of cache-control, surrogate-control and surrogate-key headers served by your application. This has the advantage that most of the caching logic is co-located with the rest of your application.

-If this is insufficient, the next step is making use of [VCL Snippets](https://docs.fastly.com/en/guides/using-regular-vcl-snippets), which can be easily edited in the Fastly console and provide a useful way of providing a little extra functionality. You can try-out snippets of Fastly VCL functionality with https://fiddle.fastly.dev/ .
+If this is insufficient, the next step is making use of [VCL Snippets](https://docs.fastly.com/en/guides/using-regular-vcl-snippets), which can be edited in the Fastly console and provide a useful way of providing a little extra functionality. You can try-out snippets of Fastly VCL functionality with https://fiddle.fastly.dev/ .

 If you find that your VCL snippets are becoming large, you should consider switching to [custom VCL](https://docs.fastly.com/en/guides/uploading-custom-vcl), which should be versioned in Github, tested in CI and deployed using riff-raff, as in
 https://github.com/guardian/fastly-edge-cache.

client-side.md (+2 -2)
@@ -14,7 +14,7 @@ See the separate [npm-packages.md](./npm-packages.md).
 - Gzip all textual assets served, using GZip level 6 where possible
 - Optimise images for size (e.g. jpegtran, pngquant, giflossy, svgo,
   etc.)
-- Favour SVGs where possible. What happens if images are disabled or
+- Favour SVGs where possible. What happens if images aren't enabled or
   unsupported?
 - Avoid inlining encoded assets in CSS.

@@ -81,7 +81,7 @@ various areas below.

 - Define what browsers and versions you support. What happens if using an unsupported browser?
 - Define what viewports do you support. What happens if using an unsupported viewport?
-- What happens if JS/CSS is disabled or overridden in the client?
+- What happens if JS/CSS is switched off or overridden in the client?

 ### Reporting

domain-names.md (+2 -2)
@@ -2,11 +2,11 @@

 ## Which DNS provider should I use?

-NS1 is our preferred supplier for DNS hosting. We pay for their dedicated DNS service, which is independent from their shared platform. This means that even if their shared platform experiences a DDOS attack, our DNS will still be available. It is easy to cloudform DNS records in NS1 using the Guardian::DNS::RecordSet custom resource ([CDK](https://guardian.github.io/cdk/classes/constructs_dns.GuCname.html) / [Cloudformation](https://github.com/guardian/cfn-private-resource-types/tree/main/dns/guardian-dns-record-set-type/docs) docs)
+NS1 is our preferred supplier for DNS hosting. We pay for their dedicated DNS service, which is independent from their shared platform. This means that even if their shared platform experiences a DDOS attack, our DNS will still be available. You can cloudform DNS records in NS1 using the Guardian::DNS::RecordSet custom resource ([CDK](https://guardian.github.io/cdk/classes/constructs_dns.GuCname.html) / [Cloudformation](https://github.com/guardian/cfn-private-resource-types/tree/main/dns/guardian-dns-record-set-type/docs) docs)

 ### Avoid Route53

-In the past teams have delegated subdomains to Route53, but this approach is no longer recommended. It is now easy to manage DNS records in NS1 as infrastructure-in-code, so the main benefit of Route53 is eroded. Delegating to Route53 introduces an additional point of failure, since NS1 is authoritative for all of our key domain names. It also makes it harder for engineers and future tooling to reason about a domain.
+In the past teams have delegated subdomains to Route53, but this approach is no longer recommended. It is now easier to manage DNS records in NS1 as infrastructure-in-code, so the main benefit of Route53 is eroded. Delegating to Route53 introduces an additional point of failure, since NS1 is authoritative for all of our key domain names. It also makes it harder for engineers and future tooling to reason about a domain.

 ### Exceptions where Route53 might be a good answer

elasticsearch.md (+1)
@@ -12,6 +12,7 @@ Regular snapshots of your cluster can provide restore points if data is lost. S

 If you have the [AWS Plugin](https://github.com/elastic/elasticsearch-cloud-aws) installed you can perform snapshots to S3.

+<!-- alex ignore master -->
 Some examples of scripts used to setup and run S3 snapshots: https://github.com/guardian/grid/tree/master/elasticsearch/scripts

 You can watch snapshots in progress: `curl $ES_URL:9200/_snapshot/_status`

github.md (+1 -1)
@@ -21,7 +21,7 @@ Bear in mind:
 * The best visibility for most repositories is `Public`, rather than `Internal` or `Private`.
   [Developing in the Open](https://www.theguardian.com/info/developer-blog/2014/nov/28/developing-in-the-open) makes better software!
 * Make sure you grant an appropriate focussed [GitHub team](https://github.com/orgs/guardian/teams) full
-  [`Admin` access to the repo](https://docs.github.com/en/repositories/managing-your-repositorys-settings-and-features/managing-repository-settings/managing-teams-and-people-with-access-to-your-repository#filtering-the-list-of-teams-and-people) - this should be the just the dev team that will be owning this project, it shouldn't be a huge team with hundreds of members!
+  [`Admin` access to the repo](https://docs.github.com/en/repositories/managing-your-repositorys-settings-and-features/managing-repository-settings/managing-teams-and-people-with-access-to-your-repository#filtering-the-list-of-teams-and-people) - this should be the dev team that will be owning this project, not a huge team with hundreds of members!

 We're no longer using https://repo-genesis.herokuapp.com/, as there are many different aspects to setting a GitHub repo up in the best possible
 way, and repo-genesis only enforced a couple of them, and only at the point of creation. DevX have plans to enable a new repo-monitoring
