Revert "Revert "data/aws: Switch to m4.large"" #882
/lgtm

👍 Thanks for putting this in @crawford!

Can Terraform express "give me one from a list of instance types ordered by preference"?

@cgwalters even if it could, I don't believe there is an API to check AWS's capacity in specific regions and availability zones. And if there were such an API, it would be racy: by the time we had verified the capacity and then tried to create instances, it might have been consumed by someone else.
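
To illustrate the point about raciness: a preference list only works if the launch itself is the capacity check, with a fallback on failure, rather than a query followed by a launch. The sketch below is not Terraform and not the installer's code. `InsufficientCapacity`, `launch_first_available`, and `fake_launch` are hypothetical names made up for this example, and the `launch` callable stands in for whatever actually creates the instance.

```python
from typing import Callable, Optional, Sequence


class InsufficientCapacity(Exception):
    """Hypothetical stand-in for a provider's 'no capacity' launch error."""


def launch_first_available(
    instance_types: Sequence[str],
    launch: Callable[[str], str],
) -> str:
    """Try each instance type in preference order and return the first instance ID.

    There is no separate capacity query: the launch attempt is the check,
    so there is no window between checking and creating.
    """
    last_error: Optional[InsufficientCapacity] = None
    for instance_type in instance_types:
        try:
            return launch(instance_type)
        except InsufficientCapacity as err:
            # Remember the failure and fall through to the next preference.
            last_error = err
    raise RuntimeError(
        f"no capacity for any of {list(instance_types)}"
    ) from last_error


def fake_launch(instance_type: str) -> str:
    """Hypothetical launcher that pretends t3.medium is out of capacity."""
    if instance_type == "t3.medium":
        raise InsufficientCapacity("no t3.medium capacity in us-east-1a")
    return f"i-fake-{instance_type}"


if __name__ == "__main__":
    print(launch_first_available(["t3.medium", "m4.large"], fake_launch))
```

With the fake launcher, the first preference fails and the call falls back to `m4.large`. If every type in the list fails, the function raises `RuntimeError` chained to the last capacity error.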

/lgtm

What's going on in CI? I see lots of "Pending - Job triggered", even for tests that should turn around in minutes.

Even though we got our AWS quota increased, it looks like AWS just
doesn't have the physical capacity for these machines in us-east-1. We
are seeing a lot of failures in CI:
```
Error: Error applying plan:
1 error occurred:
* module.masters.aws_instance.master[0]: 1 error occurred:
* aws_instance.master.0: Error launching source instance: timeout
while waiting for state to become 'success' (timeout: 30s)
```
Looking at CloudTrail, I see the following error, which corresponds to
that failure:
> We currently do not have sufficient t3.medium capacity in the
> Availability Zone you requested (us-east-1a). Our system will be
> working on provisioning additional capacity. You can currently
> get t3.medium capacity by not specifying an Availability Zone in
> your request or choosing us-east-1d, us-east-1b, us-east-1c,
> us-east-1f.
0ea6d35 to 9f34143

/lgtm

/lgtm

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: abhinavdahiya, crawford, wking

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing

/retest

Registry timeout:

info: Manifests will be extracted to /tmp/release-image-0.0.1-2018-12-12-210459800060254
error: unable to connect to image repository registry.svc.ci.openshift.org/ci-op-f54sng97/stable@sha256:58b79dec7b54b6ade89615e2afc9cfdefb2f03bd612f6f27a4eff2763a342443: Get https://registry.svc.ci.openshift.org/v2/: net/http: TLS handshake timeout
2018/12/12 21:05:30 Container release in pod release-latest failed, exit code 1, reason Error

Added a mention in openshift/release#2070 in case that helps bump further investigation there ;). |
Reverts #858