Skip to content
This repository was archived by the owner on Feb 5, 2020. It is now read-only.

Conversation

@enxebre
Copy link
Contributor

@enxebre enxebre commented Oct 18, 2017

*.tectonic.gcp.dev.coreos.system needs to be set to point to gcp NS

@coreosbot
Copy link

Can one of the admins verify this patch?

1 similar comment
@coreosbot
Copy link

Can one of the admins verify this patch?

require 'ssh'

SSH_CMD_CONTAINER_LINUX_VERSION = 'sudo cat /var/lib/update_engine/prefs/aleph-version'
SSH_CMD_CONTAINER_LINUX_VERSION = 'sudo cat /etc/os-release | grep VERSION_ID= | cut -d = -f 2'
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change here works for other platforms?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From my understanding Container Linux should be shipping with this identification file, which is expected by systemd so should be fine in any platform. This makes the test deterministic as otherwise the test might be run before update-engine generates the aleph-version file

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not correct. This change would revert a fix that was merged in #2093. /etc/os-release contains the current version when we are actually interested in the originally installed version of CL. This change would not make the test deterministic: what if this SSH command was executed on the node after it rebooted? This is actually what was consistently happening and causing the tests to fail. In any case, a possible change would be to:

if [ -f /var/lib/update_engine/prefs/aleph-version ] ; then
  sudo cat /var/lib/update_engine/prefs/aleph-version
else
  source /usr/share/coreos/release && echo "$COREOS_RELEASE_VERSION"
fi

Copy link
Contributor Author

@enxebre enxebre Oct 20, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"/etc/os-release contains the current version when we are actually interested in the originally installed version of CL" thanks for spotting that @squat

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

will rebase with #2193

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

2193 is merged please rebase and lets get this merged as well

@cpanato
Copy link
Contributor

cpanato commented Oct 19, 2017

adding smoke tests to run in azure and aws because saw some changes that might affect other platforms

@cpanato
Copy link
Contributor

cpanato commented Oct 19, 2017

ok to test

@cpanato
Copy link
Contributor

cpanato commented Oct 19, 2017

@enxebre got this error:

Using build directory [/home/ubuntu/workspace/tectonic-installer_PR-2178-DVVSYYJUWI2MHJUVIOQLQ7UWE3GSBNYMLXZMI5DXRB3XO73EAL4Q/build/demo]
terraform-docs --no-required markdown  config.tf    >>  Documentation/variables/config.md
terraform-docs --no-required markdown  platforms/aws/variables.tf    >>  Documentation/variables/aws.md
terraform-docs --no-required markdown  platforms/azure/variables.tf    >>  Documentation/variables/azure.md
terraform-docs --no-required markdown  platforms/openstack/neutron/variables.tf    >>  Documentation/variables/openstack-neutron.md
terraform-docs --no-required markdown  platforms/metal/variables.tf    >>  Documentation/variables/metal.md
terraform-docs --no-required markdown  platforms/vmware/variables.tf    >>  Documentation/variables/vmware.md
terraform-docs --no-required markdown  platforms/gcp/variables.tf    >>  Documentation/variables/gcp.md
make[1]: Leaving directory '/home/ubuntu/workspace/tectonic-installer_PR-2178-DVVSYYJUWI2MHJUVIOQLQ7UWE3GSBNYMLXZMI5DXRB3XO73EAL4Q'
diff --git a/Documentation/variables/gcp.md b/Documentation/variables/gcp.md
index 357ca05d..9148d056 100644
--- a/Documentation/variables/gcp.md
+++ b/Documentation/variables/gcp.md
@@ -15,6 +15,7 @@ This document gives an overview of variables used in the Google Cloud platform o
 | tectonic_gcp_master_disktype | The type of disk (pd-standard or pd-ssd) for the master nodes. | string | `pd-standard` |
 | tectonic_gcp_master_gce_type | Instance size for the master node(s). Example: `n1-standard-2`. | string | `n1-standard-2` |
 | tectonic_gcp_region | The GCP region to use. Some regions only have 2 zones. | string | - |
+| tectonic_gcp_ssh_key | (required) Path to an SSH public key file to be provisioned as the SSH key for the 'core' user. | string | - |
 | tectonic_gcp_worker_disk_size | The size of the disk in gigabytes for the root block device of worker nodes. | string | `30` |
 | tectonic_gcp_worker_disktype | The type of disk (pd-standard or pd-ssd) for the worker nodes. | string | `pd-standard` |
 | tectonic_gcp_worker_gce_type | Instance size for the worker node(s). Example: `n1-standard-2`. | string | `n1-standard-2` |
outdated docs (run 'make docs' to fix)
Makefile:139: recipe for target 'structure-check' failed
make: *** [structure-check] Error 1

@enxebre
Copy link
Contributor Author

enxebre commented Oct 19, 2017

@cpanato thanks for having a look, variables issue should be fixed now.

@enxebre enxebre force-pushed the gcp-tests branch 2 times, most recently from 555e75f to 103e576 Compare October 19, 2017 17:07
Makefile Outdated
-e GOOGLE_CLOUD_KEYFILE_JSON \
-e GCLOUD_KEYFILE_JSON \
-e GOOGLE_PROJECT \
-e tectonic_gcp_ssh_key \
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This needs to be TF_VAR_tectonic_gcp_ssh_key


resource "google_compute_forwarding_rule" "tectonic-api-external-ssh-fwd-rule" {
load_balancing_scheme = "EXTERNAL"
name = "tectonic-api-external-ssh-fwd-rule"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please remove tectonic- from the name for consistency with other platforms, e.g. https://github.com/coreos/tectonic-installer/blob/master/modules/aws/master-asg/master.tf#L24

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will rebase an change this after #2167 gets in

@@ -1,5 +1,6 @@
resource "google_compute_target_pool" "tectonic-master-targetpool" {
name = "tectonic-master-targetpool"
name = "tectonic-master-targetpool"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same as below

require 'ssh'

SSH_CMD_CONTAINER_LINUX_VERSION = 'sudo cat /var/lib/update_engine/prefs/aleph-version'
SSH_CMD_CONTAINER_LINUX_VERSION = 'sudo cat /etc/os-release | grep VERSION_ID= | cut -d = -f 2'
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not correct. This change would revert a fix that was merged in #2093. /etc/os-release contains the current version when we are actually interested in the originally installed version of CL. This change would not make the test deterministic: what if this SSH command was executed on the node after it rebooted? This is actually what was consistently happening and causing the tests to fail. In any case, a possible change would be to:

if [ -f /var/lib/update_engine/prefs/aleph-version ] ; then
  sudo cat /var/lib/update_engine/prefs/aleph-version
else
  source /usr/share/coreos/release && echo "$COREOS_RELEASE_VERSION"
fi

squat
squat previously approved these changes Oct 20, 2017
@enxebre
Copy link
Contributor Author

enxebre commented Oct 20, 2017

Rebased, should be good now

@squat
Copy link
Contributor

squat commented Oct 20, 2017

ok to test

@squat
Copy link
Contributor

squat commented Oct 20, 2017

@cpanato, I saw this on the AWS VPC test:


Destroy complete! Resources: 24 destroyed.

An error occurred in an `after(:context)` hook.
Failure/Error: FileUtils.cp '/etc/resolv.conf.bak', '/etc/resolv.conf'

Errno::ENOENT:
  No such file or directory @ rb_sysopen - /etc/resolv.conf.bak
# ./lib/aws_vpc.rb:115:in `recover_etc_resolv'
# ./lib/aws_vpc.rb:85:in `ensure in destroy'
# ./lib/aws_vpc.rb:85:in `destroy'
# ./spec/aws_vpc_internal_spec.rb:25:in `block (2 levels) in <top (required)>'
# ------------------
# --- Caused by: ---
# NoMethodError:
#   undefined method `stop' for nil:NilClass
#   ./lib/aws_vpc.rb:80:in `destroy'

@squat
Copy link
Contributor

squat commented Oct 20, 2017

@enxebre looking good but we need to rebase now that the resource rename pr merged.

@enxebre enxebre force-pushed the gcp-tests branch 2 times, most recently from 8ea2670 to c1214a7 Compare October 20, 2017 19:08
@enxebre
Copy link
Contributor Author

enxebre commented Oct 20, 2017

@squat thanks, just rebased it

@enxebre
Copy link
Contributor Author

enxebre commented Oct 20, 2017

With the latest rebase I'm seeing this happening occasionally when running the tests on gcp:

     Failure/Error: driver = Selenium::WebDriver.for :chrome, options: options
     
     Selenium::WebDriver::Error::WebDriverError:
       unable to connect to chromedriver 127.0.0.1:9515

haven't dug into it yet

@enxebre
Copy link
Contributor Author

enxebre commented Oct 24, 2017

Test are consistently working on gcp now with this PR, the two failing seems unrelated? From my perspective this one should be good to go cc @squat @cpanato

@squat
Copy link
Contributor

squat commented Oct 25, 2017

Yes does seem unrelated. I’d like to see if recent changes to master fix these flakes

@squat
Copy link
Contributor

squat commented Oct 25, 2017

ok to test

@cpanato
Copy link
Contributor

cpanato commented Oct 25, 2017

the issues in Azure is because the latest CL image is not there :/

@cpanato
Copy link
Contributor

cpanato commented Oct 25, 2017

@squat @enxebre tests are green

@cpanato
Copy link
Contributor

cpanato commented Oct 25, 2017

I will wait to this get merged then I will together with @enxebre to merge #2184

@enxebre enxebre mentioned this pull request Oct 26, 2017
@squat squat merged commit d6f8644 into coreos:master Oct 26, 2017
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants