Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support global parallelism testing through environment variable. #598

Closed
wants to merge 13 commits into from
Closed

Conversation

metacpp
Copy link
Contributor

@metacpp metacpp commented Dec 4, 2017

  1. Support parallel testing for all acceptance tests through TF_PARL.
  2. Update the makefile to support run acc test cases with 20 threads.

1. Introduced aztesting to support parallel testing at function level.
2. Switch to aztesting for all acc tests of storage(28 in total).
3. Update the makefile to support run acc test cases with 20 threads.

TODO:
1. After merge, make sure that CI of Hashicorp is not broken.
2. Expand to all other acc tests.
3. Enable Jenkins CI with daily run.
@tombuildsstuff tombuildsstuff removed the request for review from vancluever December 5, 2017 11:43
Copy link
Contributor

@tombuildsstuff tombuildsstuff left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @metacpp

Thanks for this PR - I've taken a look through and left some comments in-line.

Thanks

GNUmakefile Outdated
# threads:
# TF_PARA=1 make testacc
# 2. Enable parallel run for acceptance tests with matching pattern.
# TF_PARA=1 TESTARGS="-run TestAccAzureRMStorage" make testacc
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it'd be better to document this in the README instead, rather than making people dig into the Makefile?

Copy link
Contributor Author

@metacpp metacpp Dec 5, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good suggestion, I will add example usage in the README.

@@ -12,6 +12,9 @@ import (
"github.com/hashicorp/terraform/terraform"
)

// ParaTestEnvVar is used to enable parallelism in testing.
const ParaTestEnvVar = "TF_PARA"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd suggest this is TF_PARALLEL rather than PARA so it's clearer?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was struggling the naming principles here:

  1. TF_PARALLEL: clear but a bit longer while using it the command line.
  2. TF_PARA: short to type in command line, but as you said, not quite clear, might confuse people.

Then I took a look at TF_ACC, I decided to use #2 to keep just one naming principle in the same source file.

@jeffreyCline , @JunyiYi what do you think ? I am open to use either of them.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

https://www.allacronyms.com/parallel/abbreviated

We decided to use "TF_PARL" instead according to above link.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@metacpp whilst I understand that - we tend to be more descriptive in the environment variables we use - such as:

  • TF_WORKSPACE
  • TF_IN_AUTOMATION
  • TF_SKIP_PROVIDER_VERIFY
  • TF_PLUGIN_CACHE_DIR

As such I think this is best as TF_PARALLEL for consistency with the rest of Terraform - given this is an opt-in setting; when setting this as an environment variable this can be set once and forgotten (like the Provider environment variables are - as such, I don't think adding a couple of extra characters to be readable is a bad thing).

// https://github.com/hashicorp/terraform/pull/16807
if os.Getenv(ParaTestEnvVar) != "" {
t.Parallel()
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given that not all of the tests can be run in parallel (see: Network Watcher, which can only have one instance deployed at a time..) - on reflection I think it's worth adding a RunInParallel variable to this too (defaulted to true - as in this PR) which is checked here and can be turned off per-test. The tests will still run synchronously by default since the TF_PARA environment variable won't be set.

Related: over time we're going to need to work through each Test Case and state the Azure Regions in which the Resource is supported (e.g. as an arbitrary example, such as don't test AKS in Germany since it's not available there) - I believe both of these changes can probably be done in batches; rather than in one giant PR (which should also reduce the severity of any conflicts)

Copy link
Contributor Author

@metacpp metacpp Dec 5, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with you, that's why I added similar flag in this PR, but since resource's test suite is widely used by many different providers, I set its default value be false since I don't want to break any legacy code. The current test cases in azurerm provider have very strong dependencies in resource.Test and resource.TestCase, we need to balance between large code changes and clean code.

For the acc tests in "resource_arm_network_watcher_test.go", I introduced a new function to by pass parallelism at case level.

For the each test case, we need to add its supported regions, definitely it's worth doing this. Let's talk in details later.

GNUmakefile Outdated
# TF_PARA=1 make testacc
# 2. Enable parallel run for acceptance tests with matching pattern.
# TF_PARA=1 TESTARGS="-run TestAccAzureRMStorage" make testacc
testacc: fmtcheck test-install
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we should always be running the unit tests before the acceptance tests (otherwise, there's no point running them, since they could be invalid) - so can we remove test-install and integrate it back into the test step above?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not the original author of test/testacc here. From their original definitions:
test: fmtcheck, test-install, unit-test
testacc: fmtcheck, acc-test.

What I changed here is just add an extra step, test-install, into testacc. Do you want me to include the unit test into acc test ? I checked the resource.Test, it will run unit tests by default, but will skip acc tests if TF_ACC is not set.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What I changed here is just add an extra step, test-install, into testacc.

I was thinking more of removing test-install from testacc - as it's not needed since we should be running make test first anyway (which will catch this). The benefit to having these split out is it's easier to expose what's failed (i.e. unit/acc tests), rather than just "some tests failed" - if that makes sense?

make test
make acctest

Do you want me to include the unit test into acc test ? I checked the resource.Test, it will run unit tests by default, but will skip acc tests if TF_ACC is not set.

No that's fine - if we can just remove test-install this should be fine :)

README.md Outdated

```sh
$ make testacc
$ [TF_PARL=1] [TF_PARL_NUM=20] [TESTARGS="-run TestAccAzureRMStorage"] make testacc
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's worth splitting this into two separate examples - one as it was (which is the default/recommended) and a second which shows an example usage in parallel (as above) - given I think most contributors will end up running these individually for cost/debugging reasons?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Parallel running will not make user pay more, the size of test package per run is the key factor, and user can use TESTARGS to filter out necessary cases to run.

I updated the README with 3 common usage cases, hope it's helpful to end user.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

my point was more the user has the option to cancel them out; TESTARGS is a prefix match; and most tests follow the pattern:

  • foo_basic
  • foo_basicUpdated

meaning in order to run foo_basic using the TESTARGS of -run=foo_basic - the test foo_basicUpdate will also be run (leading to extra resources/costs) when in parallel, rather than being able to be cancelled via ctrl+c - it's not a big deal, but it's one to be aware of

@@ -14,7 +14,7 @@ func TestAccAzureRMNetworkWatcher_basic(t *testing.T) {
resourceGroup := "azurerm_network_watcher.test"
rInt := acctest.RandInt()
resource.Test(t, resource.TestCase{
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

rather than shoe-horning this into the testAccPreCheck method - I think it might be worth overriding/sub-classing resource.TestCase to expose a RunInParallel field as per the original PR - to match the design of the API? e.g.

resource.Test(t, resource.TestCase{
  PreCheck: func() {},
  Providers: testAccProviders,
  RunInParallel: false,
  CheckDestroy: #...
})

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tombuildsstuff We don't want to change 600+ test cases now, that's one of the biggest concern since the beginning of this PR.

In another PR, we discussed the pain point of 600+ test cases, and @vancluever suggested me to change testAccPreCheck to inject parallelism logic, since this is the smallest change but achieving the goal in a very short time.

We do have plan to add a middle layer, aztesting, to extend functionalities of resource.Test, especially if Terraform core runtime team don't want to add some features into core. That will take even more time to finish the work, I suggest we finish this PR asap considering it's been pending for almost 2 weeks and we're blocked by it to enable Jenkins CI on our side as monitoring, and at the same time, we start another thread to do the code refactoring.

GNUmakefile Outdated
# TF_PARA=1 make testacc
# 2. Enable parallel run for acceptance tests with matching pattern.
# TF_PARA=1 TESTARGS="-run TestAccAzureRMStorage" make testacc
testacc: fmtcheck test-install
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What I changed here is just add an extra step, test-install, into testacc.

I was thinking more of removing test-install from testacc - as it's not needed since we should be running make test first anyway (which will catch this). The benefit to having these split out is it's easier to expose what's failed (i.e. unit/acc tests), rather than just "some tests failed" - if that makes sense?

make test
make acctest

Do you want me to include the unit test into acc test ? I checked the resource.Test, it will run unit tests by default, but will skip acc tests if TF_ACC is not set.

No that's fine - if we can just remove test-install this should be fine :)

Copy link
Contributor

@tombuildsstuff tombuildsstuff left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hey @metacpp

thanks for pushing the latest updates - I've left some comments in-line

thanks!


// If "TF_PARL" is set to some non-empty value, all the parallel test cases
// will be run in parallel. It requires that the test cases are all
// independent ones without any shared resource before turn it on.
Copy link
Contributor

@tombuildsstuff tombuildsstuff Dec 8, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd argue the second sentence in this comment is incorrect since it's not the reason why we can't run these tests in Parallel (all test resources are unique per test) - which are mostly due to limitations in Azure (either Rate Limiting, or Usage Limits [such as only a single instance in a region w/Network Watcher])

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree, updated the comments and the code for better reading.

// example usage:
// TF_ACC=1 TF_PARL=1 go test [TEST] [TESTARGS] -v -parallel=n
// To run specified cases in parallel, please refer to below PR:
// https://github.com/hashicorp/terraform/pull/16807
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we need these 4 lines, since they're included in the README?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Different levels of details. README is based on Makefile while this is based on go test.

@tombuildsstuff
Copy link
Contributor

After some investigation and discussing this internally we've discovered this approach won't work without some major refactoring due to the way that logging works (for both HTTP Requests/Responses and log.Printf's). Instead for the moment we're opting to build a runner similar to the one we use for TeamCity but for JUnit to be able to output the logs as needed in both Jenkins and VSTS/TFS.

Whilst hashicorp/terraform#16356 goes some way towards solving this - it's only a band-aid over the problem until Parallelisation is supported natively (since it's writing them to external files per test, rather than within each test blocks) - once this is supported we can look to using the native parallelisation support (once logging is tracked on a per-test basis).

As such I'm going to close this PR - but thanks for this contribution!

@ghost
Copy link

ghost commented Mar 30, 2020

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues.

If you feel this issue should be reopened, we encourage creating a new issue linking back to this one for added context. If you feel I made an error 🤖 🙉 , please reach out to my human friends 👉 [email protected]. Thanks!

@ghost ghost locked and limited conversation to collaborators Mar 30, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants