test, integration: Add an integration test framework #368
purpleidea wants to merge 3 commits into master from
Conversation
Force-pushed from 7794758 to 60ddbfa
@aequitas Hey, so I took some time to hack out integration testing the way I was imagining it. I initially started with a thorough review of your code in #353, but then I changed it around to have what is hopefully an easier API, and one that works with events, without time.Sleep anywhere. I hope you don't mind, and hopefully you like it. Please review if you'd like, and I can merge it so you can add your magnificent tests on top. I'm also working on a clustered version of this; you can see the beginnings of it in the seeds parameter. Lastly, you'll see a few "patterns" in the patterns.go file, which I think we can use to add more general patterns and then re-use them as much as we want, in particular as seen in basic_test.go. Thanks!
@purpleidea I'm glad you finally got around to this, and I have some understanding for the time it took. But I'm really frustrated and disappointed by the point we are at now. All the time and effort I spent constructing it and asking for feedback, only getting vague discussions, iterating over it, your refusal to give proper feedback/discussion on principle, having me ask outsiders, incorporating their advice, and you still abandoning all that. I admit the initial version of my implementation is neither complete nor perfect. But I wanted to have something to work with, and to iterate on the basic principle before draining time into fully expanding all the features, like proper waits instead of sleeps, structs, and common patterns. For example, a test like this https://github.com/purpleidea/mgmt/pull/365/files#diff-a7d4a5bd50bf5e859cbd3bb62e51b65cR75 already deviated heavily from the standard pattern, and it might not be the only exception. An overly simple API can create high complexity in the test cases themselves, making them useless imho, as they overshadow the complexity of the thing they try to test. All in all, I'm left bitter atm regarding contributing to this project. I'll see if this feeling fades away in the upcoming days, and whether I feel enticed to leave the comments I have on this PR.
@aequitas I'm sorry about that. My goal wasn't to make you feel bitter, sorry. I would have liked to spend more time iterating on your branch, but I realized that the amount of changes needed to get the converge detection working with mgmt core (as I patched here) was too much. I also felt bad because I knew the lack of integration testing was blocking you, and it was something we needed for the project anyway. As a result, I decided to just write it "as an example", rather than leaving many review comments on your PR. I didn't merge it yet, because my goal wasn't to make you feel bad about your PR, but rather to have something to look at, compare, and discuss. If you think there are changes we should make to this before I merge it, please let me know. Please understand that it's a big challenge to run such a project on my own time, and not all the decisions I take are perfect. Thanks for reading!
Force-pushed from 4d23f0f to aed23bc
This adds an initial implementation of an integration test framework for writing more complicated tests. In particular this also makes some small additions to the mgmt core so that testing is easier.
This further extends the integration framework to add some simple primitives for building clusters. More complex primitives and patterns can be added in the future, but this should serve the general cases.
This adds logging so that you can dig deeper into crashes or issues.
Force-pushed from 86da62f to 3593325
I've updated the branch with some additions, including a small cluster executor. It all works, however it does hit some occasional intermittent failures which are due to bugs elsewhere in the code. Those are also sometimes experienced by the shell tests, and I hope to resolve all of these in the future :)
@purpleidea It doesn't feel like an example to me, more like your implementation of the same thing. Instead of providing feedback on what I've spent time on, and commenting on the remarks I left in the code and in the PR, it feels to me like you just shoved it aside with your version. I already had ideas for convergence checking, and I had clustering all worked out; I only removed that last part because I ran into the same issue, in that it requires some bug fixing in core before it's stable. I just wanted to know if you agreed with the general layout and API before sinking even more time into this. But you refused to even look at it until I refactored it. I will put this aside for the weekend and come back to it at the beginning of next week.
@aequitas Sorry again. I just have a lot on my plate and wish I could have handled it better. Hope to have you happy and hacking again soon. Thanks,
I haven't heard anything for a while, so I'm going to merge this for now.
aequitas left a comment:
For what it's worth, here is my partial review.
        }
        return
    }
    d := m.Dir()
Within a testcase there is really no benefit to having 9 lines of code just to verify that a file exists (you're not even verifying the content yet; it could be empty, and the testcase would pass with a false positive). All it does is make the testcase harder to interpret and increase the cognitive load. This can be a simple single-line assertion, and it's common enough that it could be added to the patterns or the instance.
    }
    `
    m := Instance{
        Hostname: "h1", // arbitrary
I would just drop these two default parameters; like you mention, the hostname is arbitrary, so why not initialise them in the generic pattern or instance?
Also, is there any reason to preserve the result if the test passes, or to not save the result if the test fails? Tests should be set up to run in (OS-level) temporary directories anyway, where garbage collection is common; there's no need to add high-level management of temporary files to the suite itself imho. Unless they really can create enormous amounts of data quickly (not the case here: a few directories and some KB of files).
    type test struct { // an individual test
        name string
        code string // mcl code
        fail bool
Please document the intention of this boolean; it would save having to dive into the code logic. Something like: expect the code deployment to fail.
    }
    }

    func TestInstance1(t *testing.T) {
Given that this is a table-driven test, it is expected that a certain 'class' of tests is run in this testcase. TestInstance1 does not describe this very well. A better name would be something like TestSimpleDeployments, along with a proper godoc comment.
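For illustration, a minimal sketch of the table-driven shape being discussed, using the suggested TestSimpleDeployments name; the scenario entries are invented examples, not real mgmt tests:

```go
package main

import "fmt"

// test describes one scenario for TestSimpleDeployments (suggested name):
// deploy some mcl code and check whether the deployment succeeds or fails.
type test struct {
	name string // scenario description, used as the subtest name
	code string // mcl code to deploy
	fail bool   // true if the code deployment is expected to fail
}

func main() {
	values := []test{
		{name: "empty code deploys cleanly", code: "", fail: false},
		{name: "broken code is rejected", code: "$x =", fail: true},
	}
	for _, tc := range values {
		// in a real test this would be t.Run(tc.name, ...)
		fmt.Printf("TestSimpleDeployments/%s (expect failure: %v)\n", tc.name, tc.fail)
	}
}
```

Naming each entry makes a failing subtest self-describing in the `go test` output.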
    values := []test{}

    {
        code := Code(`
This is similar to the approach I had initially envisioned. I hadn't spent time constructing it yet, as I first wanted to see which common patterns emerge. Some brainstorming I did on this topic is in #353 (comment), where the environment setup (instances/cluster) and validations are generalised so they can be reused across tests.
    if len(seeds) == 0 {
        // set so that Deploy can know where to connect
        // also set so that we can allow others to find us and connect
        obj.clientURL = "http://127.0.0.1:2379"
I've had many cases where race conditions caused tests to fail because old mgmt instances were occupying ports that the testsuite assumed were free. A solution I was thinking about is listening on port 0, so the operating system hands out an available port, like: https://github.com/phayes/freeport/blob/master/freeport.go#L9.
Ideally, unix domain sockets would be used during tests instead of TCP, to completely isolate tests and allow for parallel runs. That of course has to wait for the implementation.
Agreed! I just did it this way since it's the only way I knew. For now it seems to not conflict, but if we experience issues in the future, we can change it.
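A minimal sketch of the port-0 trick, along the lines of the freeport code linked above (the function name is mine):

```go
package main

import (
	"fmt"
	"net"
)

// freePort asks the operating system for an available TCP port by listening
// on port 0, then immediately releasing it. Note there is still a small
// window where another process could grab the port before mgmt binds it.
func freePort() (int, error) {
	l, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return 0, err
	}
	defer l.Close()
	return l.Addr().(*net.TCPAddr).Port, nil
}

func main() {
	port, err := freePort()
	if err != nil {
		panic(err)
	}
	// e.g. build the client URL from the dynamically assigned port
	fmt.Printf("http://127.0.0.1:%d\n", port)
}
```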
    }
    s := fmt.Sprintf("--seeds=%s", urls[0])
    // TODO: we could just add all the seeds instead...
    //s := fmt.Sprintf("--seeds=%s", strings.Join(urls, ","))
Since we will be testing how instances in a cluster behave given different configurations of seeds, either way can be expected to work the same. Is there any (speed) benefit to giving all the seeds?
I don't think there's a performance difference, in general it's preferable to connect with more seeds in case one happens to go down before you get the full list. Once you connect to anyone, you get the full list. I left it this way since when I refactor etcd/etcd.go I want to make sure there's no bug here, and I wanted to remind myself to at least see if there's a need to test both or not.
    }

    // cause a stack dump first if we can
    _ = obj.cmd.Process.Signal(syscall.SIGQUIT)
| // the `--converged-status-file` option which can be used to track the varying | ||
| // convergence status. | ||
| func (obj *Instance) Wait(ctx context.Context) error { | ||
| //if obj.cmd == nil { // TODO: should we include this? |
| // `--converged-timeout` option with mgmt for this to work. It tracks this via | ||
| // the `--converged-status-file` option which can be used to track the varying | ||
| // convergence status. | ||
| func (obj *Instance) Wait(ctx context.Context) error { |
How does this handle the race condition where the mgmt instance/cluster is (partly) converged but Wait has not yet been called?
You need to set the converged-timeout to a high value, which makes waiting for it kind of useless, as a sleep would almost suffice as well (a low timeout is preferred, to have faster tests). But with too low a value, you risk the configuration converging after you deploy but before the moment you start waiting for convergence.
Ideally you would wait for a specific deploy ID to converge, instead of just any sign of convergence.
I'm done with this project for now. Clearly this feature was not urgent to merge, but you went ahead and merged it without asking if I would still leave the comments I promised I would come back with. You didn't have the patience to wait even a few days, where I waited weeks and spent effort and time just to have you look at and discuss the general principle. In chat you stated you would like people to review your code, but as it turns out, you'd rather just merge your own code as soon as you can. You scream for contributors to this project, but I have the feeling you only see them as code monkeys who have to write the code the way you would write it, instead of making them part of a team that together builds something great. I can bear most of your nitpicking and the principles you adhere to for this project. You probably mean well and you have good ideas, but the way you act as lead for this project just frustrates me, and doesn't make it seem worth the effort for me to contribute any further.
Sorry to see you go. You're welcome back anytime. I've just got a lot that I'm trying to balance right now, and sorry it didn't all work out in the perfect way. Happy Hacking.
Thanks for the feedback on some of the comments I made. I closed all my open PRs, as most of them require work to adapt to your implementation, or have been lingering for too long without conclusion.
I understand your situation, but if I may be brutally honest, you should really consider how to best spend the time you allocate to this project. To me it just feels that rewriting an entire implementation as an example isn't the most efficient way. I think you are a great coder and the standards set for this project are high, which can be a good thing. But if you don't want to end up being the only one writing all the code, you should really consider being more lenient sometimes. Also, and I know you're busy and it's your project so I really shouldn't complain, but it seems to me you apply stricter rules to contributors than to your own commits. Maybe ask other frequent contributors if they'd like to chime in on reviews, so it's not only your voice doing all the complaining. I set out contributing to mgmt with the idea of answering the question 'does mgmt run in production?' with a confident 'yes, and it's everything you can wish for'. Meaning I would tackle every problem, small or big, that I encountered, to make it fit what I consider a production-ready tool that I could confidently 'sell' to co-workers without having to warn them about sharp edges. But right now the velocity I need to work on this full-time can't be supported, which creates friction and frustration. That, together with how this PR has been handled, has made me decide that, for my own sanity, I won't be contributing for a while.
There is technical debt which I introduced when I was less familiar with golang, which is causing bugs and intermittent test failures, along with #322, which must be fixed before the project can be considered stable.
No. I'm not taking on any more technical debt that I can't easily patch out. An integration test framework is critical if a lot of tests are added on top of it, because if it ever needs fixing, I either have to nuke all the tests or spend far too much time trying to fix it.
Possibly. But the difference is that I'm ultimately responsible for both my code and contributors, so it has to be right.
Others are welcome and encouraged too, if their feedback is correct and productive.
I'm not saying you should open the floodgates, that's the other extreme, just that you accept that your way might not be the only correct one, and trust the loyalty of others to come back and iterate. Which, I will admit, is a hard thing to do with strangers over the internet who can decide to disconnect from this project at any moment and leave you with the broken pieces. I would be glad to help with chores and with fixing technical debt not related to my immediate goals, just as a service to this project. But for that I would need to feel part of a team; right now I don't feel compelled, as I have the feeling it's just you and only your opinion micromanaging the project, and it reminds me of myself in former projects. I honestly believe your intentions and values are good and you strive for openness, but I think you could take a more active role in delegating and involving people, collaborating on the project together with you instead of pulling everything towards yourself. The mgmtlove issues are a good start, but if it ends up with only you reviewing them and a contributor walking away halfway, nothing is gained. Ideally there would be a team of core enthusiasts who carry the project, share the chores, discuss broader subjects, and form opinions. This also makes the project easier to 'sell' as production-ready to colleagues, as right now it seems to be just you and some on/off contributors, and that's a mighty low bus factor. I'm not involved that long in the project (just weeks), so my view might be completely distorted; correct me if I'm wrong. In the end it is your project and your codebase, and every decision is yours to make. This is just my opinion, and you may take it at the value it is worth to you. One last note: the reason I'm so emotional about this and keep carrying on is that I've come to believe in mgmt and, dare I say it, have fallen in love with it.
But the friction, frustration, and realisations of the past days have made me sad that this project might never get the traction it needs for me to consider it a solution I can actually get implemented somewhere, which forces me to make a binary decision for now.
|
@aequitas I've already apologized multiple times publicly, in channel, and in DM. I don't think I can improve upon this anymore. I'm not a perfect maintainer, and it's very difficult for me to review patches which are substantially far away from what I'm expecting in terms of design and code quality. It's a skill I need to improve on. To be clear, it's not because I think you're not bright or not a good coder, you're obviously both, but you're very new to golang, and at this early stage, it's my mistake for not immediately saying "please start with easier/smaller patches first".
Well, please come back whenever you want, but if so, then you need to not get offended so easily here. Thanks