Does go test run ginkgo tests parallel automatically? #111

Closed
AlexLuya opened this issue Oct 10, 2014 · 7 comments

Comments

@AlexLuya

I have a project structured like this:

  src\
      db\         creates and holds a global, var session *gorethink.Session, which holds the connections to RethinkDB (similar to MongoDB)
      create\     contains CreateDao.go, which uses db.session to insert records into the db
      retrieve\   contains RetrieveDao.go, which uses db.session to retrieve records from the db
      delete\     contains DeleteDao.go, which uses db.session to delete records from the db
      toanother\  contains toAnotherUpdateDao.go, which updates one kind of record in the db, and its test file toAnotherUpdateDao_test.go (skeleton below)
      tomaster\   contains toMasterUpdateDao.go, which updates another kind of record in the db, and its test file toMasterUpdateDao_test.go (skeleton below)

Skeleton of toAnotherUpdateDao_test.go and toMasterUpdateDao_test.go:

var _ = Describe("....", func() {
    var (
        // create the model used by the test below
    )

    Context("......", func() {
        It("should NOT return any errors", func() {
            // when
            err := save model to rethinkdb via CreateDao
            // then
            Expect(err).NotTo(HaveOccurred())

            // when
            err = update record in rethinkdb via toAnotherUpdateDao or toMasterUpdateDao
            // then
            Expect(err).NotTo(HaveOccurred())

            // when
            fromDB, err := retrieve record from db via RetrieveDao
            // then
            Expect(err).NotTo(HaveOccurred())
            Expect(fromDB).To(Equal(model))
        })
    })

    AfterEach(func() {
        clear table (delete all records) via DeleteDao
    })
})

The problem is:

If I go into toanother and tomaster and run "go test" in each one separately, both succeed.
But if I go up to the src dir and run "go test ./...", I get two errors: one tells me the update failed, the other tells me the retrieve failed.
So I suspect the tests were run in parallel rather than serially, and that this caused the problem. Is that true?

@onsi
Owner

onsi commented Oct 10, 2014

Hey @AlexLuya

Yes, go test ./... tries to run the tests for different packages in parallel. This is terrible in my opinion. It leads to the sorts of issues that you're seeing.

I'd strongly recommend you use the ginkgo cli. It's a thin wrapper around go test, it's very easy to install (go install github.com/onsi/ginkgo/ginkgo), and it manages how and when tests run much more sanely. It also has a ton of functionality that you can read about in the docs.

ginkgo -r will run all the test suites under the current directory (which is what you are trying to do with go test ./...) and it will do so serially.

Ginkgo supports a different model of parallelization than go test. Instead of allowing different test suites to run simultaneously, ginkgo -r -p will continue to run one test suite at a time. However, -p will instruct Ginkgo to launch multiple test processes for the suite; each process will be given a subset of the tests, and those subsets will run in parallel. This can vastly speed up large integration suites, though you have to do work to ensure that the suite is parallelizable. This is usually a matter of making sure that any external resources accessed by the suite are sharded by the parallel node number, which Ginkgo provides via GinkgoParallelNode(). In the case of a database I usually spin up a new database instance for each node and point the node at said instance. The docs cover this in detail here (though they're slightly out of date: you can use GinkgoParallelNode() instead of config.GinkgoConfig.ParallelNode. I'll fix that soon).
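
A minimal sketch of sharding an external resource by parallel node number, as described above. The helper shardedDBName and the base name app_test are hypothetical and not part of Ginkgo. In a real suite the node number would come from GinkgoParallelNode(); here it is a plain loop variable so the snippet runs with only the standard library.

```go
package main

import "fmt"

// shardedDBName returns a database name unique to one parallel node, so that
// nodes running subsets of the same suite never touch each other's data.
// Hypothetical helper for illustration; it is not part of Ginkgo.
func shardedDBName(base string, node int) string {
	return fmt.Sprintf("%s_node_%d", base, node)
}

func main() {
	// In a real suite this would be: node := GinkgoParallelNode()
	// (parallel nodes are numbered starting at 1).
	for node := 1; node <= 3; node++ {
		fmt.Println(shardedDBName("app_test", node))
	}
}
```

Each node would then connect to (and clean up) only the database whose name it computed, which is what keeps the parallel subsets from interfering with each other.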

@AlexLuya
Author

thanks,I will try it.

@sheeley

sheeley commented Nov 10, 2016

@onsi can you share more context of why parallel test suites are terrible? if anything, i'd expect to hear that parallel tests within a suite are terrible, not that parallel suites are terrible. interested in helping to get suites running in parallel or understand why our tests should be parallel instead of suites. thanks!

@onsi
Owner

onsi commented Nov 11, 2016

"terrible" is probably way too strong.

running different test suites (especially integration test suites) that might share some sort of global resource (e.g. both pointed at an external cluster, or an external resource set up on the box) could lead to conflicts and surprising/difficult to reproduce test results. running unit tests in parallel this way is arguably fine.

tests running in parallel within a suite are also problematic - same reason. any global state (or accidental global state - and this can crop up easily) could be a source of random, difficult-to-understand errors.

i chose a very particular approach with ginkgo that gives the test author a bit more context and control. when you run a suite in parallel ginkgo launches multiple processes and shards the tests across those processes making available the current parallel node. this lets you manage any shared resources and shard your usage of said resources deterministically. (Ginkgo was built to help test drive Cloud Foundry and this particular strategy is super helpful for our many integration suites).

there's also some fancy behavior around SynchronizedBeforeSuite that lets the test author spin up shared singleton resources for use by the test suite just once and then tear down the singleton resource at the end... much more context in the docs here

but in all fairness...
😜 who even knows what I was trying to convey 2 years ago?!

@sheeley

sheeley commented Nov 15, 2016

@onsi thanks for the response - I'm actually pretty interested in being able to run suites in parallel, using something like https://github.com/codahale/testdb to allow them to function completely separately. this way it's very easy/fast to say:

  • create new db with only the data I need for entityA
  • run test suite for entityA crud/business logic
  • tear db down

and

  • create new db with only the data I need for entityB
  • run test suite for entityB crud/business logic
  • tear db down

can both run totally in parallel. Any suggestions for that?

@onsi
Owner

onsi commented Nov 15, 2016

Would entityA and entityB share the same test suite? Or different test suites? What makes more semantic sense for your project?

If they can be in the same suite then I encourage you to look at Ginkgo's parallelization techniques. In cases like this I've seen the following pattern work:

  • Use SynchronizedBeforeSuite to spin up one (shared) data store (e.g. mysql, etcd, etc...)
  • entityA and entityB each have their own Ginkgo Describe block.
  • this Describe block has a BeforeEach to create a database/table/shard for just that entity. To avoid collisions deterministically you can use GinkgoParallelNode() to get the current parallel shard
  • the Describe block also has an AfterEach that performs clean up of the database/table/shard.

Another approach is to use BeforeSuite to spin up a data store per parallel node - use GinkgoParallelNode() to compute unique ports for each store. This gives you stronger isolation but can have a performance impact (hello 8 MySQL servers).

Both approaches work and allow you to use ginkgo -p to automatically shard your tests across multiple nodes. The nice thing here is you can choose the degree of parallelization with ginkgo -nodes=N (-p is shorthand for N = number-of-cores). If you're running on a beefy VM set N=24. If you're running on a macbook air set N=2.
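
A rough sketch of the naming arithmetic behind the two approaches above. The helpers tableForEntity and portForNode, the base port 13306, and the entity names are all hypothetical. In a real suite the node argument would be the value returned by GinkgoParallelNode(), and the helpers would be called from a BeforeEach or BeforeSuite.

```go
package main

import "fmt"

// tableForEntity supports the shared-store pattern: one data store for the
// whole run, with a separate table per entity per parallel node.
func tableForEntity(entity string, node int) string {
	return fmt.Sprintf("%s_%d", entity, node)
}

// portForNode supports the store-per-node pattern: each node starts its own
// data store on basePort offset by the node number. Nodes are numbered from
// 1, so node 1 gets basePort itself.
func portForNode(basePort, node int) (int, error) {
	if node < 1 {
		return 0, fmt.Errorf("node must be >= 1, got %d", node)
	}
	port := basePort + node - 1
	if port > 65535 {
		return 0, fmt.Errorf("port %d is out of range", port)
	}
	return port, nil
}

func main() {
	for node := 1; node <= 2; node++ {
		port, err := portForNode(13306, node)
		if err != nil {
			panic(err)
		}
		fmt.Printf("node %d: store on port %d, tables %s and %s\n",
			node, port, tableForEntity("entityA", node), tableForEntity("entityB", node))
	}
}
```

Because both helpers depend only on the node number, the same test always maps to the same resources for a given N, which is what makes collisions avoidable deterministically.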


If entityA and entityB need to be in separate suites (e.g. they are in separate folders and therefore packages) you have two options:

  • make them one suite anyway. you can do this by having the tests live in their own packages and just import entityA and entityB. There are other more elaborate ways to accomplish this that I can point you at if you are interested.
  • keep them separate, make sure your tests can run in tandem, and then write some bash to launch multiple ginkgo processes at once.

FWIW I've never seen anyone do the latter but it should be possible.

@robdimsdale
Contributor

I was going to suggest wrapping multiple processes in bash. I've seen it done before, but it's gross. Handling the processes and the output gets tricky quickly. You could wrap the ginkgo executable calls in GNU Parallel if it supports different args for each invocation (I'm not sure about this).
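
As an alternative to bash, here is a hedged sketch of launching several suites at once from a small Go program. It buffers each suite's output so that concurrent runs do not interleave. The names runSuites, runGinkgo, and suiteResult, and the directories entityA and entityB, are hypothetical. It assumes the ginkgo CLI is on the PATH and that the suites share no state, so they are safe to run in tandem.

```go
package main

import (
	"fmt"
	"os/exec"
	"sync"
)

// suiteResult holds the buffered output of one suite run.
type suiteResult struct {
	Dir    string
	Output []byte
	Err    error
}

// runSuites calls run for every directory concurrently and returns the
// results in the same order as dirs. Each goroutine writes only to its own
// slice index, so no extra locking is needed.
func runSuites(dirs []string, run func(dir string) ([]byte, error)) []suiteResult {
	results := make([]suiteResult, len(dirs))
	var wg sync.WaitGroup
	for i, dir := range dirs {
		wg.Add(1)
		go func(i int, dir string) {
			defer wg.Done()
			out, err := run(dir)
			results[i] = suiteResult{Dir: dir, Output: out, Err: err}
		}(i, dir)
	}
	wg.Wait()
	return results
}

// runGinkgo runs the ginkgo CLI inside dir and captures stdout and stderr.
func runGinkgo(dir string) ([]byte, error) {
	cmd := exec.Command("ginkgo")
	cmd.Dir = dir
	return cmd.CombinedOutput()
}

func main() {
	// Hypothetical suite directories; replace with real package paths.
	for _, r := range runSuites([]string{"entityA", "entityB"}, runGinkgo) {
		fmt.Printf("== %s (err: %v)\n%s\n", r.Dir, r.Err, r.Output)
	}
}
```

Passing the runner as a function keeps the concurrency logic separate from the process handling, so a different command (or a fake in tests) can be swapped in without touching runSuites.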
