Skip to content

Conversation

@ktoso
Copy link
Member

@ktoso ktoso commented Aug 9, 2022

This replaces our previous "bunch of shell scripts" integration tests.

Resolves #900

I actually found a bug while doing this so will solve #1054 while doing this.

Ignore the ad-hoc JSON Coders here, those were to debug the issue.

This introduces a new way to write multi node tests which can span actual processes and automatically join a cluster.
We can aggressively KILL those processes and assert on the outputs of such clusters.

We will also easily be able to deploy tests written using this infra to multiple actual physical nodes or docker containers -- similar to how Akka's multi-jvm tests were doing way back then. This will allow us to verify on real networks etc.

It also is amazing for reproducers -- we can exactly replicate behavior, without having to do the weird "make sure we resolve as remote" and other dances.

Screenshot just FYI how an output looks like -- speaking for myself, I can't get complicated things solved without such reliable test infra, so I'm more than happy it is back!

Screenshot 2022-08-09 at 18 07 52

Running tests is done via swift package --disable-sandbox multi-node -c debug test (or just swift package --disable-sandbox multi-node test to run in -c release mode). The plugin automatically compiles and runs tests in individual processes.


This is how an example test-case looks like:

import DistributedActors
import MultiNodeTestKit

public final class ClusterCrashMultiNodeTests: MultiNodeTestSuite {
    public init() {}

    /// Spawns two nodes: first and second, and forms a cluster with them.
    ///
    /// ## Default execution
    /// Unlike normal unit tests, each node is spawned in a separate process,
    /// allowing is to kill nodes harshly by killing entire processes.
    ///
    /// It also eliminates the possibility of "cheating" and a node peeking
    /// at shared state, since the nodes are properly isolated as if in a real cluster.
    ///
    /// ## Distributed execution
    /// To execute the same test across different physical nodes pass a list of
    /// nodes to use when running the test, e.g.
    ///
    /// ```
    /// swift package multi-node test --deploy 192.168.0.101:22,192.168.0.102:22,192.168.0.103:22 // TODO
    /// ```
    ///
    /// Which will evenly spread the test nodes across the passed physical worker nodes.
    /// Actual network will be used, and it remains possible to kill off nodes and logs
    /// from all nodes are gathered automatically upon test failures.
    public enum Nodes: String, MultiNodeNodes {
        case first
        case second
    }

    public static func configureMultiNodeTest(settings: inout MultiNodeTestSettings) {
        settings.initialJoinTimeout = .seconds(5)
        settings.dumpNodeLogs = .always

        settings.installPrettyLogger = false
    }

    public static func configureActorSystem(settings: inout ClusterSystemSettings) {
        settings.logging.logLevel = .debug
    }

    public let testCrashSecondNode = MultiNodeTest(ClusterCrashMultiNodeTests.self) { multiNode in
        // A checkPoint suspends until all nodes have reached it, and then all nodes resume execution.
        try await multiNode.checkPoint("initial")

        // We can execute code only on a specific node:
        try await multiNode.on(.second) { second in
            try second.shutdown()
            return     
        }

        try await multiNode.runOn(.first) { first in
            try await first.cluster.waitFor(multiNode[.second], .down, within: .seconds(10))
        }
    }
}

@yim-lee
Copy link
Member

yim-lee commented Aug 9, 2022

Added CI pipeline to run integration tests.

@swift-server-bot test this please

@ktoso ktoso force-pushed the wip-multi-node-integration-test-infra branch 7 times, most recently from 0b0645f to afdd865 Compare August 15, 2022 11:59
@ktoso ktoso force-pushed the wip-multi-node-integration-test-infra branch from 3d8a323 to ed1d648 Compare August 16, 2022 06:16
@ktoso ktoso force-pushed the wip-multi-node-integration-test-infra branch from 1407dfc to 55fab7c Compare August 16, 2022 09:01
@ktoso ktoso force-pushed the wip-multi-node-integration-test-infra branch from 8ad5312 to e47c16c Compare August 16, 2022 12:07
@ktoso
Copy link
Member Author

ktoso commented Aug 16, 2022

Uncovering bugs in receptionist rewrite where ordering wasn't quite right anymore resulting in test hangs (and bad receptionist ordering bugs in the op-log).

Mostly stable locally but still stabilizing tests while here...

extension OrderedSet {
// FIXME(collections): implemented for real in https://github.com/apple/swift-collections/pull/159
@inlinable
internal func _filter(
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

missing operation in OrderedSet; upstreaming better impl: apple/swift-collections#159

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the uncheckedUniqueElements initializer would generally give better performance here:

let members = self.filter(isIncluded)
return OrderedSet(uncheckedUniqueElements: members)

This prevents multiple rehashings of the result while it is being constructed. Copying the contents twice would still be slower than what the package can do, but reducing hash operations would still be a considerable boost!

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice, thanks for the hint!

@ktoso ktoso force-pushed the wip-multi-node-integration-test-infra branch from 65f9497 to 7f7b3a4 Compare August 17, 2022 05:28
@ktoso ktoso force-pushed the wip-multi-node-integration-test-infra branch from 7f7b3a4 to d872d20 Compare August 17, 2022 05:35
@ktoso ktoso force-pushed the wip-multi-node-integration-test-infra branch from 0de5560 to 4645dfe Compare August 17, 2022 09:20
@ktoso ktoso enabled auto-merge (rebase) August 17, 2022 10:03
@ktoso ktoso disabled auto-merge August 17, 2022 10:03
@ktoso ktoso merged commit 45ebdf0 into apple:main Aug 17, 2022
@ktoso ktoso deleted the wip-multi-node-integration-test-infra branch August 17, 2022 10:07
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Re-Enable integration tests

3 participants