The Platform Checks framework is "write once, run everywhere": a single piece of test code that validates the operation of a particular feature is written once, and the framework then takes care of executing it in multiple scenarios, such as while restarting or upgrading pieces of the Materialize platform.
The Checks framework is mzcompose-based, with the test content expressed in testdrive fragments.
In the context of this framework:
A Check is an individual validation test.
A Scenario is the context in which it will be run, such as an upgrade.
An Action is an individual step that happens during a Scenario, such as stopping a particular container.
bin/mzcompose --find platform-checks run default --scenario=SCENARIO [--check=CHECK] [--execution-mode={alltogether,oneatatime}]
The list of Checks available can be found here
The list of Scenarios is here
The list of available Actions for use in Scenarios is here
In execution mode alltogether (the default), all Checks are run against a single Mz instance. This means more activity happens against that single instance, which can expose bugs that do not occur in isolation. At the same time, fewer adverse events can be injected into the test per unit of time.
In execution mode oneatatime, each Check is run in isolation against a completely fresh Mz instance. This allows more events to happen per unit of time, but any issues that require greater load or a greater variety of database objects will not show up.
If you experience a failure in the CI, the scenario that is being run is listed towards the top of the log:
Running `mzcompose run default --scenario=RestartEnvironmentdClusterdStorage`
Immediately before the failure, the Check that is being run is reported:
Running validate() from <materialize.checks.threshold.Threshold object at 0x7f982d1e2950>
You can check if the failure is reproducible in isolation by running just the Check in question against just the Scenario in question:
./mzcompose run default --scenario=RestartEnvironmentdClusterdStorage --check=Threshold
Sometimes, if a Mz container is unable to start, the Check where the failure is reported may not be the one that caused it; it may just be the first one to attempt to access clusterd processes that are no longer running.
You can also check whether the failure is related to restarts or upgrades in general by trying the "no-op" scenario, which performs neither:
./mzcompose run default --scenario=NoRestartNoUpgrade --check=...
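The triage steps above can be sketched as a small helper that reads a CI log and prints the command for reproducing the failure in isolation. It relies only on the two log line shapes quoted above; the helper name `repro_command` and the regular expressions are assumptions, and real logs may contain variations this sketch does not handle.

```python
import re

SCENARIO_RE = re.compile(r"--scenario=(\w+)")
CHECK_RE = re.compile(r"Running \w+\(\) from <[\w.]+\.(\w+) object at 0x[0-9a-f]+>")


def repro_command(log: str) -> str:
    """Build a command that reruns the last reported Check in isolation."""
    scenario = SCENARIO_RE.search(log)
    checks = CHECK_RE.findall(log)
    if scenario is None or not checks:
        raise ValueError("log does not contain a scenario and a Check")
    # The last Check reported before the failure is the first suspect.
    return f"./mzcompose run default --scenario={scenario.group(1)} --check={checks[-1]}"


log = """\
Running `mzcompose run default --scenario=RestartEnvironmentdClusterdStorage`
Running validate() from <materialize.checks.threshold.Threshold object at 0x7f982d1e2950>
"""
print(repro_command(log))
```

Replacing the scenario name with NoRestartNoUpgrade in the resulting command gives the "no-op" variant.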
When functionality is removed, added or changed, we should still keep testing the old syntax in older versions, which is especially relevant during an upgrade test. As an example, consider this testdrive fragment:
> CREATE SOURCE shared_cluster_storage_first_source
  IN CLUSTER shared_cluster_storage_first
  FROM LOAD GENERATOR COUNTER (SCALE FACTOR 0.01)
We want to remove support for passing the SCALE FACTOR since it does not have an effect anyway.
Naively we might want to change the testdrive fragment like this, simply removing the SCALE FACTOR:
> CREATE SOURCE shared_cluster_storage_first_source
  IN CLUSTER shared_cluster_storage_first
  FROM LOAD GENERATOR COUNTER
This makes the test case green, but has the unfortunate side effect that we don't test the migration of a load generator with scale factor property during an upgrade. Instead we should use the new syntax for newer versions of Materialize, and keep using the old syntax for older versions (see Testdrive documentation):
>[version<9200] CREATE SOURCE shared_cluster_storage_first_source
  IN CLUSTER shared_cluster_storage_first
  FROM LOAD GENERATOR COUNTER (SCALE FACTOR 0.01)
>[version>=9200] CREATE SOURCE shared_cluster_storage_first_source
  IN CLUSTER shared_cluster_storage_first
  FROM LOAD GENERATOR COUNTER
In this case the cutoff is version v0.92.0-dev, which is the current version of Materialize in development.
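The two guards are complementary: for any version, exactly one of the two fragments runs. The following toy evaluator illustrates this. It only understands the `version<N` / `version>=N` style of guard shown above (plus `<=` and `>`), and the names `guard_applies`, `OPS` and `GUARD_RE` are assumptions for this sketch; it does not reproduce Testdrive's actual parser. The integers 9100 and 9300 are arbitrary values on either side of the cutoff.

```python
import operator
import re

# Comparison operators used in the guards above, plus their obvious siblings.
OPS = {"<": operator.lt, "<=": operator.le, ">": operator.gt, ">=": operator.ge}
GUARD_RE = re.compile(r"version(<=|>=|<|>)(\d+)")


def guard_applies(guard: str, version: int) -> bool:
    """Return True if a guard such as 'version<9200' holds for `version`."""
    match = GUARD_RE.fullmatch(guard)
    if match is None:
        raise ValueError(f"unsupported guard: {guard}")
    op, bound = match.groups()
    return OPS[op](version, int(bound))


for version in (9100, 9200, 9300):
    old = guard_applies("version<9200", version)
    new = guard_applies("version>=9200", version)
    assert old != new  # exactly one of the two fragments runs
    print(version, "old syntax" if old else "new syntax")
```

When writing such a pair, it is worth checking the boundary value itself (9200 here) so that no version falls into both branches or into neither.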
A Check is a class deriving from Check that implements the following methods:
initialize() returns a single Testdrive fragment that is used to perform preparatory work in the check. This usually means creating any helper database objects that are separate from the feature under test, as well as creating the first instance of the feature being tested.
For example:
class Rollback(Check):
    def initialize(self) -> Testdrive:
        return Testdrive(
            dedent(
                """
                > CREATE TABLE rollback_table (f1 INTEGER);
                > INSERT INTO rollback_table VALUES (1), (2), (3), (4), (5), (6), (7), (8), (9), (10);
                """
            )
        )
manipulate() needs to return a list of two Testdrive fragments that further manipulate the object being tested. In this context, manipulation means ingesting more data into the object, ALTER-ing the object in some way, or creating more objects, including derived ones.
See the Tips section below for more information on writing test content for the manipulate() section.
The validate() section is run one or more times during the test in order to validate the operation of the feature under test. It is always run after all initialize() and manipulate() sections have run, so it should check that all actions and data ingestions that happened during those sections have been properly processed by the database.
Because the validate() section may be run more than once, it needs to be coded defensively. Any database objects created in this section must either be TEMPORARY or be explicitly dropped at the end of the section.
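Putting the three methods together, a complete Check might look like the sketch below. To keep the example self-contained, `Testdrive` and `Check` are minimal stand-ins defined here rather than the framework's real classes (an assumption), and `RowCount` with its `row_count_table` is a hypothetical Check invented for illustration.

```python
from textwrap import dedent


class Testdrive:
    """Stand-in for the framework's Testdrive wrapper (assumption)."""

    def __init__(self, fragment: str) -> None:
        self.fragment = fragment


class Check:
    """Stand-in for the framework's Check base class (assumption)."""


class RowCount(Check):  # hypothetical Check, for illustration only
    def initialize(self) -> Testdrive:
        # Create the first instance of the object under test.
        return Testdrive(dedent("""
            > CREATE TABLE row_count_table (f1 INTEGER);
            > INSERT INTO row_count_table VALUES (1);
        """))

    def manipulate(self) -> list[Testdrive]:
        # Exactly two fragments, one per Manipulate phase.
        return [
            Testdrive(dedent(fragment))
            for fragment in [
                """
                > INSERT INTO row_count_table VALUES (2);
                """,
                """
                > INSERT INTO row_count_table VALUES (3);
                """,
            ]
        ]

    def validate(self) -> Testdrive:
        # Read-only, so it is safe to run more than once.
        return Testdrive(dedent("""
            > SELECT COUNT(*) FROM row_count_table;
            3
        """))
```

The validate() fragment expects all three rows, which confirms that the inserts from both manipulate() phases were processed.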
All checks are located in the misc/python/materialize/checks/all_checks directory, functionally grouped in files. A Check that performs the creation of a particular type of resource is usually placed in the same file as the Check that validates the deletion of the same resource type.
To ignore a Check, annotate it with @disabled(ignore_reason="due to #...").
If a Check performs non-idempotent actions against third-party services, such as ingesting non-UPSERT data into a Kafka or Postgres source, it needs to be annotated with @external_idempotence(False). Such a Check will not be run in Scenarios that may need to run a manipulate() phase twice, such as some Backup+Restore scenarios.
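The effect of the annotation can be pictured with the following sketch. The decorator, the `externally_idempotent` attribute, the `runnable_checks` helper and both example Checks are stand-ins written for this illustration (assumptions); the framework's actual decorator and filtering logic may be implemented differently.

```python
def external_idempotence(is_idempotent: bool):
    """Stand-in decorator (assumption): records the flag on the class."""
    def decorate(cls):
        cls.externally_idempotent = is_idempotent
        return cls
    return decorate


class Check:
    externally_idempotent = True  # hypothetical default


@external_idempotence(False)
class KafkaAppend(Check):  # hypothetical Check ingesting non-UPSERT data
    pass


class TableInsert(Check):  # hypothetical Check touching only Materialize
    pass


def runnable_checks(checks, scenario_repeats_manipulate: bool):
    """Drop non-idempotent Checks when a manipulate() phase may run twice."""
    if not scenario_repeats_manipulate:
        return list(checks)
    return [check for check in checks if check.externally_idempotent]


all_checks = [KafkaAppend, TableInsert]
print([check.__name__ for check in runnable_checks(all_checks, True)])
```

In this sketch a Scenario that never repeats a phase still runs both Checks; only the repeating Scenario skips the annotated one.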
A Scenario is a list of sequential Actions that the framework will perform one after another:
class RestartEntireMz(Scenario):
    def actions(self) -> List[Action]:
        return [
            StartMz(),
            Initialize(self.checks),
            RestartMzAction(),
            Manipulate(self.checks, phase=1),
            RestartMzAction(),
            Manipulate(self.checks, phase=2),
            RestartMzAction(),
            Validate(self.checks),
        ]
A Scenario always contains 5 mandatory steps: starting a Mz instance, and the execution of the initialization, manipulation (twice) and validation of all participating Checks. Any Actions that restart or upgrade containers are then interspersed between those steps. Two Manipulate sections are run so that more complex, drawn-out upgrade scenarios can be tested while ensuring that database activity happens during every stage of the upgrade.
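When writing a new Scenario, it can help to confirm that the five mandatory steps are all present. The sketch below does this with a `Step` dataclass standing in for real Action objects (an assumption), and it additionally assumes the steps must appear in the order used by the example above. The helper name `has_mandatory_steps` is hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Step:
    """Stand-in for an Action (assumption); `phase` is only used by Manipulate."""
    name: str
    phase: Optional[int] = None


MANDATORY = [
    Step("StartMz"),
    Step("Initialize"),
    Step("Manipulate", 1),
    Step("Manipulate", 2),
    Step("Validate"),
]


def has_mandatory_steps(actions: list[Step]) -> bool:
    """True if the mandatory steps occur in `actions` in the required order."""
    remaining = iter(actions)
    # `step in remaining` advances the iterator, so order is enforced.
    return all(step in remaining for step in MANDATORY)


restart_entire_mz = [
    Step("StartMz"), Step("Initialize"), Step("RestartMzAction"),
    Step("Manipulate", 1), Step("RestartMzAction"),
    Step("Manipulate", 2), Step("RestartMzAction"), Step("Validate"),
]
print(has_mandatory_steps(restart_entire_mz))
```

Interspersed Actions such as RestartMzAction are simply skipped over by the check, which matches the way a Scenario is assembled.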
The list of available Actions is here.
An Action is a short mzcompose fragment that operates on the Materialize instance:
class DropCreateDefaultReplica(Action):
    def execute(self, c: Composition) -> None:
        c.sql(
            """
            ALTER CLUSTER quickstart SET (REPLICATION FACTOR 0);
            ALTER CLUSTER quickstart SET (SIZE '1', REPLICATION FACTOR 1);
            """
        )
The Action's execute() method gets passed a Composition object, so that the Action can perform any operation against the mzcompose composition. The methods of the Composition class are listed here.
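Because an Action only talks to the object it is handed, its behavior can be inspected without a running composition. The sketch below passes a recording stand-in to execute(). `RecordingComposition` and the minimal `Action` base class are assumptions written for this example; only the `sql()` call is taken from the Action shown above.

```python
class RecordingComposition:
    """Stand-in for Composition (assumption): records SQL instead of running it."""

    def __init__(self) -> None:
        self.statements: list[str] = []

    def sql(self, sql: str) -> None:
        # Split on ';' so each statement can be inspected separately.
        self.statements += [s.strip() for s in sql.split(";") if s.strip()]


class Action:
    """Stand-in for the framework's Action base class (assumption)."""

    def execute(self, c: RecordingComposition) -> None:
        raise NotImplementedError


class DropCreateDefaultReplica(Action):
    def execute(self, c: RecordingComposition) -> None:
        c.sql(
            """
            ALTER CLUSTER quickstart SET (REPLICATION FACTOR 0);
            ALTER CLUSTER quickstart SET (SIZE '1', REPLICATION FACTOR 1);
            """
        )


c = RecordingComposition()
DropCreateDefaultReplica().execute(c)
print(c.statements)
```

The naive split on `;` is adequate for this example only; it would break on statements that contain a semicolon inside a string literal.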
A good Check will attempt to exercise the full breadth of SQL syntax in order to make sure that the system catalog contains objects with all sorts of definitions that we then expect to survive an upgrade and operate properly afterwards.
For example, for database objects with a WITH SQL clause, every possible element that could be put in the WITH list should be exercised.
When writing Checks, consider the network interactions between the individual parts of Materialize and make sure they are as comprehensively exercised as possible.
For objects that are serialized over the network, such as query plans, make sure that all relevant Protobuf messages (and nested or optional parts thereof) will be generated and transmitted.
For objects that involve ingestion or sinking, make sure that plenty of data will flow. Insert or ingest additional rows of data throughout the manipulate() steps.
A good Check for a particular database object creates more than one instance of that object and makes sure that the separate instances are meaningfully different. For example, for materialized views, one would need a view that depends on tables and another that depends on Kafka sources. The behavior of materialized views with and without an index is also different, so both types need to be represented.
If you are testing, say, materialized views, make sure that such views are created not only during initialize() but also during validate(). This will confirm that materialized views work not only if created on the freshly-started database, but also on a database that is in the process of being upgraded or restarted.
If the feature or database object you are testing depends on other objects, make sure to create new objects that depend on both new and old objects in the manipulate() sections. For example, materialized views depend on tables, so a comprehensive check will attempt to create materialized views not only on tables that were created in the initialize() section, but also on tables that were created later in the execution. This will confirm that the database is able to perform any type of DDL in the face of restarts or upgrades.
When you add new features or change SQL syntax, platform checks can handle this by comparing against the base_version.
In a Check this can be done directly; see for example UUID (materialize/misc/python/materialize/checks/uuid.py, lines 17 to 19 in 42c3c8b).
In an Action, use scenario.base_version; see for example UseClusterdCompute (materialize/misc/python/materialize/checks/mzcompose_actions.py, lines 77 to 100 in 42c3c8b).
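As a rough illustration of comparing against base_version inside a Check, the sketch below picks between old and new syntax. Everything apart from the attribute name `base_version` is an assumption: the version is modeled as a plain tuple rather than the framework's own version type, `Check` is a stand-in base class, `CounterSource` and `CUTOFF` are hypothetical, and initialize() returns a string instead of a Testdrive object to keep the example short.

```python
from textwrap import dedent

CUTOFF = (0, 92, 0)  # hypothetical cutoff, mirroring the v0.92.0 example


class Check:
    """Stand-in base class (assumption): only stores the base version."""

    def __init__(self, base_version: tuple[int, int, int]) -> None:
        self.base_version = base_version


class CounterSource(Check):  # hypothetical Check
    def initialize(self) -> str:
        # Base versions older than the cutoff still get the old syntax.
        options = " (SCALE FACTOR 0.01)" if self.base_version < CUTOFF else ""
        return dedent(f"""
            > CREATE SOURCE counter_source
              FROM LOAD GENERATOR COUNTER{options}
        """)


print(CounterSource((0, 91, 0)).initialize())
print(CounterSource((0, 92, 0)).initialize())
```

Compared with a version guard inside the testdrive fragment, this approach decides in Python which fragment to generate, which can be convenient when whole objects exist only in newer versions.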