Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feature/generic testing with inheritance set up #171

Merged
merged 24 commits into from
Feb 21, 2024

Conversation

TommyMatthews
Copy link
Contributor

@TommyMatthews TommyMatthews commented Jan 31, 2024

This pull request POCs an alternative framework for generic testing. In this approach tests for all inherited behaviours are inherited from base test classes, the structure of which roughly mirrors the inheritance structure of the tubular package. In this PR I have incorporated BaseTransformer, DataFrameMethodTransformer, BaseMappingTransformer, BaseMappingTransformerMixin and MappingTransformer into the generic testing framework. I believe this covers all the possible complexity that would be encountered extending this framework to the rest of the package.

Whilst this first PR is pending I have not updated the process docs on confluence, but will do with all relevant info from here if this PR is completed.

This is a big PR - to get an overview of the approach look at:

  • conftest.py (info on all transformers allowing them to be automatically tested)
  • base_tests.py (base generic classes to inherit)
  • specific_columns_type_tests.py (name pending - generic init test interfaces depending on column arg type) have now deleted these and moved them to base_tests.py as it felt cleaner
  • test_BaseTransformer.py
  • test_DataFrameMethodTransformer.py
  • test_BaseMappingTransformer.py
  • test_BaseMappingTransfomMixin.py
  • test_MappingTransformer.py

I have detailed the other things done in this PR at the bottom.

Quick review of approach:

Pros:

  • All tests for a given transformer a run in the same file
  • This approach requires very minimal code to add a new transformer to
  • Transformer specific test files are pretty empty, only containing the behaviour tests relevant to that transformer. This will help us focus on whether the important behaviours are being tested.
  • The practice of having tests inherited strongly discourages introducing conditional logic into the generic testing suite to get transformers to pass the base tests.

Cons:

  • No simple logic for which base test classes to inherit, especially in the case of init and where mixins are present.
  • Slightly higher software skill level required to add transformer to test suite
  • Locating the failing test may require checking 2+ test files
  • No automatic check that generic tests are being run (compared to loading all transformers into one master generic test file and testing them)

More detail on approach and its nuances:
The first nuance of the approach is that any test classes that need to be inherited cannot start with "Test" as is convention for pytest because otherwise all inherited tests will be run twice by virtue of pytest detecting the imported class as a class of tests to run. Instead, where some tests for a class will need to be inherited from a given test class file by others, another test class needs to be present in the test file which will actually run the tests.

One complexity is that the tests for the base transformer init behaviour need to be accessed through a subclass of GenericInitTests depending on the structure of the columns input for the tested transformer. This needs to be carefully flagged to a user. The BaseTransformer also has other methods than the classic three with behaviours that need testing for every transformer. For now, to reduce clutter in bottom level test files, I have encapsulated these in the OtherBaseBehaviourTests class, which can be inherited and run in each transformer test file. The user still has the option to overwrite any of the individual tests in here should any of these behaviours be altered. To alter a behaviour a user simply has make a test method in the child class with the same name as the one they wish to override.

Another (ultimately obvious) feature is that expected output checks should never be inherited, and should always sit in the TestTransform class of a given test file. In the example of BaseMappingTransformMixin, the only additional tests required other than the generic ones are expected output tests, and so this mixin class does not need to be inherited by any classes that use it (as long as they are correctly using it as a mixin). This is a good example of where choosing the correct set of test classes to inherit requires some thought and may not be obvious to a user.

Other changes made in this PR:

  • The consistent args part of this project required removing the columns = None option from BaseTransformer. This broke a huge number of tests, which I have fixed. Where the broken test was a soon to be redundant implementation test, I deleted it. This makes up most of the changes to other test files.
  • With the above change, the columns_set_or_check method became redundant. I have removed this from the codebase. This mainly affected nominal transformers and their tests, as well the base transformer tests.
  • Upon getting some errors trying to test MappingTransformer with the generic tests I realised it's init method was not adding anything over the BaseMappingTransformer init, so I deleted it. All tests still passed. I have also had to add a BaseTransformer.transform() call at the start of the transform method so that the correct data checks are called before the input data is used. I am suspicious that we are not using Mixin classes correctly here.

@TommyMatthews
Copy link
Contributor Author

Hmm the tests pass on my machine

@TommyMatthews TommyMatthews changed the title Feature/generic with inheritance set up Feature/generic testing with inheritance set up Feb 5, 2024
adamsardar
adamsardar previously approved these changes Feb 20, 2024
Copy link
Contributor

@adamsardar adamsardar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a great bit of work and is a good example of rearchitecting the application such that it is easier to extend and maintain.

I think that this is a step on a journey towards a better use of objects to afford easier testing, but it is a big step on that journey. thanks.

No need for changes to the code (other than a couple of doc typos). I'm approving because there's nothing show-stopping.

CHANGELOG.rst Outdated Show resolved Hide resolved
CHANGELOG.rst Outdated Show resolved Hide resolved
tests/base/test_DataFrameMethodTransformer.py Show resolved Hide resolved


@pytest.fixture()
def minimal_attribute_dict():
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

S: This is the only bit of the PR that I don't love straight away - it's a very big, nested dictionary and it feels as though it could become a bit of a hairball. Changes to it could break several tests, meaning that its structure is tightly coupled across the tests.

I don't have a better suggestion to keep the MVP idea going, but on a longer timescale I wonder if a collection of dataclasses might be better suited rather than a 'god object' dictionary with all test configurations.

Copy link
Contributor Author

@TommyMatthews TommyMatthews Feb 21, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I agree I don't like it also. Do you mean a dataclass for each transformer here?

Another solution I can see would be to define an arguments dictionary at the level of each transformer file in the setup_class method (where we also put the transformer name)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I agree I don't like it also. Do you mean a dataclass for each transformer here?

Yes, to keep it most modular. But open to alternatives.

Copy link
Contributor

@davidhopkinson26 davidhopkinson26 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good

@TommyMatthews TommyMatthews merged commit 7b21a49 into main Feb 21, 2024
1 check passed
@davidhopkinson26 davidhopkinson26 deleted the feature/generic_with_inheritance_set_up branch February 21, 2024 15:33
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants