-
Notifications
You must be signed in to change notification settings - Fork 15
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Feature/generic testing with inheritance set up #171
Conversation
…transformer.py, removes None functionality from BaseTransformer including editing columns_set_or_check method
…, in BaseTransformer ew_column_name -> new_column_names in DataFrameMethodTransformer
…for BaseTransformer and DataFrameMethodTransformer
… testing framework
…moves mapping transformer init
… generic testing framework
Hmm the tests pass on my machine |
…p missing inherit tests, adds a BaseTransformer.transform() call to mapping transformer so that the relevant input checks are called
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a great bit of work and is a good example of rearchitecting the application such that it is easier to extend and maintain.
I think that this is a step on a journey towards a better use of objects to afford easier testing, but it is a big step on that journey. thanks.
No need for changes to the code (other than a couple of doc typos). I'm approving because there's nothing show-stopping.
|
||
|
||
@pytest.fixture() | ||
def minimal_attribute_dict(): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
S: This is the only bit of the PR that I don't love straight away - it's a very big, nested dictionary and it feels as though it could become a bit of a hairball. Changes to it could break several tests, meaning that its structure is tightly coupled across the tests.
I don't have a better suggestion to keep the MVP idea going, but on a longer timescale I wonder if a collection of dataclasses might be better suited rather than a 'god object' dictionary with all test configurations.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah I agree I don't like it also. Do you mean a dataclass for each transformer here?
Another solution I can see would be to define an arguments dictionary at the level of each transformer file in the setup_class method (where we also put the transformer name)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah I agree I don't like it also. Do you mean a dataclass for each transformer here?
Yes, to keep it most modular. But open to alternatives.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good
This pull request POCs an alternative framework for generic testing. In this approach tests for all inherited behaviours are inherited from base test classes, the structure of which roughly mirrors the inheritance structure of the tubular package. In this PR I have incorporated BaseTransformer, DataFrameMethodTransformer, BaseMappingTransformer, BaseMappingTransformerMixin and MappingTransformer into the generic testing framework. I believe this covers all the possible complexity that would be encountered extending this framework to the rest of the package.
Whilst this first PR is pending I have not updated the process docs on confluence, but will do with all relevant info from here if this PR is completed.
This is a big PR - to get an overview of the approach look at:
I have detailed the other things done in this PR at the bottom.
Quick review of approach:
Pros:
Cons:
More detail on approach and its nuances:
The first nuance of the approach is that any test classes that need to be inherited cannot start with "Test" as is convention for pytest because otherwise all inherited tests will be run twice by virtue of pytest detecting the imported class as a class of tests to run. Instead, where some tests for a class will need to be inherited from a given test class file by others, another test class needs to be present in the test file which will actually run the tests.
One complexity is that the tests for the base transformer init behaviour need to be accessed through a subclass of
GenericInitTests
depending on the structure of the columns input for the tested transformer. This needs to be carefully flagged to a user. The BaseTransformer also has other methods than the classic three with behaviours that need testing for every transformer. For now, to reduce clutter in bottom level test files, I have encapsulated these in theOtherBaseBehaviourTests
class, which can be inherited and run in each transformer test file. The user still has the option to overwrite any of the individual tests in here should any of these behaviours be altered. To alter a behaviour a user simply has make a test method in the child class with the same name as the one they wish to override.Another (ultimately obvious) feature is that expected output checks should never be inherited, and should always sit in the
TestTransform
class of a given test file. In the example of BaseMappingTransformMixin, the only additional tests required other than the generic ones are expected output tests, and so this mixin class does not need to be inherited by any classes that use it (as long as they are correctly using it as a mixin). This is a good example of where choosing the correct set of test classes to inherit requires some thought and may not be obvious to a user.Other changes made in this PR:
columns_set_or_check
method became redundant. I have removed this from the codebase. This mainly affected nominal transformers and their tests, as well the base transformer tests.