Skip to content

Conversation

@MImmesberger
Copy link
Collaborator

@MImmesberger MImmesberger commented Mar 17, 2025

This PR is based on #805 and turns the tests back on which were previously skipped.

Todos:

  • Turn all tests on
  • Rename test dirs

@hmgaudecker hmgaudecker mentioned this pull request Mar 17, 2025
7 tasks
Copy link
Collaborator

@hmgaudecker hmgaudecker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a couple of small suggestions! Might be a good idea to explain (very briefly!!!) in a comment here why we are moving from two classes (PolicyTestSet and PolicyTestData) to a single one.

@MImmesberger
Copy link
Collaborator Author

Just to be sure: We want to turn name_clashes for dags.dag_tree.create_dag_tree and dags.concatenate_functions_tree off, right? With 'raise' it throws an error because some of our namespaces start with the name of another namespace and have the same leaf names, e.g. sozialversicherung__rente__altersrente__langjährig and sozialversicherung__rente__altersrente both have altersgrenze_vorzeitig as leafs.

Tbh, I'm not 100% sure what this check is supposed to do.

@hmgaudecker
Copy link
Collaborator

Just to be sure: We want to turn name_clashes for dags.dag_tree.create_dag_tree and dags.concatenate_functions_tree off, right? With 'raise' it throws an error because some of our namespaces start with the name of another namespace and have the same leaf names, e.g. sozialversicherung__rente__altersrente__langjährig and sozialversicherung__rente__altersrente both have altersgrenze_vorzeitig as leafs.

Tbh, I'm not 100% sure what this check is supposed to do.

Can't check in detail right now, but IIRC this should mirror the columns_overriding_functions from earlier GETTSIM versions. It definitely should not raise when the same leaf name is used in different namespaces. Could you push a version that raises such an error and point me to it?

@MImmesberger
Copy link
Collaborator Author

The current version does raise it. Run
pytest src/_gettsim_tests/test_aggregate_by_p_id.py.

(Maybe the test path is incorrect typed it out of memory, current not on my machine)

@MImmesberger
Copy link
Collaborator Author

Just a couple of small suggestions! Might be a good idea to explain (very briefly!!!) in a comment here why we are moving from two classes (PolicyTestSet and PolicyTestData) to a single one.

So before we did the following:

  1. Create an PolicyTestData object that contains the content of an entire test file
  2. Store all PolicyTestData objects in a list
  3. Create an PolicyTestSet object that takes the list of PolicyTestData. Here, every test file is broken down into multiple (input_data x one expected output column) tuples via the parametrize_args property of the class.

I thought this is a bit complicated. We now have only one class (PolicyTest) which contains exactly what we need for one test (input columns, one expected output column). Everything else is done outside of classes (especially this parametrize functionality).

@hmgaudecker
Copy link
Collaborator

pytest src/_gettsim_tests/test_aggregate_by_p_id.py

Just a couple of small suggestions! Might be a good idea to explain (very briefly!!!) in a comment here why we are moving from two classes (PolicyTestSet and PolicyTestData) to a single one.

So before we did the following:

1. Create an `PolicyTestData` object that contains the content of an entire test file

2. Store all `PolicyTestData` objects in a list

3. Create an `PolicyTestSet` object that takes the list of `PolicyTestData`. Here, every test file is broken down into multiple  (input_data x one expected output column) tuples via the `parametrize_args` property of the class.

I thought this is a bit complicated. We now have only one class (PolicyTest) which contains exactly what we need for one test (input columns, one expected output column). Everything else is done outside of classes (especially this parametrize functionality).

Thanks! Let's see when things are back how speed looks like. I guess if there are many tests with 5 output columns, each previously ran once, now they will run 5 times? Might not be a big deal though.

@MImmesberger
Copy link
Collaborator Author

No, a test did run once per output column before as well. So hopefully no decrease in speed!

@hmgaudecker
Copy link
Collaborator

Tbh, I'm not 100% sure what this check is supposed to do.

Okay, simply too much time has passed since and it seems like I misremembered the resolution we had found on the following. Say we have:

einkommensteuer
  einkünfte
    aus_selbstständiger_arbeit
    aus_vermietung_und_verpachtung

We wanted to be able to write

einkünfte__aus_selbstständiger_arbeit

inside einkommensteuer, i.e., use relative paths anywhere inside the tree. This means, however, that we needed to make sure that this is unique anywhere.

We wrote #805 like this option did not exist; I do think now that this is a good thing even if we need to disable E501 globally.

I set name_clashes="ignore" and this will definitely be the behaviour we want. Will probably remove that option in dags.

@hmgaudecker
Copy link
Collaborator

durchschnittliche_entgeltpunkte_zuschlag

I think the convention is mean_entgeltpunkte_zuschlag, in any case we should be consistent!

@MImmesberger
Copy link
Collaborator Author

We're now back to using qualified names everywhere. All tests pass -- and are much faster now 🙂

Copy link
Collaborator

@hmgaudecker hmgaudecker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Beautiful! Just some tiny changes, mostly a suggestion for the global namespace.

@MImmesberger MImmesberger linked an issue Mar 23, 2025 that may be closed by this pull request
@MImmesberger
Copy link
Collaborator Author

MImmesberger commented Mar 23, 2025

I have adjusted the namespaces according to you suggestions. Needed to change the infrastructure a tiny bit to allow for aggregation targets to be in different namespaces than aggregation sources, but I think this is something we should do anyhow.

Copy link
Collaborator

@hmgaudecker hmgaudecker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Excellent! Thanks!

Copy link
Collaborator

@hmgaudecker hmgaudecker Mar 23, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So these can't just live in the global namespace

(I keep forgetting about taxes/transfers. I guess this is the global namespace, in a way)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, they're currently in the global namespace!

@MImmesberger MImmesberger merged commit b3e19cf into namespaces-renamings Mar 23, 2025
5 of 7 checks passed
@MImmesberger MImmesberger deleted the namespaces-turn-tests-on branch March 23, 2025 14:43
MImmesberger added a commit that referenced this pull request Mar 23, 2025
### What problem do you want to solve?

This PR provides the necessary renamings of taxes and transfers
functions for #804.

ToDo:
- [x] Create new directory structure
- [x] Rename all function arguments
- [x] Set namespace of basic input variables
- [x] Update `pyproject.toml` to reflect new file structure
- [x] Make sure tests run (#841)
- [x] `kinderfreibetragempfänger` $\rightarrow$
`kinderfreibetragsempfänger`
- [x] Link issue #842 in relevant docstrings

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Hans-Martin von Gaudecker <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

3 participants