Merged
Changes from 2 commits
Commits
124 commits
ddea097
Alphabetically sort test file contents.
MImmesberger Mar 17, 2025
3644285
Break up into tree.
MImmesberger Mar 17, 2025
67778c1
Revert "Break up into tree."
MImmesberger Mar 17, 2025
2d1c53d
Revert "Alphabetically sort test file contents."
MImmesberger Mar 17, 2025
4628269
Merge branch 'namespaces-renamings' into namespaces-turn-tests-on
MImmesberger Mar 17, 2025
f92550a
Order test content alphabetically.
MImmesberger Mar 17, 2025
6d5b2aa
Break up into tree.
MImmesberger Mar 17, 2025
31a61aa
Make all unskipped tests run.
MImmesberger Mar 17, 2025
73aef31
Draft new PolicyTest class.
MImmesberger Mar 17, 2025
298cd94
Apply np.array for leafs.
MImmesberger Mar 17, 2025
2dc72d8
Dont destroy group_by_functions.
MImmesberger Mar 17, 2025
5191bb7
Dont round group_by_functions. Dont look for source col for count agg…
MImmesberger Mar 17, 2025
5a4a47b
Fix some wrong qualified names in yaml. Change from entgeltpunkte_zus…
MImmesberger Mar 17, 2025
014c740
Ignore name clashes.
hmgaudecker Mar 17, 2025
8796304
Remove nice names, replace by tuples / tree paths.
hmgaudecker Mar 17, 2025
bf59dc0
Revert "Fix some wrong qualified names in yaml. Change from entgeltpu…
MImmesberger Mar 17, 2025
64d70e4
Revert "Break up into tree."
MImmesberger Mar 17, 2025
a83dfe3
Temporary solution to work with qualified names in test data.
MImmesberger Mar 17, 2025
108d154
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Mar 17, 2025
2012c97
Reapply reverted commit: Fix qualified names.
MImmesberger Mar 17, 2025
c4d7be7
Make it possible to use qualified name as source col for aggregation …
MImmesberger Mar 17, 2025
947a9e9
Revert "Make it possible to use qualified name as source col for aggr…
MImmesberger Mar 18, 2025
19d9758
Partly keep changes from last commit: Proper tests of aggregation fun…
MImmesberger Mar 18, 2025
7ca645c
Remove 'nice names' from test.
MImmesberger Mar 18, 2025
4537f53
Use dags commit with Unicode regex.
MImmesberger Mar 18, 2025
8463968
Handle data types for tests.
MImmesberger Mar 18, 2025
c2c9697
Add dates_active to Grundrente now because we apply rounding before f…
MImmesberger Mar 18, 2025
6596b22
Make aggregate_by_p_id run.
MImmesberger Mar 18, 2025
6c1a84c
Move back to dags main branch.
MImmesberger Mar 18, 2025
b364dcf
Use pd.Series as test inputs.
MImmesberger Mar 18, 2025
96814f5
Updated optree.tree_map (0.14.1) does has tree as positional-only arg…
hmgaudecker Mar 18, 2025
9538a32
Use wohnen namespace for Heizkosten and Bruttokaltmiete.
MImmesberger Mar 18, 2025
27d830a
Use correct argument names for soli and est in alg2.
MImmesberger Mar 18, 2025
15b7c9f
Use correct function name in vorrangprüfungen.
MImmesberger Mar 18, 2025
d104d4b
Use pd Series for data checks, use np.array for TT calculations.
MImmesberger Mar 18, 2025
9ffa7df
Update precision and use kind instead of kindergeld__grundsätzlich_an…
MImmesberger Mar 18, 2025
733b130
Some more typos in function arg definitions.
MImmesberger Mar 18, 2025
f1461f9
Add test that annotations are applied to derived function. Currently …
MImmesberger Mar 18, 2025
d677bbc
Activate all tests, draft of new test file for each.
MImmesberger Mar 18, 2025
75d0d02
Fix interface error matches.
MImmesberger Mar 18, 2025
4e76a2f
Use dags function directly.
hmgaudecker Mar 19, 2025
50874c9
Don't use as a decorator.
hmgaudecker Mar 19, 2025
49900a6
Time to turn mypy on...
hmgaudecker Mar 19, 2025
6b289eb
vertrauenss -> vertrauensschutzprüfung
MImmesberger Mar 19, 2025
e36c630
Fix bug that caused data conversion to fail for all DerivedFunctions.
MImmesberger Mar 19, 2025
d20b9ed
Revert to old implementation of Abgeltungssteuer, see issue 843.
MImmesberger Mar 19, 2025
7a08303
Add jüngstes_kind_oder_mehrling function and fix some typos.
MImmesberger Mar 19, 2025
50e806c
Use correct file name for demographics test.
MImmesberger Mar 19, 2025
027d347
Fix some typos in EM Renten module.
MImmesberger Mar 19, 2025
9da85ce
Add rounding specs if missing.
MImmesberger Mar 19, 2025
a3b4d7b
Use the most recent dags, unify flatten operations (#844)
hmgaudecker Mar 19, 2025
2cf23fa
Remove _m _y that refer to counting months or years.
MImmesberger Mar 19, 2025
413d06f
test.test_file -> test.path.
MImmesberger Mar 19, 2025
3a7668c
Rename Anrechnungszeit functions.
MImmesberger Mar 19, 2025
6b84d8b
Forgot to change keyword to path in policy test utils.
MImmesberger Mar 19, 2025
daa8e6f
Some typos in Rente module. Also set decimal tolerance to 1.
MImmesberger Mar 19, 2025
1e51b97
Add some more rounding specs.
MImmesberger Mar 19, 2025
a95b4be
Use monthly Kapitalerträge in Grundrente.
MImmesberger Mar 19, 2025
2cdd2b3
Add _y_sn to absetzbare_betreuungskosten.
MImmesberger Mar 19, 2025
47578ff
Fix Wohngeld module typos.
MImmesberger Mar 19, 2025
7c07cd9
Fix Vorsorgeaufwendungen.
MImmesberger Mar 19, 2025
dfb3926
Unterhaltsvorschuss tests.
MImmesberger Mar 19, 2025
edd8417
Fix Unterhalt module.
MImmesberger Mar 19, 2025
28a50c1
Sozialv Beiträge.
MImmesberger Mar 19, 2025
ca165fb
Use correct namespace for sozialversicherung__rente__wartezeit_45_jah…
MImmesberger Mar 19, 2025
02ddf28
Reactivate all policy env tests.
MImmesberger Mar 19, 2025
60dc01a
Activate Lohnsteuer tests.
MImmesberger Mar 19, 2025
34497cb
Delete Kinderbonus targets if Kinderbonus is not active.2
MImmesberger Mar 19, 2025
68c0dbb
Fix params key in Grunds im Alter.
MImmesberger Mar 19, 2025
b82b8ed
Make all groupings tests run.
MImmesberger Mar 19, 2025
2187879
Allow for non-typed functions, e.g. if provided by user.
MImmesberger Mar 19, 2025
4725eea
Fix Grundrente tests.
MImmesberger Mar 19, 2025
48f5d6f
Rename all tests dirs.
MImmesberger Mar 19, 2025
dd2a7e4
Make full_taxes_transfers run.
MImmesberger Mar 20, 2025
0eb4d33
Simplify test structure.
MImmesberger Mar 20, 2025
d0c7b36
Triviality: Make sure dt is only ever used for dags.tree
hmgaudecker Mar 20, 2025
0f6274b
Some small review comments and move Abgeltungssteuer into Einnkommens…
MImmesberger Mar 20, 2025
3eae81a
Go through all callable names in Sozialversicherung dir.
MImmesberger Mar 20, 2025
27a4c7a
Restructure Vorsorgeaufwendungen.
MImmesberger Mar 20, 2025
284f61a
Merge branch 'namespaces-turn-tests-on' of https://github.com/iza-ins…
MImmesberger Mar 20, 2025
6e8f768
Use _y again in Vorsorgeaufwendungen.
MImmesberger Mar 20, 2025
10d5931
Some indentations in test files.
MImmesberger Mar 20, 2025
5d7641f
Update Vorsorgeaufwand structure.
MImmesberger Mar 20, 2025
176d1bb
Forgot renamings in one module.
MImmesberger Mar 20, 2025
fe5b6d2
Update zve test file name.
MImmesberger Mar 20, 2025
7355a4d
Rename function that shouldnt fall under time conversion.
MImmesberger Mar 20, 2025
c2cfaf8
Update docstrings.
MImmesberger Mar 20, 2025
0472406
Use updated dags.
hmgaudecker Mar 20, 2025
fc359b0
Move p_id and hh_id to global namespace.
hmgaudecker Mar 21, 2025
79734c4
Final updates so that tests pass (80% checked). Make sure we can pass…
hmgaudecker Mar 21, 2025
d7d0a39
Formatting.
hmgaudecker Mar 21, 2025
d1a7e1d
Add profiler, tiny updates to dags.
hmgaudecker Mar 21, 2025
9677811
Update compute_taxes_and_transfers docstring.
MImmesberger Mar 22, 2025
002a9f8
Update dags version.
MImmesberger Mar 22, 2025
3913a79
Add test that checks namespaces of derived functions.
MImmesberger Mar 22, 2025
01b51cb
Update dags version.
MImmesberger Mar 22, 2025
ab3b121
Start to work on rewriting combine_functions_in_tree module.
MImmesberger Mar 22, 2025
a2d9001
Continue working on rewritten draft.
MImmesberger Mar 22, 2025
5e9d4a9
Add annotations to aggregation functions.
MImmesberger Mar 22, 2025
261cfa2
Draft of interface.py.
MImmesberger Mar 22, 2025
26bc513
New separate classes for aggregations and time conversion functions.
MImmesberger Mar 22, 2025
5f86586
Allow for data source cols for derived functions. Delete time convers…
MImmesberger Mar 22, 2025
74af8b2
Update and add combine_function_in_tree tests.
MImmesberger Mar 23, 2025
b01c001
combine_functions_in_tree -> combine_functions.
MImmesberger Mar 23, 2025
d147444
Make tests in shared.py run.
MImmesberger Mar 23, 2025
148ac21
Update rounding tests.
MImmesberger Mar 23, 2025
f830e01
Update interface tests.
MImmesberger Mar 23, 2025
632e666
Fix small bug in _partial_parameters_to_functions.
MImmesberger Mar 23, 2025
628ac95
Fix isinstance checks.
MImmesberger Mar 23, 2025
9f543c8
Fix annotations check.
MImmesberger Mar 23, 2025
5e4f67a
Renamings to make function interface clearer.
MImmesberger Mar 23, 2025
702cd70
Correct typo in test file and revert last commit.
MImmesberger Mar 23, 2025
923c0d7
Use QualName terminology from dags package, rename gettsim_typing to …
hmgaudecker Mar 23, 2025
78bf648
Simplify typing.
hmgaudecker Mar 23, 2025
2bbf183
Apply small renaming suggestions.
MImmesberger Mar 23, 2025
842b77b
Make TYPES_INPUT_VARIABLES a nested structure.
MImmesberger Mar 23, 2025
ccc9ab6
Renamings related to Rente.
MImmesberger Mar 23, 2025
9f21822
Test whether loader can handle policy functions in top-level namespace.
MImmesberger Mar 23, 2025
9d7a640
Apply reordering of namespaces suggestions.
MImmesberger Mar 23, 2025
4707dfb
Fix some namespace typos.
MImmesberger Mar 23, 2025
14508a1
Fix infrastructure to allow for aggregations sources from outside of …
MImmesberger Mar 23, 2025
2ce52a0
Apply all start_date decorators in Grundrente module.
MImmesberger Mar 23, 2025
5efedc1
Formatting, comment.
hmgaudecker Mar 23, 2025
97447d3
Cosmetics.
hmgaudecker Mar 23, 2025
2 changes: 1 addition & 1 deletion src/_gettsim/interface.py
@@ -513,7 +513,7 @@ def _fail_if_data_tree_not_valid(data_tree: NestedDataDict) -> None:
     """
     assert_valid_gettsim_pytree(
         tree=data_tree,
-        leaf_checker=lambda leaf: isinstance(leaf, pd.Series | np.ndarray),
+        leaf_checker=lambda leaf: isinstance(leaf, pd.Series | np.ndarray | list),
         tree_name="data_tree",
     )
     _fail_if_pid_is_non_unique(data_tree)
185 changes: 69 additions & 116 deletions src/_gettsim_tests/_policy_test_utils.py
@@ -1,158 +1,111 @@
 from __future__ import annotations

 import datetime
-from typing import TYPE_CHECKING, Any
+from typing import TYPE_CHECKING

-import pandas as pd
+import flatten_dict
 import yaml

-from _gettsim_tests import TEST_DATA_DIR
-
-_ValueDict = dict[str, list[Any]]
+from _gettsim.shared import merge_trees

 if TYPE_CHECKING:
     from pathlib import Path

+    from _gettsim.gettsim_typing import NestedDataDict, NestedInputStructureDict

-class PolicyTestSet:
-    def __init__(self, policy_name: str, test_data: list[PolicyTestData]):
-        self.policy_name = policy_name
-        self.test_data = test_data
-
-    @property
-    def parametrize_args(self) -> list[tuple[PolicyTestData, str]]:
-        return [(test, column) for test in self.test_data for column in test.output_df]
-
-    def merged_input_df(self) -> pd.DataFrame:
-        return pd.concat([test.input_df for test in self.test_data], ignore_index=True)
-
-    def merged_output_df(self) -> pd.DataFrame:
-        return pd.concat([test.output_df for test in self.test_data], ignore_index=True)

-    def filter_test_data(
-        self, *, test_name: str | None = None, date: datetime.date | str | None = None
-    ) -> PolicyTestSet:
-        """
-        Filter the test data in this PolicyTestSet.
+class PolicyTest:
+    """A class for a single policy test."""

-        Note that you must pass all arguments of this function by name (and not by
-        position).
-
-        Parameters
-        ----------
-        test_name : str | None
-            If provided, only instances of `PolicyTestData` with this name are included
-            in the result. If None, no filtering is done on test name.
-        date : datetime.date | str | None
-            If provided, only instances of `PolicyTestData` with this date are
-            included in the result. If None, no filtering is done on date.
+    def __init__(
+        self,
+        input_tree: NestedDataDict,
+        expected_output_tree: NestedDataDict,
+        test_file: Path,
+        date: datetime.date,
+    ) -> None:
+        self.input_tree = input_tree
+        self.expected_output_tree = expected_output_tree
+        self.test_file = test_file
+        self.date = date

-        Returns
-        -------
-        PolicyTestSet
-            A new PolicyTestSet with the filtered test data.
+    @property
+    def target_structure(self) -> NestedInputStructureDict:
+        flat_target_structure = {
+            k: None for k in flatten_dict.flatten(self.expected_output_tree)
+        }
+        return flatten_dict.unflatten(flat_target_structure)

-        Examples
-        --------
-        >>> data = load_policy_test_data("soli_st")
-        >>> filtered_by_name = data.filter_test_data(test_name="hh_id_2")
+    @property
+    def test_name(self) -> str:
+        return self.test_file.stem

-        >>> filtered_by_date = data.filter_test_data(date="1991")
-        """

-        if isinstance(date, str):
-            date = _parse_date(date)
+def load_policy_test_data(policy_name: str) -> list[PolicyTest]:
+    from _gettsim_tests import TEST_DATA_DIR

-        filtered_test_data = [
-            test
-            for test in self.test_data
-            if (test_name is None or test.test_name == test_name)
-            and (date is None or test.date == date)
-        ]
+    root = TEST_DATA_DIR / policy_name

-        return PolicyTestSet(self.policy_name, filtered_test_data)
+    out = []

+    for path_of_test_file in root.glob("**/*.yaml"):
+        if _is_skipped(path_of_test_file):
+            continue

-class PolicyTestData:
-    def __init__(  # noqa: PLR0913
-        self,
-        policy_name: str,
-        test_file: Path,
-        test_name: str,
-        date: str,
-        inputs_provided: _ValueDict,
-        inputs_assumed: _ValueDict,
-        outputs: _ValueDict,
-    ):
-        self.policy_name = policy_name
-        self.test_file = test_file
-        self.test_name = test_name
-        self.date = _parse_date(date)
-        self._inputs_provided = inputs_provided
-        self._inputs_assumed = inputs_assumed
-        self._outputs = outputs
+        with path_of_test_file.open("r", encoding="utf-8") as file:
+            raw_test_data: NestedDataDict = yaml.safe_load(file)

-    @property
-    def input_df(self) -> pd.DataFrame:
-        return pd.DataFrame.from_dict(
-            {**self._inputs_provided, **self._inputs_assumed}
-        ).reset_index(drop=True)
+        out.extend(
+            _get_policy_tests_from_raw_test_data(
+                raw_test_data=raw_test_data,
+                path_of_test_file=path_of_test_file,
+            )
+        )

-    @property
-    def output_df(self) -> pd.DataFrame:
-        return pd.DataFrame.from_dict(self._outputs).reset_index(drop=True)
+    return out

-    def __repr__(self) -> str:
-        return (
-            f"PolicyTestData({self.policy_name}, {self.test_file.name}, "
-            f"{self.test_name})"
-        )

-    def __str__(self) -> str:
-        relative_path = self.test_file.relative_to(TEST_DATA_DIR)
-        backslash = "\\"
-        return f"{str(relative_path).replace(backslash, '/')}"
+def _is_skipped(test_file: Path) -> bool:
+    return "skip" in test_file.stem or "skip" in test_file.parent.name


-def load_policy_test_data(policy_name: str) -> PolicyTestSet:
-    from _gettsim_tests import TEST_DATA_DIR
+def _get_policy_tests_from_raw_test_data(
+    raw_test_data: NestedDataDict, path_of_test_file: Path
+) -> list[PolicyTest]:
+    """Get a list of PolicyTest objects from raw test data.

-    root = TEST_DATA_DIR / policy_name
-
+    Args:
+        raw_test_data: The raw test data.
+    Returns:
+        A list of PolicyTest objects.
+    """
     out = []

-    for test_file in root.glob("**/*.yaml"):
-        if _is_skipped(test_file):
-            continue

-        with test_file.open("r", encoding="utf-8") as file:
-            test_data: dict[str, dict] = yaml.safe_load(file)
+    inputs: NestedDataDict = raw_test_data.get("inputs", {})
+    input_tree: NestedDataDict = merge_trees(
+        inputs.get("provided", {}), inputs.get("assumed", {})
+    )
+    all_expected_outputs: NestedDataDict = raw_test_data.get("outputs", {})

-        date = test_file.parent.name
-        test_name = test_file.stem
+    date: datetime.date = _parse_date(path_of_test_file.parent.name)

-        inputs: dict[str, dict] = test_data["inputs"]
-        inputs_provided: _ValueDict = inputs.get("provided", {})
-        inputs_assumed: _ValueDict = inputs.get("assumed", {})
-        outputs: _ValueDict = test_data["outputs"]
+    flat_expected_outputs = flatten_dict.flatten(all_expected_outputs)

+    for target_name, test_data in flat_expected_outputs.items():
+        one_expected_output: NestedDataDict = flatten_dict.unflatten(
+            {target_name: test_data}
+        )
         out.append(
-            PolicyTestData(
-                policy_name=policy_name,
-                test_file=test_file,
-                test_name=test_name,
+            PolicyTest(
+                input_tree=input_tree,
+                expected_output_tree=one_expected_output,
+                test_file=path_of_test_file.stem,
                 date=date,
-                inputs_provided=inputs_provided,
-                inputs_assumed=inputs_assumed,
-                outputs=outputs,
             )
         )

-    return PolicyTestSet(policy_name, out)
-
-
-def _is_skipped(test_file: Path) -> bool:
-    return "skip" in test_file.stem or "skip" in test_file.parent.name
+    return out


 def _parse_date(date: str) -> datetime.date:
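Note on the hunk above: the new loader creates one `PolicyTest` per output leaf, and `target_structure` mirrors the expected output tree with `None` at every leaf. The sketch below illustrates both steps with hand-rolled `flatten` / `unflatten` helpers. These are simplified stand-ins for the `flatten_dict` calls used in the diff (tuple paths as keys), written so the snippet needs no third-party package; `target_structure` and `split_by_leaf` here are free functions for illustration, and the tree contents are made up.

```python
from typing import Any


def flatten(tree: dict[str, Any], prefix: tuple[str, ...] = ()) -> dict[tuple[str, ...], Any]:
    """Map each leaf to its tuple path, e.g. {("a", "b"): leaf}."""
    flat: dict[tuple[str, ...], Any] = {}
    for key, value in tree.items():
        path = (*prefix, key)
        if isinstance(value, dict):
            flat.update(flatten(value, path))
        else:
            flat[path] = value
    return flat


def unflatten(flat: dict[tuple[str, ...], Any]) -> dict[str, Any]:
    """Inverse of `flatten`: rebuild the nested dict from tuple paths."""
    tree: dict[str, Any] = {}
    for path, value in flat.items():
        node = tree
        for key in path[:-1]:
            node = node.setdefault(key, {})
        node[path[-1]] = value
    return tree


def target_structure(expected_output_tree: dict[str, Any]) -> dict[str, Any]:
    # Same shape as the expected outputs, but every leaf replaced by None.
    return unflatten({path: None for path in flatten(expected_output_tree)})


def split_by_leaf(all_expected_outputs: dict[str, Any]) -> list[dict[str, Any]]:
    # One single-leaf tree per target, so each target becomes its own test.
    return [unflatten({path: leaf}) for path, leaf in flatten(all_expected_outputs).items()]


# Made-up expected outputs, shaped like the "outputs" section of a test file.
expected_outputs = {
    "kindergeld": {"betrag_m": [250, 0]},
    "elterngeld": {"betrag_m": [0, 300]},
}

print(target_structure(expected_outputs))
# {'kindergeld': {'betrag_m': None}, 'elterngeld': {'betrag_m': None}}
print(split_by_leaf(expected_outputs))
# [{'kindergeld': {'betrag_m': [250, 0]}}, {'elterngeld': {'betrag_m': [0, 300]}}]
```

Splitting by leaf means a failing target shows up as its own parametrized test case instead of masking the other targets in the same file.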
51 changes: 33 additions & 18 deletions src/_gettsim_tests/test_aggregate_by_p_id.py
@@ -1,36 +1,51 @@
+from typing import TYPE_CHECKING
+
+import flatten_dict
 import pytest
 from pandas.testing import assert_series_equal

 from _gettsim.interface import compute_taxes_and_transfers
 from _gettsim_tests._helpers import cached_set_up_policy_environment
-from _gettsim_tests._policy_test_utils import PolicyTestData, load_policy_test_data
+from _gettsim_tests._policy_test_utils import PolicyTest, load_policy_test_data
+
+if TYPE_CHECKING:
+    import datetime
+
+    from _gettsim.gettsim_typing import NestedDataDict, NestedInputStructureDict

 OVERRIDE_COLS = []

-data = load_policy_test_data("aggregate_by_p_id")
+test_data = load_policy_test_data("aggregate_by_p_id")


-@pytest.mark.xfail(reason="Needs renamings PR.")
 @pytest.mark.parametrize(
-    ("test_data", "column"),
-    data.parametrize_args,
-    ids=str,
+    "test",
+    test_data,
 )
 def test_aggregate_by_p_id(
-    test_data: PolicyTestData,
-    column: str,
+    test: PolicyTest,
 ):
-    df = test_data.input_df
-    environment = cached_set_up_policy_environment(date=test_data.date)
+    date: datetime.date = test.date
+    input_tree: NestedDataDict = test.input_tree
+    expected_output_tree: NestedDataDict = test.expected_output_tree
+    target_structure: NestedInputStructureDict = test.target_structure
+
+    environment = cached_set_up_policy_environment(date=date)

     result = compute_taxes_and_transfers(
-        data=df, environment=environment, targets=column
+        data_tree=input_tree, environment=environment, targets_tree=target_structure
     )

-    assert_series_equal(
-        result[column],
-        test_data.output_df[column],
-        check_dtype=False,
-        atol=1e-1,
-        rtol=0,
-    )
+    flat_result = flatten_dict.flatten(result)
+    flat_expected_output_tree = flatten_dict.flatten(expected_output_tree)
+
+    for result_series, expected_series in zip(
+        flat_result.values(), flat_expected_output_tree.values()
+    ):
+        assert_series_equal(
+            result_series,
+            expected_series,
+            check_dtype=False,
+            atol=1e-1,
+            rtol=0,
+        )
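Note on the comparison loop above: `zip` over the `.values()` of two flattened trees pairs leaves by insertion order. With one target per `PolicyTest` that pairs a single leaf, so it works here. If a tree ever held several targets, matching by path would be more robust. A hedged sketch of that alternative follows. `assert_trees_close` is a hypothetical helper, plain lists of floats stand in for Series, and `math.isclose` with `abs_tol` only approximates what `assert_series_equal(..., atol=1e-1, rtol=0)` checks.

```python
import math
from typing import Any


def flatten(tree: dict[str, Any], prefix: tuple[str, ...] = ()) -> dict[tuple[str, ...], Any]:
    """Map each leaf to its tuple path (simplified stand-in for flatten_dict.flatten)."""
    flat: dict[tuple[str, ...], Any] = {}
    for key, value in tree.items():
        path = (*prefix, key)
        if isinstance(value, dict):
            flat.update(flatten(value, path))
        else:
            flat[path] = value
    return flat


def assert_trees_close(
    result: dict[str, Any], expected: dict[str, Any], atol: float = 1e-1
) -> None:
    """Compare leaves path by path with an absolute tolerance."""
    flat_result = flatten(result)
    flat_expected = flatten(expected)

    missing = sorted(flat_expected.keys() - flat_result.keys())
    assert not missing, f"Targets missing from result: {missing}"

    for path, expected_values in flat_expected.items():
        actual_values = flat_result[path]
        assert len(actual_values) == len(expected_values), path
        for actual, wanted in zip(actual_values, expected_values):
            assert math.isclose(actual, wanted, rel_tol=0.0, abs_tol=atol), (
                path,
                actual,
                wanted,
            )


expected = {"a": {"x": [100.0, 0.0]}, "b": {"y": [1.0, 2.0]}}
# Same leaves within tolerance, but inserted in a different order.
result = {"b": {"y": [1.0, 2.04]}, "a": {"x": [100.03, 0.0]}}

assert_trees_close(result, expected)  # passes: leaves are matched by path
```

Matching by path also reports a missing target explicitly, whereas `zip` silently stops at the shorter of the two sequences.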
17 changes: 11 additions & 6 deletions src/_gettsim_tests/test_interface.py
@@ -718,16 +718,21 @@ def test_provide_endogenous_groupings(data, functions_overridden):
         (
             {
                 "demographics": {"hh_id": pd.Series([1, "1", 2])},
-                "einkommen": {"bruttolohn_m": pd.Series(["2000", 3000, 4000])},
+                "einkommensteuer": {
+                    "einkünfte": {
+                        "aus_nichtselbstständiger_arbeit": {
+                            "bruttolohn_m": pd.Series(["2000", 3000, 4000])
+                        }
+                    }
+                },
             },
             {},
             "The data types of the following columns are invalid:\n"
             "\n - demographics__hh_id: Conversion from input type object to int failed."
-            " Object\ntype is not supported as input."
-            "\n\n- "
-            "einkommensteuer__einkünfte__aus_nichtselbstständiger_arbeit__bruttolohn_m:"
-            " Conversion from input type object to float failed."
-            "\nObject type is not supported as input.",
+            " Object\ntype is not supported as input.\n"
+            "\n- einkommensteuer__einkünfte__aus_nichtselbstständiger_arbeit__bruttolohn_m:"  # noqa: E501
+            "\nConversion from input type object to float failed. "
+            "Object type is not supported\nas input.",
         ),
     ],
 )
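Note on the expected error message above: the column names it lists are the tree paths of the offending leaves joined with a double underscore, which is why moving `bruttolohn_m` deeper into the tree lengthens the name in the message. A minimal sketch of that mapping follows. `qualified_names` is a hypothetical helper, not the function gettsim uses, and lists stand in for the `pd.Series` leaves.

```python
from typing import Any


def qualified_names(tree: dict[str, Any], prefix: tuple[str, ...] = ()) -> list[str]:
    """Join each leaf's tree path with '__' to get its qualified name."""
    names: list[str] = []
    for key, value in tree.items():
        path = (*prefix, key)
        if isinstance(value, dict):
            names.extend(qualified_names(value, path))
        else:
            names.append("__".join(path))
    return names


# Same nesting as the test case in the hunk above.
data_tree = {
    "demographics": {"hh_id": [1, "1", 2]},
    "einkommensteuer": {
        "einkünfte": {
            "aus_nichtselbstständiger_arbeit": {
                "bruttolohn_m": ["2000", 3000, 4000],
            }
        }
    },
}

for name in qualified_names(data_tree):
    print(name)
# demographics__hh_id
# einkommensteuer__einkünfte__aus_nichtselbstständiger_arbeit__bruttolohn_m
```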