1 change: 1 addition & 0 deletions CHANGES.md
Original file line number Diff line number Diff line change
@@ -6,6 +6,7 @@ releases are available on [Anaconda.org](https://anaconda.org/conda-forge/gettsi

## Unpublished

- {gh}`654` Rename `hh_id` to `vg_id` ({ghuser}`lars-reimann`).
- {gh}`590` Add allowance for child income for Kinderzuschlag.
({ghuser}`ChristianZimpelmann`).
- {gh}`624` Don't create functions for other time units if this leads to a cycle in the
12 changes: 6 additions & 6 deletions docs/geps/gep-01.md
@@ -42,7 +42,7 @@ a nutshell and without explanations, these conventions are:
Internal variables should be used sparingly.

1. If names need to be concatenated for making clear what a column name refers to (e.g.,
`arbeitsl_geld_2_vermög_freib_hh` vs. `grunds_im_alter_vermög_freib_hh`), the group
`arbeitsl_geld_2_vermög_freib_vg` vs. `grunds_im_alter_vermög_freib_vg`), the group
(i.e., the tax or transfer) that a variable refers to appears first.

1. Because of the necessity of concatenated column names, there will be conflicts
@@ -112,18 +112,18 @@ changed, even if it leads to long variable names (e.g., `kinderfreib`,
less error-prone.

If names need to be concatenated for making clear what a column name refers to (e.g.,
`arbeitsl_geld_2_vermög_freib_hh` vs. `grunds_im_alter_vermög_freib_hh`), the group
`arbeitsl_geld_2_vermög_freib_vg` vs. `grunds_im_alter_vermög_freib_vg`), the group
(i.e., the tax or transfer) that a variable refers to appears first.

If a column has a reference to a time unit (i.e., any flow variable like earnings or
transfers), the time unit is indicated by an underscore plus one of {`y`, `m`, `w`, `d`}.

The default unit a column refers to is an individual. In case a household or tax unit is
the relevant unit, an underscore plus one of {`hh`, `tu`} will indicate the level of
the relevant unit, an underscore plus one of {`vg`, `tu`} will indicate the level of
aggregation.

Time unit identifiers always appear before unit identifiers (e.g.,
`arbeitsl_geld_2_m_hh`).
`arbeitsl_geld_2_m_vg`).
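
The suffix order can be illustrated with a small parser. This is a minimal sketch, not part of GETTSIM: the helper `split_suffixes` and both constant names are hypothetical, and the approach is naive in that a base name whose last word happens to be `y`, `m`, `w`, or `d` would be misread as a time unit.

```python
# Hypothetical helper, not a GETTSIM function: splits a column name into its
# base name, time unit, and unit of aggregation according to the convention.
TIME_UNITS = {"y", "m", "w", "d"}
AGGREGATION_UNITS = {"vg", "tu"}


def split_suffixes(column_name):
    """Return (base_name, time_unit, aggregation_unit); missing parts are None."""
    parts = column_name.split("_")
    time_unit = None
    aggregation_unit = None
    # The unit of aggregation, if present, is always the last component.
    if len(parts) > 1 and parts[-1] in AGGREGATION_UNITS:
        aggregation_unit = parts.pop()
    # The time unit, if present, comes directly before the aggregation unit.
    if len(parts) > 1 and parts[-1] in TIME_UNITS:
        time_unit = parts.pop()
    return "_".join(parts), time_unit, aggregation_unit


result = split_suffixes("arbeitsl_geld_2_m_vg")
```

Here `result` separates the group (`arbeitsl_geld_2`) from the monthly time unit and the aggregation unit.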

## Parameters of the taxes and transfers system

@@ -136,7 +136,7 @@ general naming considerations here.
- Parameter names should generally be aligned with relevant column names. However,
since the group is not repeated for the parameter, it is often better not to
abbreviate them (e.g., `wohngeld_params["vermögensgrundfreibetrag"]` for the parameter
and `wohngeld_nach_vermög_check_m_hh` for a column derived from it).
and `wohngeld_nach_vermög_check_m_vg` for a column derived from it).

## Other Python identifiers (Functions, Variables)

@@ -150,7 +150,7 @@ A function that is used in many different places should have a descriptive name.
The name of variables should reflect the content or meaning of the variable and not the
type (i.e., int, dict, list, df, array ...). As for column names and parameters, in some
cases it might be useful to append an underscore plus one of {`m`, `w`, `d`} to indicate
the time unit and one of {`hh`, `tu`} to indicate the unit of aggregation.
the time unit and one of {`vg`, `tu`} to indicate the unit of aggregation.

## Examples

10 changes: 5 additions & 5 deletions docs/geps/gep-02.md
@@ -26,7 +26,7 @@ in the data provided by the user (if it comes in the form of a DataFrame) or cal
by GETTSIM. All these arrays have the same length. This length corresponds to the number
of individuals. Functions operate on a single row of data.

If a column name is `[x]_id` with `x` {math}`\in \{` `hh`, `tu` {math}`\}`, it will be
If a column name is `[x]_id` with `x` {math}`\in \{` `vg`, `tu` {math}`\}`, it will be
the same for all individuals within a household, tax unit, or any other grouping of
individuals specified in {ref}`GEP 1 <gep-1-column-names>`.

@@ -113,7 +113,7 @@ that case, only the relevant steps apply.
### Grouped values and aggregation functions

Often columns refer to groups of individuals. Such columns have a suffix indicating the
group (see {ref}`GEP 1 <gep-1-column-names>`, currently `_hh` or `_tu`). These columns'
group (see {ref}`GEP 1 <gep-1-column-names>`, currently `_vg` or `_tu`). These columns'
values will be repeated for all individuals who form part of a group.
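
What "repeated for all individuals who form part of a group" means can be sketched as a check. This is an illustration only, assuming plain Python lists; `is_constant_within_groups` is a hypothetical name and not GETTSIM's actual consistency check, and the rent values are made up.

```python
# Hypothetical helper, not GETTSIM's implementation: verifies that a
# group-level column carries one value per group, repeated for each member.
def is_constant_within_groups(group_ids, values):
    """Return True if all individuals in the same group share the same value."""
    first_seen = {}
    for group_id, value in zip(group_ids, values):
        # Remember the first value per group and compare later rows to it.
        if first_seen.setdefault(group_id, value) != value:
            return False
    return True


vg_id = [0, 0, 1]
bruttokaltmiete_m_vg = [500.0, 500.0, 320.0]  # made-up values
consistent = is_constant_within_groups(vg_id, bruttokaltmiete_m_vg)
```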

By default, GETTSIM will check consistency on input columns in this respect. Users will
@@ -129,15 +129,15 @@ Aggregation functions will be provided by GETTSIM.
- As outlined in {ref}`GEP 4 <gep-4-aggregation-functions>` users will need to specify:

- The stringified name of the aggregated variable. This **must** end with a feasible
unit of aggregation, i.e., `_hh` or `_tu`
unit of aggregation, i.e., `_vg` or `_tu`
- The stringified name of the original variable.
- The type of aggregation {math}`\in \{` `sum`, `mean`, `max`, `min`, `any` {math}`\}`

Note that as per {ref}`GEP 4 <gep-4-aggregation-functions>`, sums will be calculated
implicitly if the graph contains a column `my_col` and an aggregate such as
`my_col_hh` is requested somewhere.
`my_col_vg` is requested somewhere.
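
The three pieces of information listed above can be sketched as a small validator. This is a hedged illustration, not GETTSIM code: `validate_aggregation_spec` and the constant names are hypothetical, and the returned dictionary layout (`source_col`, `aggr`) is an assumption borrowed from the examples in GEP 4.

```python
# Hypothetical validator, not part of GETTSIM.
FEASIBLE_SUFFIXES = ("_vg", "_tu")
AGGREGATION_TYPES = {"sum", "mean", "max", "min", "any"}


def validate_aggregation_spec(target, source, aggr):
    """Check the rules stated above and return the spec as a dictionary."""
    # The aggregated variable must end with a feasible unit of aggregation.
    if not target.endswith(FEASIBLE_SUFFIXES):
        raise ValueError(f"{target!r} must end with one of {FEASIBLE_SUFFIXES}")
    # Only the listed aggregation types are accepted in this sketch.
    if aggr not in AGGREGATION_TYPES:
        raise ValueError(f"{aggr!r} is not one of {sorted(AGGREGATION_TYPES)}")
    return {target: {"source_col": source, "aggr": aggr}}


spec = validate_aggregation_spec("anz_erwachsene_tu", "erwachsen", "sum")
```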

Note that the groups `tu` and `hh` may change in the future. Some might also be
Note that the groups `tu` and `vg` may change in the future. Some might also be
calculated via relations between household members, see
[discussion](https://gettsim.zulipchat.com/#narrow/stream/224837-High-Level-Architecture/topic/Update.20Data.20Structures/near/180917151)
on Zulip in this respect.
30 changes: 15 additions & 15 deletions docs/geps/gep-04.md
@@ -225,11 +225,11 @@ For example, in `demographic_vars.py`, we could have:
```
aggregation_demographic_vars = {
"anz_erwachsene_tu": {"source_col": "erwachsen", "aggr": "sum"},
"haushaltsgröße_hh": {"aggr": "count"},
"haushaltsgröße_vg": {"aggr": "count"},
}
```

The group identifier (`tu_id`, `hh_id`) will be automatically included as an argument;
The group identifier (`tu_id`, `vg_id`) will be automatically included as an argument;
for `count` no other variable is necessary.

The output type will be the same as the input type. Exceptions:
@@ -241,35 +241,35 @@ The output type will be the same as the input type. Exceptions:

The most common operations are sums of individual measures. GETTSIM adds the following
syntactic sugar: In case an individual-level column `my_col` exists, the graph will be
augmented with a node including a group sum like `my_col_hh` should that be requested.
augmented with a node including a group sum like `my_col_vg` should that be requested.
Requests can be either inputs in a downstream function or explicit targets of the
calculation.

Automatic summation will only happen in case no column `my_col_hh` is explicitly set.
Automatic summation will only happen in case no column `my_col_vg` is explicitly set.
Using a different reduction function than the sum is as easy as explicitly specifying
`my_col_hh`.
`my_col_vg`.
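
The rule described above (an explicitly set column wins, otherwise the group sum is derived) can be sketched in plain Python. This is a simplified illustration, not GETTSIM's graph machinery: `group_sum` and `resolve_group_column` are hypothetical names, columns are modelled as lists in a dictionary, and the numbers are made up.

```python
from collections import defaultdict


def group_sum(values, group_ids):
    """Sum `values` within each group and repeat the total for every member."""
    totals = defaultdict(float)
    for group_id, value in zip(group_ids, values):
        totals[group_id] += value
    return [totals[group_id] for group_id in group_ids]


def resolve_group_column(name, columns, group_ids, suffix="_vg"):
    """Return an explicitly set column if present, else the implicit group sum."""
    if name in columns:
        return columns[name]  # an explicit definition takes precedence
    source_name = name[: -len(suffix)]  # e.g. "my_col_vg" -> "my_col"
    return group_sum(columns[source_name], group_ids)


vg_id = [0, 0, 1]
columns = {"my_col": [1.0, 2.0, 5.0]}  # made-up individual-level values
my_col_vg = resolve_group_column("my_col_vg", columns, vg_id)
```

The result has one entry per individual, with the group total repeated for each member of the group.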

Consider the following example: the function `kindergeld_m` calculates the
individual-level child benefit payment. `arbeitsl_geld_2_m_hh` calculates
individual-level child benefit payment. `arbeitsl_geld_2_m_vg` calculates
Arbeitslosengeld 2 on the household level (as indicated by the suffix). One necessary
input of this function is the sum of all child benefits on the household level. There is
no function or input column `kindergeld_m_hh`.
no function or input column `kindergeld_m_vg`.

By including `kindergeld_m_hh` as an argument in the definition of
`arbeitsl_geld_2_m_hh` as follows:
By including `kindergeld_m_vg` as an argument in the definition of
`arbeitsl_geld_2_m_vg` as follows:

```python
def arbeitsl_geld_2_m_hh(kindergeld_m_hh, other_arguments):
def arbeitsl_geld_2_m_vg(kindergeld_m_vg, other_arguments):
...
```

a node `kindergeld_m_hh` containing the household-level sum of `kindergeld_m` will be
a node `kindergeld_m_vg` containing the household-level sum of `kindergeld_m` will be
automatically added to the graph. Its parents in the graph will be `kindergeld_m` and
`hh_id`. This is the same as specifying:
`vg_id`. This is the same as specifying:

```
aggregation_kindergeld = {
"kindergeld_m_hh": {
"kindergeld_m_vg": {
"source_col": "kindergeld_m",
"aggr": "sum"
}
@@ -287,8 +287,8 @@ suffix), months `_m`, weeks `_w`, and days `_d`).
In case a column with annual values `[column]` exists, the graph will be augmented with
a node including monthly values like `[column]_m` should that be requested. Requests can
be either inputs in a downstream function or explicit targets of the calculation. In
case the column refers to a different level of aggregation, say `[column]_hh`, the same
applies to `[column]_m_hh`.
case the column refers to a different level of aggregation, say `[column]_vg`, the same
applies to `[column]_m_vg`.

Automatic summation will only happen in case no column `[column]_m` is explicitly set.
Using a different conversion function than the sum is as easy as explicitly specifying
42 changes: 21 additions & 21 deletions docs/gettsim_objects/input_variables.md
@@ -3,12 +3,12 @@
# Basic input variables

The table below gives an overview of all variables needed to run GETTSIM completely.
Note that the variables ending in \_hh have to be constant over the whole
Note that the variables ending in \_vg have to be constant over the whole
household.

(hh_id)=
(vg_id)=

## `hh_id`
## `vg_id`

Household identifier

@@ -114,7 +114,7 @@ Type: bool

## `hat_kinder`

Dummy: Has kids (incl. not in hh)
Dummy: Has kids (incl. not in vg)

Type: bool

@@ -159,30 +159,30 @@ Monthly capital income

Type: float

(bruttokaltmiete_m_hh)=
(bruttokaltmiete_m_vg)=

## `bruttokaltmiete_m_hh`
## `bruttokaltmiete_m_vg`

Monthly rent expenses for household

Type: float

(heizkosten_m_hh)=
(heizkosten_m_vg)=

## `heizkosten_m_hh`
## `heizkosten_m_vg`

Monthly heating expenses for household

Type: float

- - `wohnfläche_hh`
- - `wohnfläche_vg`
- Size of household dwelling in square meters

Type: float

(bewohnt_eigentum_hh)=
(bewohnt_eigentum_vg)=

## `bewohnt_eigentum_hh`
## `bewohnt_eigentum_vg`

Dummy: Owner-occupied housing

@@ -248,21 +248,21 @@ Type: int

## `m_elterngeld`

Number of months hh received elterngeld
Number of months vg received elterngeld

Type: int

(m_elterngeld_vat_hh)=
(m_elterngeld_vat_vg)=

## `m_elterngeld_vat_hh`
## `m_elterngeld_vat_vg`

Number of months father received elterngeld

Type: int

(m_elterngeld_mut_hh)=
(m_elterngeld_mut_vg)=

## `m_elterngeld_mut_hh`
## `m_elterngeld_mut_vg`

Number of months mother received elterngeld

@@ -292,17 +292,17 @@ Level of rents in city (1: low, 3: average)

Type: int

(immobilie_baujahr_hh)=
(immobilie_baujahr_vg)=

## `immobilie_baujahr_hh`
## `immobilie_baujahr_vg`

Construction year of dwelling

Type: int

(vermögen_bedürft_hh)=
(vermögen_bedürft_vg)=

## `vermögen_bedürft_hh`
## `vermögen_bedürft_vg`

Assets for means testing of
household. {ref}`See this page for more details. <means_testing>`
@@ -543,6 +543,6 @@ Type: int

## `anz_eig_kind_bis_24`

Number of own children below the age of 25 (incl. not in hh)
Number of own children below the age of 25 (incl. not in vg)

Type: int
2 changes: 1 addition & 1 deletion docs/gettsim_objects/means_testing.md
@@ -10,7 +10,7 @@ This documentation shall help to understand the composition of the

{ref}`basic input variable <input_variables>`

'vermögen_bedürft_hh'. Despite small differences over the transfers, we decided, for
'vermögen_bedürft_vg'. Despite small differences over the transfers, we decided, for
now, to require only one wealth variable as input and use it for all transfers.

```{note}
8 changes: 4 additions & 4 deletions docs/gettsim_objects/variables_out.md
@@ -34,14 +34,14 @@ You can find their individual calculation in the documentation of all {ref}`func
- Solidarity surcharge on withholding tax
* - {func}`unterhaltsvors_m <_gettsim.functions.unterhaltsvors_m>`
- Alimony advance payment
* - {func}`arbeitsl_geld_2_m_hh <_gettsim.functions.arbeitsl_geld_2_m_hh>`
* - {func}`arbeitsl_geld_2_m_vg <_gettsim.functions.arbeitsl_geld_2_m_vg>`
- Monthly subsistence payment on household level
* - {func}`kinderzuschl_m_hh <_gettsim.functions.kinderzuschl_m_hh>`
* - {func}`kinderzuschl_m_vg <_gettsim.functions.kinderzuschl_m_vg>`
- Monthly additional child benefit, household sum
* - {func}`elterngeld_m <_gettsim.functions.elterngeld_m>`
- Monthly parental leave benefit
* - {func}`wohngeld_m_hh <_gettsim.functions.wohngeld_m_hh>`
* - {func}`wohngeld_m_vg <_gettsim.functions.wohngeld_m_vg>`
- Monthly housing benefit on household level
* - {func}`grunds_im_alter_m_hh <_gettsim.functions.grunds_im_alter_m_hh>`
* - {func}`grunds_im_alter_m_vg <_gettsim.functions.grunds_im_alter_m_vg>`
- Monthly subsistence payment for retirees on household level
```