Adjust calculation of PRICE_EMISSION #726

Open · wants to merge 25 commits into base: main
Conversation

OFR-IIASA
Contributor

This PR revises the calculation of the PRICE_EMISSION to correctly account for the period-duration when deriving the period-specific PRICE_EMISSION from a single marginal value which covers multiple periods (cumulative constraint).
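A minimal numeric sketch of the reporting problem the PR addresses. This is not the PR's actual GAMS code: the discount rate `dr` and the Hotelling-style growth are illustrative assumptions, and the sketch shows only the discounting part of the conversion, not the full period-duration accounting.

```python
import numpy as np

def per_period_price(cum_marginal: float, years, dr: float = 0.05):
    """Spread the single marginal of a cumulative constraint over periods.

    The dual value of a cumulative emission constraint is one
    present-value number; a per-period PRICE_EMISSION derived from it
    grows at the discount rate (Hotelling-like behaviour), so each
    period's price depends on its position in the model horizon.
    """
    years = np.asarray(years, dtype=float)
    return cum_marginal * (1.0 + dr) ** (years - years[0])

prices = per_period_price(1.0, [2020, 2030, 2040])
```

With a 5 % discount rate, the reported price roughly doubles over 20 years even though the underlying marginal is a single number; ignoring period positions and durations distorts exactly this profile.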

How to review

  • Read the diff and note that the CI checks all pass.
  • Build the documentation and look at the revised documentation section.
  • Ensure that changes/additions are self-documenting, i.e. that another
    developer (someone like the reviewer) will be able to understand what the code
    does in the future.

PR checklist

  • Continuous integration checks all ✅
  • Add or expand tests; coverage checks both ✅
  • Add, expand, or update documentation.
  • Update release notes.

@OFR-IIASA OFR-IIASA self-assigned this Jun 30, 2023
@OFR-IIASA OFR-IIASA added the bug Doesn't work as advertised/unintended effects label Jun 30, 2023
@OFR-IIASA OFR-IIASA added this to the 3.8 milestone Jun 30, 2023
@gidden
Member

gidden commented Dec 5, 2023

Note that there are 3 relevant failing tests (in test_feature_price_emission.py), but also 2 failing tests not related to this PR (in tests/test_report.py and tests/test_tutorials.py).

@glatterf42
Member

I'm not so sure about that.
For test_report.py, this is the error message:

         # message_ix.Reporter can also be initialized
        rep = Reporter.from_scenario(scen)
    
        # Number of quantities available in a rudimentary MESSAGEix Scenario
>       assert 268 == len(rep.graph["all"])
E       assert 268 == 272
E        +  where 272 = len([<ACT:nl-t-yv-ya-m-h>, <ACTIVITY_BOUND_ALL_MODES_LO-margin:n-t-y-h>, <ACTIVITY_BOUND_ALL_MODES_LO:n-t-y-h>, <ACTIVITY_BOUND_ALL_MODES_UP-margin:n-t-y-h>, <ACTIVITY_BOUND_ALL_MODES_UP:n-t-y-h>, <ACTIVITY_BOUND_LO-margin:>, ...])

Here, we have a hardcoded number of quantities we expect in a MESSAGEix Scenario. However, this PR changes the number of quantities by adding these lines to message_ix/models.py:

par("df_year", "y")
par("df_period", "y")
par("levelized_cost", "n t y h")
var(
    "PRICE_EMISSION_NEW",
    "n type_emission type_tec y",
    "TEMPORARY test for Emission price fix",
)

Now, I can't see the whole output of what len() is measuring, but everything that appears there are quantities defined in message_ix/models.py sorted alphabetically. And since our expected number is also off by exactly four, my guess is this error is related.
To fix it, we could simply raise the expected number to 272. However, we probably don't want to keep PRICE_EMISSION_NEW around; it should ideally be merged back into PRICE_EMISSION during the clean-up work that this PR requires. I don't know about the other quantities, but if they remain, the solution here is indeed to just increase the expected number of quantities :)
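The off-by-four arithmetic can be written down as a trivial check, using the item names quoted above:

```python
# The PR adds three parameters and one variable to message_ix/models.py;
# each item contributes one entry to the Reporter graph's "all" collection,
# which explains the shift in the hardcoded expected count.
expected_before = 268
added_items = ["df_year", "df_period", "levelized_cost", "PRICE_EMISSION_NEW"]
expected_after = expected_before + len(added_items)
```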

For test_tutorials.py, I'm not entirely sure what the fix is. Let's start with the complete traceback again:

nb_path = PosixPath('/home/runner/work/message_ix/message_ix/tutorial/westeros/westeros_report.ipynb')
checks = [('len-rep-graph', 13724)]
tmp_path = PosixPath('/tmp/pytest-of-runner/pytest-0/popen-gw1/test_tutorial_westeros_report_5')
tmp_env = environ({'_R_CHECK_SYSTEM_CLOCK_': 'FALSE', 'SELENIUM_JAR_PATH': '/usr/share/java/selenium-server.jar', 'CONDA': '/usr...FRAME_EVAL': 'NO', 'PYTEST_CURRENT_TEST': 'message_ix/tests/test_tutorials.py::test_tutorial[westeros_report] (call)'})

    @pytest.mark.parametrize("nb_path,checks", TUTORIALS, ids=IDS, indirect=["nb_path"])
    def test_tutorial(nb_path, checks, tmp_path, tmp_env):
        """Test tutorial in the IPython or IR notebook at `nb_path`.
    
        If `checks` are given, values in the specified cells are tested.
        """
        if nb_path.name == "westeros_baseline_using_xlsx_import_part2.ipynb":
            # Copy data files used by this tutorial to `tmp_path`
            for name in DATA_FILES:
                copyfile(nb_path.parent / name, tmp_path / name)
    
        # Determine arguments for run_notebook()
        args = default_args()
        if nb_path.name.startswith("R_"):
            args.update(kernel_name="IR")
    
        # The notebook can be run without errors
        nb, errors = run_notebook(nb_path, tmp_path, tmp_env, **args)
        assert errors == []
    
        # Cell(s) identified by name or index have a particular value
        for cell, value in checks:
>           assert np.isclose(get_cell_output(nb, cell), value)
E           assert False
E            +  where False = <function isclose at 0x7f36903354c0>(13775, 13724)
E            +    where <function isclose at 0x7f36903354c0> = np.isclose
E            +    and   13775 = get_cell_output({'cells': [...], 'nbformat': 4, 'nbformat_minor': 2}, 'len-rep-graph')

As we can see, the error again relates to reporting: another hardcoded expected number is compared to an actual number. Looking at our test setup, we only check the result of a single cell in this notebook; the cell len-rep-graph should be equal to 13724. I'm not sure how cell names are created, but looking at the tutorial in question, my guess is it's this one:

len(rep.graph)

So this seems related, too. My suggestion for this is very similar to the error above: clean up this PR and see what value comes out (e.g. by quickly running the tutorial), then use that to update the expected value.
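For what it's worth, np.isclose with its default tolerances is effectively an exact comparison for integer counts of this size, so the check only passes once the expected value matches the re-run notebook exactly:

```python
import numpy as np

# Default tolerances are rtol=1e-05, atol=1e-08, i.e. an allowed gap of
# roughly 0.14 at this magnitude -- far tighter than the 51-count gap
# between 13775 and 13724 seen in the traceback above.
assert not np.isclose(13775, 13724)
assert np.isclose(13775, 13775)
```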

@glatterf42
Member

This will be postponed together with the originating issue since we don't have enough time for a proper cleanup right now.

@glatterf42 glatterf42 modified the milestones: 3.8, 3.9 Dec 13, 2023
@glatterf42
Member

As discussed in the weekly meeting on February 15, @behnam-zakeri will continue this PR. Please let me know if I can assist with anything here.

@behnam-zakeri
Contributor

behnam-zakeri commented Mar 28, 2024

@glatterf42 thanks for following up on this in today's meeting. I went through the tests, and they are in fact not using the Westeros tutorials but building new models for testing this.

I started to test the changes in this PR using Westeros tutorials. I followed the logic below:

  • Clone a "baseline" (scenario I) to a new "emission bound" scenario (scenario II) and add a binding emission bound to it.
  • Solve scenario II and calculate emission prices (here PRICE_EMISSION_NEW).
  • Add the same prices as tax_emission to a clone of "baseline" and call it "emission tax" scenario (scenario III).
    Check the following:
    a. Emission prices are equal for scenario II and scenario III (looking into PRICE_EMISSION_NEW)
    b. Emissions are equal for scenario II and scenario III (looking into EMISS)
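Checks (a) and (b) amount to aligning two EMISS (or PRICE_EMISSION_NEW) tables on their index columns and comparing levels. A sketch with made-up data (the frames stand in for scen.var("EMISS") from scenarios II and III; this is not message_ix code, and the helper name is hypothetical):

```python
import numpy as np
import pandas as pd

def emiss_equal(a: pd.DataFrame, b: pd.DataFrame, rtol: float = 1e-6) -> bool:
    """Align two EMISS-style tables on their index columns and compare levels."""
    keys = ["node", "emission", "type_tec", "year"]
    merged = a.merge(b, on=keys, suffixes=("_ii", "_iii"))
    return bool(np.allclose(merged["lvl_ii"], merged["lvl_iii"], rtol=rtol))

# Hypothetical output frames for scenarios II and III
emiss_ii = pd.DataFrame(
    {"node": ["World"] * 2, "emission": ["CO2"] * 2, "type_tec": ["all"] * 2,
     "year": [2020, 2030], "lvl": [85.0, 25.0]}
)
emiss_iii = emiss_ii.assign(lvl=[85.0, 40.0])  # same bound years, different levels
```

A tight relative tolerance matters here: as noted later in this thread, an overly loose tolerance is exactly what hid the original formulation error.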

I ran this setup for two types of emission bounds, cumulative and yearly, using Westeros tutorials. My observation is that:

  • In both cases, check (a) passes.
  • In both cases, check (b) does not pass.

So, my first reaction is that:

  1. Is the above logic for testing this PR correct?
  2. Why shouldn't we use Westeros tutorials for testing this? (In other words, what changes are brought with the new test that the tutorials cannot replicate?)

Maybe @OFR-IIASA and @gidden can help here.

@gidden
Member

gidden commented Mar 28, 2024

Hey @behnam-zakeri - thanks so much for looking into this so quickly. I have a comment and two questions:

  1. In principle there's no reason not to test with the tutorials, in fact I think it's a great idea.
  2. Can you check how close the objective functions are between the two runs? They should be within the solver's solution tolerance.
  3. Is the emission bound cumulative or year-on-year? If cumulative, I'd suggest trying year-on-year. In principle we may have an issue with the Hotelling calculations for this.

@OFR-IIASA
Contributor Author

Dear Behnam,

As there were existing tests, the idea was to leave those in place and extend the test examples covered, while also making sure that the results are the SAME as opposed to just "close". The test for "duality" was "close" before, but the allowed tolerance was too large, hence the error in the formulation wasn't caught.
Discussing with Volker at the time when changing the formulation for deriving the price, the idea behind the test was to set up a simple example with a single technology that has emissions, plus add-on technologies which would then allow for emission reduction. In the end, I never really got to completing the work, hence the tests don't pass, and this was the main task left to do in the PR, i.e. to develop a test case that works.

As you rightly point out, there are a few different cases to test, but I have made a few more adjustments below.

TEST 1.: Equal length time periods // CUMULATIVE bound
-> Clone a "baseline" (scenario I) to a new "emission bound" scenario (scenario II) and add a binding CUMULATIVE emission bound to it.
-> Solve scenario II and calculate emission prices (here PRICE_EMISSION_NEW).
-> Check: that the calculated emission price matches a manual calculation of the price based on the EMISSION_EQUIVALENCE

TEST 2.: Unequal length time periods // CUMULATIVE bound
-> Same as "TEST 1", but with changes in the period duration, so that there are 5 and 10 year timesteps.

TEST 3.: Equal length time periods // PERIOD bound
-> Same as "TEST 1", but with period specific bound

TEST 4.: Unequal length time periods // PERIOD bound
-> Same as "TEST 2", but with period specific bound

TEST 5.: Test price duality
-> Scenario A.) Apply different CUMULATIVE bounds to a scenario with differing timeperiod lengths.
-> Scenario B.) Apply the resulting PRICE_EMISSION from scenario A as tax_emission; check:

  • ensure that EMISS are equal for scenarios A and B
  • ensure that tax_emission and PRICE_EMISSION_NEW are equal in scenario B

-> Scenario C.) Apply the emission trajectory (EMISS) from scenario A as a period specific bound_emission; check:

  • ensure that EMISS are equal for scenarios A and C
  • ensure that tax_emission and PRICE_EMISSION_NEW are equal in scenario C

In the end, we need to ensure that the carbon price from a cumulative constraint provides the same emission trajectory per period when applied as tax_emission (Test 5, scenario B), and that the emission trajectory per period from the cumulative-constraint scenario provides the same carbon price as the cumulative constraint itself (Test 5, scenario C).

@behnam-zakeri
Contributor

@gidden thanks for the reflection.
As @OFR-IIASA mentioned, it seems the Westeros tutorials are not sufficient on their own, even though they can be part of the test, because they do not exhibit changes in period length, e.g. from 5-year to 10-year periods. My tests using the tutorials covered both year-by-year and cumulative bounds.

@behnam-zakeri
Contributor

@OFR-IIASA thanks for the updates and for confirming the logic for the tests. I see your point: if we need varying model periods, we need to develop a new test model.

@glatterf42
Member

glatterf42 commented Apr 11, 2024

Thanks for continuing the work here, @behnam-zakeri! Any further improvements to reach our goal are much appreciated :)
Just wanted to drop by to say: don't worry about the code quality check, I can easily fix that afterwards :)

@behnam-zakeri
Contributor

behnam-zakeri commented Apr 11, 2024

@glatterf42 I made the following changes, and now this PR is ready for reviewing/improving the tests. The existing tests are mostly good enough, but a few of them, i.e. test_price_duality, don't pass. Here, @OFR-IIASA can confirm that the changes made in this PR in the GAMS formulation work on the global model, but not on the tests developed here, right?

The changes made today are:

  • The GAMS formulation is cleaned up from new variable names and extraneous equations.
  • model.py is cleaned accordingly.
  • test_feature_price_emission.py is cleaned from hard-coded data and temporary text info.
  • "Addon" technologies are disabled in the test, as it seems they don't do anything specific but make the test a bit complicated. The tests behave the same as before without "Addon".
  • Important: For test_price_duality, model periods with equal length are added, and the checks don't pass for them too. So, there is an issue regardless of the model periods being equal or varying between 5 and 10 years.

@glatterf42
Member

Still working on getting test case 5 to work. I've tried using make_westeros, but that has its model years ([700, 710, 720]) hardcoded, so it doesn't work out of the box.
Instead, I've adapted the model_setup from what @OFR-IIASA started and the tests are starting to run with what look like reasonable results.
In order to troubleshoot the tests, I included tests/feature_price_emission.ipynb, which uses a new local DB instance to avoid cluttering existing ones. Run something similar to ixmp platform add test_feature_price_emission jdbc hsqldb /home/fridolin/.local/share/ixmp/localdb/test_feature_price_emission to use the same name or adapt it as you see fit.
When I'm running this notebook (which for now just includes evenly-spaced years), I find that whatever variable we're fixing works fine, while the other doesn't. So in 5.1, we fix the tax_emission based on PRICE_EMISSION and this is registered correctly, but the EMISS trajectory is off. In 5.2, we apply a period-specific bound_emission, which leads to the same trajectory, but different prices.
In 5.2, the ACT of the technologies seems to be different, while in 5.1, only the marginals of ACT seem to differ.

This might speak to a problem in the conversion between emission bounds and emission prices.

If anyone has any idea how to troubleshoot this further or even what specifically might cause this problem, I'm happy to take suggestions :)

@glatterf42
Member

I'm sad to have to postpone this (and #723) beyond 3.9, but it doesn't look like this PR will be ready still this week. We could consider releasing 3.9.1 if this gets merged not too long after our 3.9 release.

@glatterf42
Member

@behnam-zakeri suggested earlier today that it might be useful to take the price_emission notebook temporarily included here and run it on the main branch to see if it works there.

@glatterf42
Member

I'm running various scenarios locally, trying out different combinations of technologies, formulations of PRICE_EMISSION, etc. So far, no dice in getting the test to run.

Today, I noticed the following: @OFR-IIASA's description for TEST 5 is very similar to what we're doing in the emissions_taxes tutorial. There, when comparing PRICE_EMISSION between A and C, @behnam-zakeri has essentially written: "Comparing the emission prices between the two scenarios at this stage, we see that the values are not identical. The reason is that when we introduce emission bounds per year, the price of emission in each year reflects the cost occurring when reducing one more unit of emission in that year. However, in the scenario with a cumulative bound over the entire model horizon, the price of emission reflects the cost of the system in reducing one more unit of emission over the entire model horizon."
So this is making me wonder: why do we expect the new test to work now? The calculation of PRICE_EMISSION changed, but our other tests still pass, so it is still following the same logic. So why should the two kinds of prices converge now? Is there any setting where they do converge?

However, I asked the same in the MESSAGE meeting yesterday, and the consensus was that the test should run if the model setup is right, i.e. if there are enough gradual options to shift from high- to low-emission technologies, and the model is not just using one technology until it can't and then switching entirely to another.
With that in mind, I have adapted the test (and debugging-notebook) setup to follow the values for technologies we use in the Westeros tutorial more closely. test_price_duality still doesn't work with them, but I noticed that the numbers reported as EMISS seemed off, especially in scenario B (where we only fix the tax_emission). So I wondered why we removed the rescaling of EMISSION_CONSTRAINT in this PR, and I reintroduced it. The EMISS in e.g. scenario A changed after that, but everything in scenario B stayed exactly the same, so still far too much emission for the bound I added. For comparison:

cumulative_bound = 40

> print(scenario_cumulative_bound.var("EMISS"))
     node emission type_tec  year   lvl  mrg
0   World      CO2      all  2020  85.0  0.0
1   World      CO2      all  2025  85.0  0.0
2   World      CO2      all  2030  85.0  0.0
3   World      CO2      all  2040   0.0  0.0
4   World      CO2      all  2045   0.0  0.0
5   World      CO2      all  2050  25.0  0.0
6    node      CO2      all  2020  85.0  0.0
7    node      CO2      all  2025  85.0  0.0
8    node      CO2      all  2030  85.0  0.0
9    node      CO2      all  2040   0.0  0.0
10   node      CO2      all  2045   0.0  0.0
11   node      CO2      all  2050  25.0  0.0

> print(scen_tax.var("EMISS"))
     node emission type_tec  year    lvl  mrg
0   World      CO2      all  2020  500.0  0.0
1   World      CO2      all  2025  500.0  0.0
2   World      CO2      all  2030  500.0  0.0
3   World      CO2      all  2040    0.0  0.0
4   World      CO2      all  2045    0.0  0.0
5   World      CO2      all  2050   25.0  0.0
6    node      CO2      all  2020  500.0  0.0
7    node      CO2      all  2025  500.0  0.0
8    node      CO2      all  2030  500.0  0.0
9    node      CO2      all  2040    0.0  0.0
10   node      CO2      all  2045    0.0  0.0
11   node      CO2      all  2050   25.0  0.0

This now makes me think: our current way of calculating PRICE_EMISSION is not at all affected by EMISSION_CONSTRAINT. And conversely, EMISSION_CONSTRAINT does not depend on PRICE_EMISSION, either. So in a scenario where EMISSION_CONSTRAINT is the driving force behind EMISS, we would not expect EMISS to respond to PRICE_EMISSION, or would we?
However, if I understand correctly, EMISS is defined as the product of emission_factor and ACT (in the absence of land-use values), and is subject to just this constraint, EMISSION_CONSTRAINT. So it might be possible to fine-tune a scenario such that the bounds on ACT are stringent enough to overrule EMISSION_CONSTRAINT as the driving force behind EMISS, but that seems beside the point: we want to show that setting a bound and setting an emission trajectory (that correspond to each other) are equivalent.
Thus, I'm also wondering: why did we switch away from EMISSION_CONSTRAINT as the basis for the PRICE_EMISSION calculation, and should we maybe switch back, to couple both things again?
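As an aside, the dependency structure described above (EMISS as emission_factor-weighted ACT, capped by EMISSION_CONSTRAINT) can be made concrete with a toy computation (illustrative numbers only, not model output):

```python
import numpy as np

# EMISS is the emission_factor-weighted activity (absent land use), and
# EMISSION_CONSTRAINT caps that sum; PRICE_EMISSION appears in neither
# expression, which is why EMISS need not respond to the price in a
# scenario where the bound is the binding driver.
emission_factor = np.array([1.0, 0.5])  # per-technology factors
act = np.array([60.0, 50.0])            # ACT levels
emiss = float(emission_factor @ act)
bound = 85.0
feasible = emiss <= bound
```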

Linked issue: Deviation between PRICE_EMISSION calculation and tax_emission use in OBJECTIVE-function