Upstream_output.definition_metadata doesn't contain the metadata since 1.7.11 for observable Source Assets [REGRESSION] #22789

Open
ion-elgreco opened this issue Jul 1, 2024 · 8 comments · May be fixed by #22862
Labels
area: asset Related to Software-Defined Assets area: metadata Related to metadata type: bug Something isn't working

Comments

@ion-elgreco
Contributor

ion-elgreco commented Jul 1, 2024

Dagster version

1.7.11

What's the issue?

Since v1.7.11, InputContext.upstream_output.definition_metadata no longer contains the input asset's metadata. This is a serious problem for us because we have IO managers that rely on that asset metadata.

With v1.7.11: [screenshot]
With v1.7.10: [screenshot]

What did you expect to happen?

No response

How to reproduce?

No response

Deployment type

None

Deployment details

No response

Additional information

No response

@ion-elgreco ion-elgreco added the type: bug Something isn't working label Jul 1, 2024
@ion-elgreco ion-elgreco changed the title Upstream_output.definition_metadata doesn't contain the metadata since 1.7.11 [REGRESSION] Upstream_output.definition_metadata doesn't contain the metadata since 1.7.11 for Source Assets [REGRESSION] Jul 1, 2024
@ion-elgreco
Contributor Author

@garethbrickman Can someone look into this? It breaks all IO manager implementations for source assets.

@garethbrickman garethbrickman added area: asset Related to Software-Defined Assets area: metadata Related to metadata labels Jul 1, 2024
@sverbruggen

@garethbrickman Would really appreciate it if you could make some time for this; it seems like a relatively major issue.

@OwenKephart
Contributor

Hi @sverbruggen @ion-elgreco , are either of you able to create a minimal reproduction of this issue? My attempts haven't been successful here, generally trying things along these lines:

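from dagster import (
    AssetIn,
    AssetKey,
    Definitions,
    IOManager,
    IOManagerDefinition,
    SourceAsset,
    asset,
)


def test_input_manager_with_source_assets() -> None: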
    fancy_metadata = {"foo": "bar", "baz": 1.23}

    upstream = SourceAsset("upstream", metadata=fancy_metadata)

    @asset(ins={"upstream": AssetIn(input_manager_key="special_io_manager")})
    def downstream(upstream) -> int:
        return upstream + 1

    class MyIOManager(IOManager):
        def load_input(self, context) -> int:
            assert context.upstream_output is not None
            assert context.upstream_output.asset_key == AssetKey(["upstream"])
            assert context.upstream_output.definition_metadata == fancy_metadata

            return 2

        def handle_output(self, context, obj) -> None: ...

    defs = Definitions(
        assets=[upstream, downstream],
        resources={"special_io_manager": IOManagerDefinition.hardcoded_io_manager(MyIOManager())},
    )
    job = defs.get_implicit_job_def_for_assets([downstream.key])
    assert job is not None
    output = job.execute_in_process()

    assert output._get_output_for_handle("downstream", "result") == 3  # noqa: SLF001
    assert False

A recent change stopped the system-generated metadata keys (e.g. dagster/asset_execution_type) from appearing, but that shouldn't affect user-provided metadata, which is what you're seeing go missing.

@ion-elgreco
Contributor Author

@OwenKephart I have an MRE in my code base, will share it when I arrive back home tonight!

@OwenKephart
Contributor

Thank you!

@ion-elgreco
Contributor Author

ion-elgreco commented Jul 5, 2024

@OwenKephart found a way to reproduce it with a slight modification of your example:

from dagster import (
    AssetIn,
    AssetKey,
    ConfigurableIOManager,  # noqa
    DataVersion,
    IOManager,  # noqa
    asset,
    materialize,
    observable_source_asset,
)


def test_input_manager_with_source_assets() -> None:
    fancy_metadata = {"foo": "bar", "baz": 1.23}

    @observable_source_asset(metadata=fancy_metadata)
    def upstream():
        return DataVersion('1')
    # upstream = SourceAsset("upstream", metadata=fancy_metadata)

    @asset(ins={"upstream": AssetIn(input_manager_key="special_io_manager")})
    def downstream(upstream) -> int:
        return upstream + 1

    class MyIOManager(IOManager):
        def load_input(self, context) -> int:
            assert context.upstream_output is not None
            assert context.upstream_output.asset_key == AssetKey(["upstream"])
            assert context.upstream_output.definition_metadata == fancy_metadata
            return 2

        def handle_output(self, context, obj) -> None: ...

    materialize(assets=[upstream, downstream], resources={"special_io_manager": MyIOManager()})

AssertionError: assert {'dagster/io_... 'io_manager'} == {'baz': 1.23, 'foo': 'bar'}
  Left contains 1 more item:
  {'dagster/io_manager_key': 'io_manager'}
  Right contains 2 more items:
  {'baz': 1.23, 'foo': 'bar'}
  Use -v to get more diff

The key is to use an observable source asset and to run it with materialize instead of a job.
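To read the assertion output above: the observed metadata holds only a system-generated key, so none of the user-provided entries survived. A minimal plain-Python sketch of that check follows. It assumes system keys can be recognised by the "dagster/" prefix (as in the two keys mentioned in this thread), and `user_metadata` is a hypothetical helper, not part of the Dagster API.

```python
from typing import Any, Mapping

# Assumption: system-generated keys share the "dagster/" prefix, as in
# "dagster/asset_execution_type" and "dagster/io_manager_key" in this thread.
SYSTEM_PREFIX = "dagster/"


def user_metadata(metadata: Mapping[str, Any]) -> dict[str, Any]:
    """Hypothetical helper: keep only entries whose key lacks the system prefix."""
    return {k: v for k, v in metadata.items() if not k.startswith(SYSTEM_PREFIX)}


# Values copied from the assertion output above.
observed = {"dagster/io_manager_key": "io_manager"}
expected = {"foo": "bar", "baz": 1.23}

# Drop system keys, then list the expected user keys that are absent.
remaining = user_metadata(observed)
missing = sorted(set(expected) - set(remaining))

print(remaining)  # user-provided entries left after filtering
print(missing)    # expected keys that were not found
```

Comparing on the filtered dictionary keeps a test like the one above insensitive to changes in system-generated keys while still catching the loss of user-provided metadata.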

@ion-elgreco ion-elgreco changed the title Upstream_output.definition_metadata doesn't contain the metadata since 1.7.11 for Source Assets [REGRESSION] Upstream_output.definition_metadata doesn't contain the metadata since 1.7.11 for observable Source Assets [REGRESSION] Jul 5, 2024
@OwenKephart
Contributor

@ion-elgreco thanks for the reproduction, we should be able to get a fix in for next week's release

@ion-elgreco
Contributor Author

@OwenKephart thanks!! The quick fix is much appreciated.


4 participants