
[HUDI-9377] feat(datahub-sync): adds DataPlatformInstance aspect #13133

Merged
xushiyan merged 1 commit into apache:master from acryldata:feat-sync-datahub-data-platform-instance on May 5, 2025

sgomezvillamor (Contributor) commented Apr 11, 2025

Change Logs

This update introduces the ability to set DataPlatformInstance via configuration.

Before this PR:

  • The sync did not emit DataPlatformInstance for either Dataset or Container entities.
  • Instead, the DataHub backend injected it automatically, only for Dataset, and with only the platform field set.

With this PR:

  • If no platform instance is configured:
    • DataPlatformInstance is consistently emitted for both Dataset and Container entities, including only the platform field.
  • If a platform instance is configured:
    • DataPlatformInstance is consistently emitted for both Dataset and Container entities, including both the platform and instance fields.
    • The platform instance is included in the browse paths.
    • The platform instance is also included in the identity (URN) for both Dataset and Container.
    • The platform instance is also included in the identity (URN) for both Dataset and Container.

Consistent and correct management of the DataPlatformInstance aspect matters for platform instance filtering in the DataHub UI and for the browsing experience.
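
To make the browsing point concrete, here is a minimal sketch of how a browse path might change once an instance is configured. It is an assumption-laden illustration rather than the PR's implementation: the function name, the `id`/`urn` entry shape, and the choice to place the instance as the first path entry are all hypothetical, and the database entry is simplified to a plain name.

```python
from typing import List, Optional


def browse_path(database: str, platform: str = "hudi",
                instance: Optional[str] = None) -> List[dict]:
    """Hypothetical helper: ordered browse path entries for a table's dataset."""
    path: List[dict] = []
    if instance:
        # Assumption: the platform instance is the top-level browse entry.
        instance_urn = (
            f"urn:li:dataPlatformInstance:(urn:li:dataPlatform:{platform},{instance})"
        )
        path.append({"id": instance_urn, "urn": instance_urn})
    # Simplification: the database level is represented by its name only.
    path.append({"id": database})
    return path


if __name__ == "__main__":
    print(browse_path("sales"))
    print(browse_path("sales", instance="warehouse_a"))
```

With a layout like this, tables synced from different instances of the same platform land under separate top-level entries instead of being mixed under one database name.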

Impact

Low

Risk level (write none, low, medium or high below)

None

Documentation Update

  • The new optional config feature needs to be documented

Contributor's checklist

  • Read through contributor's guide
  • Change Logs and Impact were stated clearly
  • Adequate tests were added if applicable
  • CI passed

github-actions bot added the size:M label (PR with lines of changes in (100, 300]) Apr 11, 2025
sgomezvillamor marked this pull request as ready for review April 11, 2025 07:30
hudi-bot (Collaborator) commented

CI report:

Bot commands
@hudi-bot supports the following commands:
  • @hudi-bot run azure: re-run the last Azure build

xushiyan changed the title from "feat(datahub-sync): adds DataPlatformInstance aspect" to "[HUDI-9377] feat(datahub-sync): adds DataPlatformInstance aspect" May 5, 2025
xushiyan merged commit d502a4b into apache:master May 5, 2025
60 of 63 checks passed
alexr17 pushed a commit to alexr17/hudi that referenced this pull request Aug 25, 2025
…che#13133)
