Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[BUG]: Is UCX catalog now a pre-requisite for the table migration workflow to run? #3438

Open
1 task done
JCZuurmond opened this issue Dec 12, 2024 · 3 comments
Open
1 task done

Comments

@JCZuurmond
Copy link
Member

Is there an existing issue for this?

  • I have searched the existing issues

Current Behavior

In #2929, the UCX catalog became a prequisite for the migrate-tables command. In #3239, it becomes a prerequisite for the table migration workflows, and the migration progress is tracked during the workflow too.

Expected Behavior

  • Consistent prerequiste: both for the command and workflow. Or neither.
  • If prerequisite, document it as such.

Steps To Reproduce

N.A.

Cloud

Azure

Operating System

macOS

Version

latest via Databricks CLI

Relevant log output

No response

@asnare
Copy link
Contributor

asnare commented Dec 12, 2024

The ucx catalog is where progress (across multiple installations/workspaces) is tracked for displaying in the progress-related dashboards. From a technical perspective it's intentional that the migration-related pieces therefore depend on it so that they can log their progress for it to be reported. From a technical point-of-view it would therefore be consistent for both the command and workflow to require it.

Stepping back a bit, the question in my mind is where should the ucx catalog be created. As far as I can tell:

  • It only needs to be created once, even though multiple UCX installations might be using (sharing) it.
  • The way to create it is with the create-ucx-catalog command, but this isn't documented nor is when it should it happen.

With this in mind, I think the following is needed:

  • The create-ucx-catalog command should be documented.
  • Arguably during attach-metastore we should also attempt to create it if it's not already present. (This needs to be documented.) If we ever need to upgrade this is also an obvious point to check for this. In the alternative, if we decide not to create the ucx catalog automatically here then if the catalog doesn't exist we should at least log a reminder that it needs to be created via create-ucx-catalog.
  • If the user attaches the metastore themselves, it needs to be clear from the docs that the create-ucx-catalog is still needed.

@JCZuurmond
Copy link
Member Author

JCZuurmond commented Dec 12, 2024

Thanks for the good write up.

IMO we should make the UCX catalog required before the migrate tables workflow.

The way to create it is with the create-ucx-catalog command, but this isn't documented nor is when it should it happen.
The create-ucx-catalog command should be documented.

Already is document: https://github.com/databrickslabs/ucx?tab=readme-ov-file#create-ucx-catalog-command.
But not the when, I think it should be part of https://github.com/databrickslabs/ucx?tab=readme-ov-file#new-unity-catalog-resources

Arguably during attach-metastore

I like that, it fits also with when the UCX catalog should be created, i.e. right after attach metastore. At least a reminder should be logged.

If the user attaches the metastore themselves, it needs to be clear from the docs that the create-ucx-catalog is still needed.

Indeed, I think it should be part of https://github.com/databrickslabs/ucx?tab=readme-ov-file#new-unity-catalog-resources

@asnare
Copy link
Contributor

asnare commented Dec 12, 2024

Already is document: https://github.com/databrickslabs/ucx?tab=readme-ov-file#create-ucx-catalog-command. But not the when, I think it should be part of https://github.com/databrickslabs/ucx?tab=readme-ov-file#new-unity-catalog-resources

Ugh… my bad: I searched on create_ucx_catalog instead of create-ucx-catalog. Thanks for the correction.

gueniai added a commit that referenced this issue Dec 12, 2024
…ites in the `migrate-tables` cli command (#3439)

## Changes
Remove `try-except` around verifying the migration progress prerequistes
in the `migrate-tables` cli command as the warning is more specific
inside the class

### Linked issues

Progresses #3438

### Functionality

- [x] modified existing command: `databricks labs ucx migrate-tables`

---------

Co-authored-by: Guenia Izquierdo Delgado <[email protected]>
@gueniai gueniai removed the bug label Jan 23, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Status: Todo
Development

No branches or pull requests

3 participants