[FEATURE]: Allow table mapping to be retrieved from DB SQL #1414

Open
1 task done
Tracked by #1528
rwforest opened this issue Apr 16, 2024 · 5 comments
Labels
feat/migration-index mapping of databases to catalog or potentially other databases

Comments

@rwforest

Is there an existing issue for this?

  • I have searched the existing issues

Problem statement

Currently the table mapping is stored in CSV format. That works well for a small number of tables, but it will not scale to thousands of tables in a single workspace. CSV files are also difficult to version.

Proposed Solution

Allow users to store the mapping in queryable storage, such as DB SQL. This would make it easier for teams to collaborate on the mapping.
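
As a rough illustration of what "queryable storage" could look like, here is a minimal sketch that loads a mapping CSV into SQLite using only the Python standard library. SQLite stands in for DB SQL here, and the column names, the `table_mapping` table name, and the sample rows are all assumptions made for the example; the issue itself does not specify the CSV layout.

```python
import csv
import io
import sqlite3

# Assumed mapping columns, for illustration only.
COLUMNS = [
    "workspace_name", "catalog_name",
    "src_schema", "dst_schema",
    "src_table", "dst_table",
]

# Made-up sample data standing in for a real mapping file.
SAMPLE_CSV = """workspace_name,catalog_name,src_schema,dst_schema,src_table,dst_table
ws1,main,sales,sales,orders,orders
ws1,main,sales,sales,customers,customers
ws1,finance,ledger,ledger,entries,entries
"""


def load_mapping(conn, csv_text):
    """Create the (hypothetical) table_mapping table and load every CSV row."""
    column_defs = ", ".join(f"{name} TEXT" for name in COLUMNS)
    conn.execute(f"CREATE TABLE IF NOT EXISTS table_mapping ({column_defs})")
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = [tuple(row[name] for name in COLUMNS) for row in reader]
    placeholders = ", ".join("?" for _ in COLUMNS)
    conn.executemany(f"INSERT INTO table_mapping VALUES ({placeholders})", rows)
    conn.commit()
    return len(rows)


def tables_per_catalog(conn):
    """Count mapped tables per target catalog, sorted by catalog name."""
    cursor = conn.execute(
        "SELECT catalog_name, COUNT(*) FROM table_mapping "
        "GROUP BY catalog_name ORDER BY catalog_name"
    )
    return cursor.fetchall()


conn = sqlite3.connect(":memory:")
loaded = load_mapping(conn, SAMPLE_CSV)
summary = tables_per_catalog(conn)
print(loaded, summary)
```

Once the rows are in a table, questions such as "how many tables land in each catalog" become a single query instead of a spreadsheet filter.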

Additional Context

No response

@rwforest rwforest added enhancement New feature or request needs-triage labels Apr 16, 2024
@github-project-automation github-project-automation bot moved this to Triage in UCX Apr 16, 2024
Collaborator

nfx commented Apr 16, 2024

The whole idea of the CSV file in the workspace is that people can edit it: in the browser, Excel, Google Sheets, etc. Databricks doesn’t have a UI for editing data row by row.

@rwforest
Author

@nfx DB SQL is just an example. SQLite is another, though it would introduce additional dependencies. If you have ever tracked issues in Excel, you know it's not ideal. People can still edit in Excel or anything else, but they should be able to push their changes back and merge them, keeping a single version of the truth. Otherwise people will end up sending CSV, XLS, and Google Sheets files to each other.

@nfx
Collaborator

nfx commented Apr 17, 2024

We might want to add a small WebUI through a localhost webserver, but it's not yet a priority

@rwforest
Author

@nfx makes sense, but I'd suggest giving some guidelines on how to manage versioning and a single source of truth. You can't expect every team in every organization to have a clean mapping of their before and after. With the additional catalog layer, a lot of people are taking advantage of this process to redesign their data catalog. Information is all over the place: a SharePoint site, someone's inbox, etc. Downstream consumers will suffer badly if we don't have a single version of the truth.
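
To make the versioning concern concrete, here is a small sketch of how one might check a circulated copy of the mapping against the canonical CSV before merging it back. It is only an illustration: the column names, the sample rows, and the helper names `read_rules` and `find_conflicts` are hypothetical and not part of any existing tool.

```python
import csv
import io

# Made-up canonical mapping (column names are assumed for illustration).
CANONICAL_CSV = """workspace_name,catalog_name,src_schema,dst_schema,src_table,dst_table
ws1,main,sales,sales,orders,orders
ws1,main,sales,sales,customers,customers
"""

# Made-up copy that was edited elsewhere and sent around.
EMAILED_CSV = """workspace_name,catalog_name,src_schema,dst_schema,src_table,dst_table
ws1,main,sales,sales,orders,orders
ws1,main,sales,crm,customers,clients
"""


def read_rules(csv_text):
    """Map each source table to its destination (catalog, schema, table)."""
    rules = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        key = (row["src_schema"], row["src_table"])
        rules[key] = (row["catalog_name"], row["dst_schema"], row["dst_table"])
    return rules


def find_conflicts(canonical, candidate):
    """List source tables whose destination differs between the two copies."""
    conflicts = []
    for key, target in sorted(candidate.items()):
        if key in canonical and canonical[key] != target:
            conflicts.append((key, canonical[key], target))
    return conflicts


conflicts = find_conflicts(read_rules(CANONICAL_CSV), read_rules(EMAILED_CSV))
for source, expected, proposed in conflicts:
    print(f"{source}: canonical {expected} vs. proposed {proposed}")
```

A check like this could run before a copy is merged, so disagreements surface early rather than downstream.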

@nfx
Collaborator

nfx commented Apr 18, 2024

The CSV file per workspace is the source of truth

@nfx nfx added feat/migration-index mapping of databases to catalog or potentially other databases and removed enhancement New feature or request needs-triage labels Apr 22, 2024
@nfx nfx moved this from Triage to Quarter Backlog in UCX Apr 24, 2024