[CT-1380] [Feature] Namespaced packages to enable multiple package import #6113
Comments
@dstuck This is really interesting! Thanks for talking through the use case. To date, I'm familiar with a few patterns that agencies use to provide the same basic transformation flow to a number of different clients:
Your proposal offers an alternative: What if you could have a common sub-project and a common super-project, while still preserving the one-to-many relationship between them?
I think you're onto something! First things first: we'd need to fix #1269, and allow for multiple models in the same dbt DAG, with the same name, so long as they exist in different project namespaces. That need is top-of-mind for us; I'd very much like to see us do it over the next several months, as part of the larger initiative discussed in #5244. Once that's sorted, you could imagine something like:

    # packages.yml
    packages:
      - local: path/to/reused/project
        project_name: client_a
      - local: path/to/reused/project
        project_name: client_b

    # dbt_project.yml
    models:
      client_a: # scoped to this client's data
        custom_config: ...
        vars: ...
      client_b: # scoped to a different client's data
        custom_config: ...
        vars: ...

From a technical perspective, I think this would take a little bit of doing, starting with where we load projects, and tracing that through to the parsing of each project, to ensure we're forming a truly unique
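To illustrate the namespacing idea above, here is a minimal Python sketch. It is not dbt's implementation: `PackageEntry` and `build_manifest` are hypothetical names, `project_name` is the key proposed in this comment rather than an existing `packages.yml` option, and the `model.<project>.<name>` id format is assumed here to mirror dbt's node id convention. It only shows that the same model names stay unique once they are keyed by project namespace.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PackageEntry:
    """One entry from packages.yml (hypothetical shape for this sketch)."""
    local: str         # path to the reused project
    project_name: str  # proposed alias; not an existing dbt option


def build_manifest(entries, model_names):
    """Map namespaced ids to (path, model) pairs, failing on collisions."""
    nodes = {}
    for entry in entries:
        for model in model_names:
            # Assumed id format: model.<project namespace>.<model name>
            unique_id = f"model.{entry.project_name}.{model}"
            if unique_id in nodes:
                raise ValueError(f"duplicate node: {unique_id}")
            nodes[unique_id] = (entry.local, model)
    return nodes


entries = [
    PackageEntry("path/to/reused/project", "client_a"),
    PackageEntry("path/to/reused/project", "client_b"),
]
manifest = build_manifest(entries, ["orders", "customers"])
print(sorted(manifest))
```

Importing the same path twice under one alias would raise, which is the collision that namespacing is meant to prevent.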
@jtcohen6 I really appreciate the detailed description of how this relates to ongoing work, and it definitely feels like a request that needs namespacing to be worked out first. I hadn't actually considered that the file names would lead to collisions (to be honest, having model names use the root filename has always been one of the least intuitive things about dbt to me), so that's a much bigger issue than just referencing the package. The other issue I realized after posting this, while getting into the weeds a bit with dispatching from packages, is that I think package developers need to include hard-coded references to their project name when referencing macros in the package, which could also break my initial thought of "just change the name of the project to an alias". Very excited about the multi-project support initiative, as that would be another approach to work around this issue: using different projects for the multi-tenant marts that could still maintain dependencies on the common data.
Dispatching! That's a really good point. This pattern might make more sense for "model packages" versus "macro packages," which I'm increasingly convinced are different patterns for code reuse.
This issue has been marked as Stale because it has been open for 180 days with no activity. If you would like the issue to remain open, please comment on the issue or else it will be closed in 7 days.
Although we are closing this issue as stale, it's not gone forever. Issues can be reopened if there is renewed community interest. Just add a comment to notify the maintainers.
Is this your first time submitting a feature request?
Describe the feature
We would like to import a package multiple times in a single project. An example use case is a common data product that needs to be exported/shared with multiple clients with slightly different config, or a project with a common set of sources that branch out into multi-tenant marts with overlapping logic. We would like to be able to define that common product in a package and import it once for each client, with client-prefixed schemas.
I believe this could be achieved by allowing a project alias to be configured for the package, which would be used as the project_name after loading the package. The main unknown for me is whether the Project class gets loaded from the dbt_project.yml in the package files outside of the deps call. If so, the alias would need to be written into that file to replace the project name, which is a bit more invasive.
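As a rough sketch of the less invasive option (applying the alias in memory at load time instead of rewriting dbt_project.yml), consider the Python below. This is an assumption-laden illustration, not how dbt's Project class works: `resolve_project_name`, `load_projects`, and `fake_read_config` are hypothetical helpers, the `project_name` key is the proposed alias, and `common_product` is a made-up project name.

```python
from typing import Callable, Dict, List, Optional


def resolve_project_name(config: dict, alias: Optional[str] = None) -> str:
    """Prefer the alias from packages.yml over the name in dbt_project.yml."""
    return alias if alias else config["name"]


def load_projects(
    package_specs: List[dict],
    read_config: Callable[[str], dict],
) -> Dict[str, dict]:
    """Register each package under its resolved name, rejecting duplicates."""
    projects: Dict[str, dict] = {}
    for spec in package_specs:
        config = read_config(spec["local"])
        name = resolve_project_name(config, spec.get("project_name"))
        if name in projects:
            raise ValueError(f"project name already in use: {name}")
        # Override the name on a copy; the file on disk is never rewritten.
        projects[name] = {**config, "name": name, "path": spec["local"]}
    return projects


def fake_read_config(path: str) -> dict:
    """Stand-in for parsing <path>/dbt_project.yml (hypothetical content)."""
    return {"name": "common_product", "version": "1.0.0"}


specs = [
    {"local": "path/to/reused/project", "project_name": "client_a"},
    {"local": "path/to/reused/project", "project_name": "client_b"},
]
projects = load_projects(specs, fake_read_config)
print(sorted(projects))
```

Without the aliases, both entries would resolve to the same name and the second would be rejected, which matches the limitation this request describes.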
Describe alternatives you've considered
Some alternatives that allow us to be drier than just "copy paste everything"
Who will this benefit?
This enhancement will benefit teams that support multitenant architectures or create standardized data products from a common source.
Are you interested in contributing this feature?
I would be interested in contributing, though I haven't before, so I could use some guidance.
Anything else?
No response