Support subprojects in a poetry project #2270
Comments
I recently converted a monorepo with several packages over to poetry, and thought it might be useful to share what we did, along with the pain points and bug workarounds, while also recognizing that this proposal would hopefully make it all obsolete :-) Still, this might be of some use to those who want to do monorepos prior to native support in poetry. First, a few caveats for context: we use a common prefix rather than namespace packages, and our filesystem layout is a little different. That's immaterial to the techniques used, but perhaps relevant to the proposal.
At the moment, all the packages under tools declare their dependency on the main package as a path-based dev dependency:
[tool.poetry.dev-dependencies]
# set up in tree as a dev dependency
c7n = {path = "../..", develop = true}
I attempted to declare it as a normal dependency, but that caused a few issues with poetry build (issue #2046, partial fix #2047, also reported/PR'd by others). Using it as a dev dependency worked, but to work around those issues we could not use poetry directly as a build/publish tool, and we still needed to inject the main package as a regular project dependency when publishing. We ended up using the poetry metadata/API to generate setup/requirements files for that purpose, converting dev dependencies to regular dependencies in the process: https://github.com/cloud-custodian/cloud-custodian/blob/master/tools/dev/poetrypkg.py#L121
Unrelated to multi-project support, but related to the generation workaround, we ran into another issue: the masonry sdist builder didn't really support markdown readmes (PR #1994).
To simplify the ergonomics of the multiple commands needed to update versions or release, we added makefile targets as a frontend:
pkg-update:
	poetry update
	for pkg in $(PKG_SET); do cd $$pkg && poetry update && cd ../..; done
One interesting consequence of source directory dependencies in poetry is that they break any attempt to distribute/publish a package, even if they are only dev deps. Per the build-system PEP, the pyproject.toml spec means that poetry will be invoked during install; pip transparently handles the installation and invocation of poetry as the build system. Simply resolving/parsing the pyproject.toml dev dependencies will cause a poetry failure when installing a source distribution, because installing an sdist is actually a wheel build. Because of this publishing limitation, we only publish wheels instead of sdists, which avoids the build system entirely, as a wheel is an extractable installation container format. We're also maintaining compatibility with the tox/setuptools ecosystem for developer workflows; there are a few more details on what we did here |
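The conversion step described above (turning in-tree path dev dependencies into regular dependencies at publish time) could look roughly like the sketch below. It is not the code from poetrypkg.py: it works on an already-parsed pyproject.toml dictionary, and the function name, the `versions` mapping, the sub-package name and the caret pin are all assumptions made for illustration.

```python
from copy import deepcopy


def promote_path_dev_dependencies(pyproject: dict, versions: dict) -> dict:
    """Return a copy of a parsed pyproject.toml in which path-based
    dev-dependencies become regular, version-constrained dependencies.

    `versions` maps package name -> release version (hypothetical input;
    it could be read from each sibling package's own pyproject.toml).
    """
    result = deepcopy(pyproject)
    poetry = result["tool"]["poetry"]
    deps = poetry.setdefault("dependencies", {})
    dev_deps = poetry.get("dev-dependencies", {})

    for name, spec in list(dev_deps.items()):
        # Only in-tree path dependencies are promoted; ordinary dev tools stay put.
        if isinstance(spec, dict) and "path" in spec:
            deps[name] = f"^{versions[name]}"
            del dev_deps[name]
    return result


example = {
    "tool": {
        "poetry": {
            "name": "c7n-tool",  # hypothetical sub-package name
            "dependencies": {"python": "^3.8"},
            "dev-dependencies": {
                "c7n": {"path": "../..", "develop": True},
                "pytest": "^7.0",
            },
        }
    }
}

published = promote_path_dev_dependencies(example, {"c7n": "0.9.0"})
print(published["tool"]["poetry"]["dependencies"])
```

The input dictionary is left untouched, so the development configuration on disk stays as it is and only the generated publishing metadata carries the regular dependency.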
@kapilt thank you for writing that up. It is extremely useful and insightful. |
This proposal is valuable. As it is, poetry supports optional dependencies, but not optional packages. The use of optional packages for a namespace project is really useful. 👍 for including the optional-package as part of this proposal. |
Shared dependencies are very useful, but it might make sense to inherit some of the logic from Maven regarding the shared block:
While it does complicate things, the benefits are:
|
In CI, since the packages are now installed in a virtualenv rather than globally, we have to activate the virtualenv at the start of jobs. The cattrs package has to be kept back to before version 1.1.0 until we either upgrade to Python 3.7 or AWS CDK moves to a dependency which doesn't transitively depend on Python 3.7. Issue linked on the relevant pyproject.toml line. This merges the production dependencies of the backend and infra subdirectories, which is not ideal. Once subproject support <python-poetry/poetry#2270> arrives we should pull this apart, maybe keeping only the development and test dependencies in the root.
In CI, since the packages are now installed in a virtualenv rather than globally, we have to activate the virtualenv at the start of jobs. The cattrs package has to be kept back to before version 1.1.0 until we either upgrade to Python 3.7 or AWS CDK moves to a dependency which doesn't transitively depend on Python 3.7. Issue linked on the relevant pyproject.toml line. This merges the production dependencies of the backend and infra subdirectories, which is not ideal. Once subproject support <python-poetry/poetry#2270> arrives we should pull this apart, maybe keeping only the development and test dependencies in the root. To mitigate this in the meantime, bundle.bash pulls out only those dependencies which the Lambda function needs.
Anything new on this? |
Not sure. Just coming across all of this for the first time. So I am looking forward to it! |
I've been testing out some monorepo approaches and started with @adriangb's approach here. DX is the main issue -- challenges with this approach surround
It would be nice to:
Ideally, poetry or the plugin could find the root lockfile/pyproject.toml, or there could be some way that the developer specifies it. This would lead to a similar experience to cargo, yarn, and npm. |
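As a rough illustration of the root lookup suggested above, a plugin or wrapper script could walk up from the current subproject until it finds a directory that holds both a pyproject.toml and a poetry.lock. This is only a sketch under the assumption that subprojects keep their own pyproject.toml but that only the root has a lockfile; `find_workspace_root` and the directory names are made up for the example and are not part of poetry.

```python
import tempfile
from pathlib import Path
from typing import Optional


def find_workspace_root(start: Path) -> Optional[Path]:
    """Return the nearest directory at or above `start` that contains both
    pyproject.toml and poetry.lock, or None if no such directory exists."""
    current = start.resolve()
    for directory in (current, *current.parents):
        has_project = (directory / "pyproject.toml").is_file()
        has_lock = (directory / "poetry.lock").is_file()
        if has_project and has_lock:
            return directory
    return None


# Demo on a throwaway tree in which only the root has a lockfile.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp).resolve()
    sub = root / "libs" / "pkg_a"  # hypothetical subproject path
    sub.mkdir(parents=True)
    (root / "pyproject.toml").write_text("")
    (root / "poetry.lock").write_text("")
    (sub / "pyproject.toml").write_text("")
    found = find_workspace_root(sub)
    found_is_root = found == root
    print(found_is_root)
```

An explicit setting that names the root would be the alternative mentioned in the comment; the upward search simply avoids extra configuration.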
I think a plugin would solve all of those issues and should be doable. I haven’t written one just because the DX isn’t bad enough for me to justify spending time on it. And I usually don’t end up running poetry … from a subproject, most things happen from the top level Makefile. |
@adriangb what is your solution for replacing path dependencies with regular ones when publishing? |
There's a thing called Polylith that takes a different approach to the problems of monorepos and sharing code than the solutions suggested in this thread, and I think it could be interesting to share it here.
Third-party dependencies are a thing of their own, but the code that we have control over in the projects is different. It can be shared across projects in quite a simple way by using a monorepo with a developer experience similar to a single-project repo. In Polylith, no symlinks or other quirks are needed (unless you view the plugins as quirks). Having the code organized as namespace packages - just as with single-project repos - and the individual projects including what is needed by using the
To make this work in a Poetry context, there is the MultiProject plugin, as mentioned above by @tnielens. That plugin makes it possible to use relative includes (in the
Having something to visualize the code in a monorepo is probably helpful, and that is where the tooling support for Polylith comes in. There are several commands to visualize, calculate diffs, synchronize projects and create Python code according to the Polylith Architecture. The tool is, of course, Open Source 😄 I hope this helps! |
I don’t have a solution because it’s not a use case I’ve had. I imagine a plugin could do something similar to the scripts I’ve seen. |
I second David's plug for the amazing polylith plugin. We have successfully incorporated polylith to structure our repo and couldn't do without it. We are a ML shop and have many teams working on different problems but sharing a common code base. The builds are streamlined to only contain what each project needs so deployments are very thin and rarely cause issues. Highly recommended! |
I'm not quite sure why everything is so complicated here. This problem was solved long ago at the framework level in languages such as C#, where NuGet package management can consume local packages and also download from a package source in production (a wheel and source combination). |
As you are writing, Poetry is Open Source. Maybe you are the one that should solve this long-running issue @moattarwork?😄 |
I would absolutely love for @moattarwork to step up and put some much-needed sweat toward solving this problem for me |
Previously, I worked around it using some bash scripts to get named dependencies in the built artifacts, which I show in this demo repo and this blog post. Therefore, I've developed a Poetry Plugin for Mono Repo dependencies at https://github.com/gerbenoostra/poetry-plugin-mono-repo-deps/ |
@gerbenoostra this is awesome, I will try this for my work! |
@gerbenoostra I like the idea of using a plugin, thanks! However, I couldn't make it work, and there are no errors :/ Do you have any examples/documentation?
|
@soufea thanks for trying it out! I'd love to help, but perhaps it's best to continue our discussion on the plugin's issue tracker? If you have an example repo online, I can check. As a preliminary response, an example project is in the test fixtures folder.
Ideally I'd also create a plugin which implements the composition aspect of mono repos, that when you run |
Sorry guys. I was away for some time, but I'm happy to work on this. I'm not familiar with the code base, but comparing this with the closest ecosystem (Node.js), I think the complexity lies in the toml file being a solid project file. The project file contains information such as name, version, author, etc. that uniquely identifies the project itself; the dependencies and dev-dependencies, however, could be inherited. Solving this problem would be the first step toward implementing a good monorepo setup, and the rest can be managed with standard templating. |
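One way to picture the inheritance idea above: identity fields (name, version, and so on) stay with the subproject, while the dependency tables are merged with those of a parent project. The sketch below works on plain dictionaries; `effective_project`, the override rule (the child's entry wins) and the example package names are assumptions for illustration, not an existing poetry feature.

```python
def effective_project(root: dict, child: dict) -> dict:
    """Combine a child's identity fields with inherited dependency tables.

    Entries declared by the child override the root's entries of the same name.
    """
    effective = dict(child)
    for table in ("dependencies", "dev-dependencies"):
        merged = dict(root.get(table, {}))
        merged.update(child.get(table, {}))
        effective[table] = merged
    return effective


root_project = {
    "dependencies": {"python": "^3.9", "requests": "^2.28"},
    "dev-dependencies": {"pytest": "^7.0"},
}
child_project = {
    "name": "service-a",  # hypothetical subproject
    "version": "0.1.0",
    "dependencies": {"requests": "^2.31"},
}

resolved = effective_project(root_project, child_project)
print(resolved)
```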
Thanks for sharing this. I will take a look |
I took the liberty to create a plugin that supports these commands and it allows users to have a single lockfile + shared venv. Hope this helps! You can find it here: https://github.com/ag14774/poetry-monoranger-plugin |
Background & Rationale
This request is inspired by RPM Package Manager’s capability to build subpackages from the same Spec File.
Here, I want to propose and discuss how a version of this capability can be replicated within poetry to simplify the user experience for a Python project maintainer, especially when maintaining namespace packages and/or multi-project source trees. While strict project separation is a good thing in most cases, it might not always be the most pragmatic option for package maintainers.
For our purposes here, we can refer to each of these packages as a subproject, and all subprojects are managed under a single poetry project. This means that there is only a single pyproject.toml file and a shared project root directory with either a shared source tree or independent source trees (subdirectories) for each subproject.
Description
Let us consider the scenario of multiple namespace packages being maintained in a single repository with the following structure.
Note that this will still apply even if different source directories exist within the root directory for each subproject.
Here the intention could be that we want to distribute 3 packages, namely namespace-package-one, namespace-package-two and namespace-package-three.
For the purpose of this example, let us assume that namespace-package-three depends on namespace-package-one. The pyproject.toml file could look something like this.
New sections are annotated with comments detailing them and expected behaviour.
Under this scenario, the following might be what the CLI commands look like. Current behaviour will remain unaltered as these are additive changes.
Variations
The above is an initial thought on how it might work. That said, there are variations to this that should be discussed.
Does a per-package dev-dependency section make sense?
This only really makes sense if we want to allow for developing a single package at a time. However, this becomes tricky in cases like the one here, where "three" depends on "one": when developing "three", the dev dependencies for "one" should also be installed. If isolation is required, then multiple virtual environments will be needed, which might be overkill for the majority of use cases for this feature.
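To make the "three" depends on "one" point concrete, here is a small sketch of how the dev dependencies for a single subproject could be collected by following its in-repo dependencies. The data layout (two plain dictionaries), the function name and the tool names in the example are assumptions for illustration; the proposal itself does not specify this.

```python
def dev_dependencies_for(package: str, depends_on: dict, dev_deps: dict) -> list:
    """Collect dev dependencies of `package` and of every subproject it
    depends on, directly or transitively, as a sorted list."""
    seen = set()
    collected = set()
    stack = [package]
    while stack:
        current = stack.pop()
        if current in seen:
            continue  # also protects against dependency cycles
        seen.add(current)
        collected.update(dev_deps.get(current, ()))
        stack.extend(depends_on.get(current, ()))
    return sorted(collected)


# "three" depends on "one"; "two" is independent (tool names are placeholders).
depends_on = {"three": ["one"]}
dev_deps = {"one": ["pytest"], "two": ["mypy"], "three": ["hypothesis"]}

needed = dev_dependencies_for("three", depends_on, dev_deps)
print(needed)
```

Developing "three" pulls in the dev dependencies of "one" but not those of "two", which is the behaviour the paragraph above argues for.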
Will all packages be installed under PEP 517?
Is it even possible to install only a specific package when installing under PEP 517? One possible solution might be to make use of "extras" here as a way of specifying which package, if any, to install, but default to all.
Extensions
As an extension to this, one might also want to optionally distribute a namespace-only package, namespace-package (let's call this the "project package" for now), that installs the core dependencies and also allows for "extras" as we do today, without requiring the distribution of the entire source tree with the binary distribution.
This means that if someone does pip install namespace-package, the maintainer might expect the following to be installed: namespace.package, plus namespace-package-one and namespace-package-three, which are required for the "default" install.
An end-user can also install the remaining package like so: pip install namespace-package[two], which will simply install the dependency namespace-package-two.
This behaviour might not be desired in all cases, and can be considered opt-in.
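The mapping from the "project package" and its extras to the subprojects that end up installed can be sketched as follows. This only models the selection logic described above; the function name and the data layout are assumptions, and nothing here calls pip or poetry.

```python
def packages_to_install(default: list, extras: dict, requested=()) -> list:
    """Return the subprojects pulled in by the project package: the default
    set, plus whatever each requested extra adds (order preserved, no repeats)."""
    selected = list(default)
    for extra in requested:
        for package in extras[extra]:
            if package not in selected:
                selected.append(package)
    return selected


default_install = ["namespace-package-one", "namespace-package-three"]
extras = {"two": ["namespace-package-two"]}

# Models `pip install namespace-package`
plain = packages_to_install(default_install, extras)
# Models `pip install namespace-package[two]`
with_two = packages_to_install(default_install, extras, requested=["two"])
print(plain, with_two)
```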