Fix poetry secondary source bug #4323

isobelhooper · 2021-10-15T10:53:21Z

Fixes #4026 (hopefully).

PyprojectPreparer made sure to get all of the originally-defined Poetry sources (with their original names), and any sources configured based on python-index sources in Dependabot's config (with short random hex strings as names).

However, it then used these to generate poetry.lock without trying to combine them, so packages in private repos would end up with the short random hex string as their reference in poetry.lock, which would not work.

This PR merges any configured sources into the originally-defined Poetry sources in PyprojectPreparer.replace_sources, so that the final sources should have the original name they had in pyproject.toml.

Please let me know if there are any code style or other changes you'd like made to this. Ruby isn't my main language, but I'm happy to change whatever's needed.

…Finder tests

…uthed URL

…for lock files ...in the case where the [tool.poetry.source] is for a private repository.

…entials in URL This requires some more work because the authed URL won't match the original URL, so we have to match on the original URLs and merge in changes.

jurre · 2021-10-15T15:53:17Z

Thanks for digging into this! I'm going to let CI run, I'll need to dig into the changes a bit and spotted some style/ruby specific things, but overall looks good from a quick scan

jurre · 2021-10-15T16:32:49Z

python/lib/dependabot/python/file_updater/pyproject_preparer.rb

-          poetry_object["source"] = sources if sources.any?
+          sources_hash = pyproject_sources.map { |source| [source["url"], source] }.to_h
+
+          for source in config_variable_sources(credentials) do


Somewhat nitty, but using for is discouraged in ruby, each is faster and more idiomatic. Apologies for the nitpick 😬

Suggested change

for source in config_variable_sources(credentials) do

config_variable_sources(credentials).each do |source|

No worries! I'll change this as part of fixing what rubocop flags in the next commit.

jurre · 2021-10-15T16:35:16Z

python/lib/dependabot/python/file_updater/pyproject_preparer.rb

+          sources_hash = pyproject_sources.map { |source| [source["url"], source] }.to_h
+
+          for source in config_variable_sources(credentials) do
+            if sources_hash.has_key?(source["original_url"])


Suggested change

if sources_hash.has_key?(source["original_url"])

if sources_hash.key?(source["original_url"])

I believe has_key is deprecated

Will also fix this as part of the rubocop fixes. 👍

python/spec/dependabot/python/file_updater/pyproject_preparer_spec.rb

jurre · 2021-10-19T08:07:50Z

This looks great, one thing I want to verify is that this doesn't cause the username/password to be committed as plain text when running this end to end

isobelhooper · 2021-10-20T16:53:32Z

This looks great, one thing I want to verify is that this doesn't cause the username/password to be committed as plain text when running this end to end

Is this something that I should address with more tests, or is it something that you'd have to run yourself? Am happy to make more changes as needed.

jurre · 2021-11-01T10:24:44Z

Is this something that I should address with more tests, or is it something that you'd have to run yourself?

@isobelhooper apologies for the slow response on this! It would be great to add a few tests to python/lib/dependabot/python/file_updater/poetry_file_updater.rb, as we call the changed code from there. We want to make sure that the updated files do not contain hard-coded secrets after that code has been called. If you happen to have a project I could manually test this against, that'd also be much appreciated! I can set one up if not though

isobelhooper · 2021-11-12T16:20:52Z

Is this something that I should address with more tests, or is it something that you'd have to run yourself?

@isobelhooper apologies for the slow response on this! It would be great to add a few tests to python/lib/dependabot/python/file_updater/poetry_file_updater.rb, as we call the changed code from there. We want to make sure that the updated files do not contain hard-coded secrets after that code has been called. If you happen to have a project I could manually test this against, that'd also be much appreciated! I can set one up if not though

I'm afraid I don't have a project I could test this against - setting one up would be much appreciated!

I had a try at writing tests for poetry_file_updater.rb using Webmock for the PyPI responses, but ran into the problem that the updater calls out to poetry update and Webmock can't mock out the requests that it makes. A test project would be good for this.

EDIT: However, you reminded me that I could test locally against one of our private repos and then change code to use a different one, which I'll do so I can keep working on this.

This was tested with a local instance of pypiserver with an empty packages/ directory, which meant it was just proxying calls to pypi.org. The credentials used in testing were set up using the pypiserver instructions at https://pypi.org/project/pypiserver/#apache-like-authentication-htpasswd. This might be feasible for Dependabot's own testing, as you could run this as part of the setup, but there are probably better alternatives.

isobelhooper · 2021-12-13T10:21:52Z

Apologies for the slow response on this - I've pushed another commit with a test in python/lib/dependabot/python/file_updater/poetry_file_updater.rb that will, unfortunately, currently not work.

At the moment, it relies on having a Python repository running at http://localhost:8080 that is set up to allow username / password test / test, and that hosts the luigi package.

I did this using pypiserver and following the instructions in https://pypi.org/project/pypiserver/ to set up a server running on localhost:8080 that serves no local packages and instead redirects all queries to pypi.org (which would then serve the luigi package). My test setup just ran pypiserver as a separate process, but there's a Docker image that could be used to do the same.

The new test should pass once there's a Python repository like this (with auth credentials) that it can access - though I wouldn't expect it to be on localhost, and I'd hope the credentials could be passed in as secrets rather than hardcoded.

Would it be possible for the Dependabot CI to either set up a temporary Python repository like this just for testing, or use an internal private Python repository?

jurre · 2021-12-13T11:47:32Z

Would it be possible for the Dependabot CI to either set up a temporary Python repository like this just for testing, or use an internal private Python repository?

I'm not sure, both approaches seem fairly brittle and involved 🤔 is there a way that we could mock out those calls instead?

isobelhooper · 2021-12-28T21:02:33Z

Would it be possible for the Dependabot CI to either set up a temporary Python repository like this just for testing, or use an internal private Python repository?

I'm not sure, both approaches seem fairly brittle and involved 🤔 is there a way that we could mock out those calls instead?

Unfortunately those calls are made as part of the poetry update command that's called by PoetryFileUpdater.updated_lockfile_content_for - we could mock out how Poetry updates lock files, but that also seems brittle and involved.

I'm thinking that it might be possible to move the test to a little earlier in the file updating, and avoid having to make an external call at all.

Since the only thing that updated_lockfile_content_for changes about the lockfile once Poetry has generated it is the hash, my thought is that if we make sure that the project file from PoetryFileUpdater.prepared_pyproject (which is then used to generate the lockfile) doesn't contain any credentials, then any leaking of credentials into the lock file would be a Poetry bug.

I'll have a go at making a test for PoetryFileUpdater.prepared_pyproject in the case where we have auth credentials.

...rather than testing against the updated files as in the previous commit. With this approach, we don't need to set up an actual repository with credentials, since all we are doing is checking that the credentials do not leak into the pyproject file. Since the lock file is created by Poetry itself from the pyproject file, if our pyproject file is clean of credentials when we generate the lock file, the only way credentials could then leak in would be a bug in Poetry.

jurre · 2022-01-07T10:35:26Z

Thanks for all your work on this!

isobelhooper added 4 commits October 15, 2021 11:38

Minor change: add pyproject.toml with private source to LatestVersion…

1c18f3e

…Finder tests

Combine pyproject.toml and credential sources if they have the same a…

4008ce3

…uthed URL

Add a test that PyprojectPreparer conserves tool.poetry.source names …

c62e44e

…for lock files ...in the case where the [tool.poetry.source] is for a private repository.

Correctly merge pyproject_sources and variable sources with auth cred…

f565d15

…entials in URL This requires some more work because the authed URL won't match the original URL, so we have to match on the original URLs and merge in changes.

isobelhooper requested a review from a team as a code owner October 15, 2021 10:53

isobelhooper mentioned this pull request Oct 15, 2021

Dependabot is leaking private registry credentials in poetry.lock #2487

Closed

jurre reviewed Oct 15, 2021

View reviewed changes

python/spec/dependabot/python/file_updater/pyproject_preparer_spec.rb Outdated Show resolved Hide resolved

Fix all problems flagged by rubocop in changed code

9e2b4e3

isobelhooper and others added 2 commits December 28, 2021 21:40

Merge branch 'main' into fix-poetry-secondary-source-bug

0b2f93b

isobelhooper requested a review from jurre January 7, 2022 09:56

jurre approved these changes Jan 7, 2022

View reviewed changes

jurre merged commit 4a33928 into dependabot:main Jan 7, 2022

isobelhooper deleted the fix-poetry-secondary-source-bug branch January 7, 2022 12:06

brrygrdn mentioned this pull request Jan 10, 2022

v0.170.1 #4615

Merged

itsdani mentioned this pull request Mar 4, 2022

Dependabot generates faulty poetry.lock when using a private package repository #4797

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Fix poetry secondary source bug #4323

Fix poetry secondary source bug #4323

isobelhooper commented Oct 15, 2021

jurre commented Oct 15, 2021

jurre Oct 15, 2021

isobelhooper Oct 15, 2021

jurre Oct 15, 2021

isobelhooper Oct 15, 2021

jurre commented Oct 19, 2021

isobelhooper commented Oct 20, 2021

jurre commented Nov 1, 2021 •

edited

Loading

isobelhooper commented Nov 12, 2021 •

edited

Loading

isobelhooper commented Dec 13, 2021 •

edited

Loading

jurre commented Dec 13, 2021

isobelhooper commented Dec 28, 2021

jurre commented Jan 7, 2022

	for source in config_variable_sources(credentials) do
	config_variable_sources(credentials).each do \|source\|

	if sources_hash.has_key?(source["original_url"])
	if sources_hash.key?(source["original_url"])

Fix poetry secondary source bug #4323

Fix poetry secondary source bug #4323

Conversation

isobelhooper commented Oct 15, 2021

jurre commented Oct 15, 2021

jurre Oct 15, 2021

Choose a reason for hiding this comment

isobelhooper Oct 15, 2021

Choose a reason for hiding this comment

jurre Oct 15, 2021

Choose a reason for hiding this comment

isobelhooper Oct 15, 2021

Choose a reason for hiding this comment

jurre commented Oct 19, 2021

isobelhooper commented Oct 20, 2021

jurre commented Nov 1, 2021 • edited Loading

isobelhooper commented Nov 12, 2021 • edited Loading

isobelhooper commented Dec 13, 2021 • edited Loading

jurre commented Dec 13, 2021

isobelhooper commented Dec 28, 2021

jurre commented Jan 7, 2022

jurre commented Nov 1, 2021 •

edited

Loading

isobelhooper commented Nov 12, 2021 •

edited

Loading

isobelhooper commented Dec 13, 2021 •

edited

Loading