Skip to content

Introduce Dependabot::DependencyChange as a wrapper for updater output plus metadata#6792

Merged
brrygrdn merged 13 commits intomainfrom
brrygrdn/encapsulate-updater-output
Mar 7, 2023
Merged

Introduce Dependabot::DependencyChange as a wrapper for updater output plus metadata#6792
brrygrdn merged 13 commits intomainfrom
brrygrdn/encapsulate-updater-output

Conversation

@brrygrdn
Copy link
Copy Markdown
Contributor

@brrygrdn brrygrdn commented Mar 6, 2023

This PR introduces a new Dependabot::DependencyChange class which wraps a set of Dependabot::Dependency and Dependabot::DependencyFile objects which constitute the outcome of a Dependabot::Updater operation.

Rationale

As part of the prototype effort, Rubocop flagged the growing number of parameters being passed down into the API Client - effectively the 'output' class for the Updater - when we started adding in new placeholders for the concept of a "Group Rule".

This problem indicates we have a missing abstraction between the operation to generate a change and the components when then transform it for output via Dependabot API.

Our next major planned refactor is to turn the Dependabot::Updater operations into concrete classes so each combination of major job branches is represented by a single, well-defined class for a fresh Version Update, a rebase of an existing Version PR, a new Security Update, etc, etc.

The Dependabot::DependencyChange class is a bottom-up refactor that adds a new abstraction which reduces the amount of parameters we need to pass while also providing a starting point for the output of these new operation classes to force a wedge between the part of the Updater which constructs change sets and the part which processes them into PRs so we can start to expose the injection site for grouping behaviours.

Changes

Originally I intended the Dependabot::DependencyChange to be built up over the course of the Dependabot::Updater#check_and_create_pull_request and Dependabot::Updater#check_and_update_pull_request methods, allowing more of the logic which combs and filters the "updated_deps" list to be moved into the class itself.

I still think this might be worth doing, but I walked it back as it resulted in a substantial number of test stub rewrites which made me lose confidence in the change.

For now, the only major responsibility the Dependabot::DependencyChange class extracts from the Updater is responsibility for constructing the Pull Request message object, which it does by being aware of the Dependabot::Job.

I think there are some logical threads that follow on from this to incorporate things like the @created_pull_requests and @errors sets unto the job so the change set is capable of answering questions about wither it duplicates or supersedes existing work - but I've left that for now to de-risk introducing this interface and avoid delaying splitting the Dependabot::Updater out further.

@brrygrdn brrygrdn force-pushed the brrygrdn/encapsulate-updater-output branch from 37d9a89 to 1662457 Compare March 6, 2023 20:32
@brrygrdn brrygrdn marked this pull request as ready for review March 6, 2023 20:56
@brrygrdn brrygrdn requested a review from a team as a code owner March 6, 2023 20:56
@brrygrdn brrygrdn force-pushed the brrygrdn/encapsulate-updater-output branch from ccf5478 to dd5cfa9 Compare March 6, 2023 20:56
end

def create_pull_request_data(dependencies, updated_dependency_files, base_commit_sha, pr_message)
def create_pull_request_data(dependency_change, base_commit_sha)
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This method has gotten big enough I think it's probably worth extracting it into a serialiser class which presents a dependency_change for the API in a neater way.

I've punted on this for now as I want to land the concept of DependencyChange and I'm conscious it might change during code review.


def create_pull_request(dependencies, updated_dependency_files,
base_commit_sha, pr_message)
# TODO: Make `base_commit_sha` part of Dependabot::DependencyChange
Copy link
Copy Markdown
Contributor Author

@brrygrdn brrygrdn Mar 6, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've punted on this for now because how we obtain the base_commit_sha is a little unusual. I think for the purposes of Dependabot::Updater it should always be present on the job, but I need to verify this because the job is shared between the Updater and Fetcher, but has slightly different parameters in each.

return data unless dependency_change.pr_message

data["commit-message"] = dependency_change.pr_message.commit_message
data["pr-title"] = dependency_change.pr_message.pr_name
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

newbie q: can these (e.g. "pr-title") be constants?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, they could me, I think we typically don't assign constants to string keys for hashes as they are normally only used in one place.

Copy link
Copy Markdown
Contributor

@landongrindheim landongrindheim left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I ❤️ this change!

end

def updated_dependency_files_hash
updated_dependency_files.map(&:to_h)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assume this is simply porting over current behavior, in which case we can return to this conversation later 🔖

  1. Are we certain these files are in the required shape to have #to_h called on them?
  2. Are there concerns around file name collisions?
  • concern: If we have two Gemfile entries, one would be lost when being set as a Hash key.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, this is porting of existing behaviour!

It assumes a starting datatype of Array<Dependabot::DependencyFile> converting to Array<Hash>.

Since these files are bound to a single directory ( effectively by the Dependabot::Job, but that's a very implicit ), they should not have name collisions, or at least not name + directory collisions in the case where Dependabot finds something like spec/fixture/Gemfile in the path in addition to Gemfile.

We do not currently de-duplicate the resulting Array of Hashes, nor merge them so we should be 👌🏻 overall were collisions to occur, but that's a good call out as we don't heavily validate our inputs right now.

Comment on lines +38 to +40
# TODO: Promote this value to a constant or similar once we have
# updated core to avoid surprise outcomes if this is unset.
github_redirection_service: "github-redirect.dependabot.com"
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👌 This makes sense to me. We could make it a constant and provide it as the default argument 👍

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm moving the existing comment in this case, I would address this but I've forgotten what this means since I wrote the original comment:

once we have  updated core to avoid surprise outcomes if this is unset

I'm not sure that has happened yet?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've opened #6801 to address this 😄

Comment on lines +12 to +13
module Dependabot
class DependencyChange
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm really digging this extraction! 👏 on noticing the need and finding a sane way to address it 🙇

I can tell you've spent time thinking about what this object should include/exclude. I find myself wondering whether job is part of a dependency change. My thinking is that a job might have a dependency change, while the rest of the parameters are parts of a dependency change. 🤔

I see job is only referenced in #pr_message. Given that, we could let it be an argument to that method. I kinda like that: dependency_change.pr_message(for: job)

I'll trust your judgment here. I think this is a big improvement, and points us in a really positive direction 🥇

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I'm 50/50 on this class knowing about the Job - I think of it as "I know the context in which this change was made" as the base_commit_sha could conceivably be retrieved from it as well and the Updater#existing_pull_request could sink into DependencyChange as well if the Job maintained the list of Pull Requests created so far.

Essentially I think this feels untidy as it is an unfinished piece; I think this class should know about the 'context' in which it is created, the job is part of that context and good enough to use for now.

The thing I'm not sure about is, should the Job be the context or should the Job and some other things be split out into a new class representing just context?

Comment on lines -192 to +199
create_pull_request(updated_deps, updated_files, pr_message(updated_deps, updated_files))
create_pull_request(updated_deps, updated_files)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🙌

Comment on lines +96 to +98
before do
allow(job).to receive(:updating_a_pull_request?).and_return(false)
end
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🤙 Callback to an earlier comment, I think being able to pass in job to #pr_message would make this test really simple.

Copy link
Copy Markdown
Contributor Author

@brrygrdn brrygrdn Mar 7, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, that's a good call out. The API client currently doesn't know about the job, but it does know the job_id and job_token - it might make sense to instantiate it with the Job so it can use what it needs?

This might violate the principal of least knowledge a little if Job becomes the overall "context" class as I mentioned above, maybe this is a good reason to split job and context?

it "updates dependencies correctly" do
expect(api_client).
to receive(:create_pull_request) do |deps, files, commit_sha|
to receive(:create_pull_request) do |dependency_change, commit_sha|
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🤤

@brrygrdn brrygrdn force-pushed the brrygrdn/encapsulate-updater-output branch from dd5cfa9 to 42c6ca6 Compare March 7, 2023 15:41
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants