Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Treat Redirected Go Packages as Removed #3358

Merged
merged 2 commits into from
Apr 11, 2024

Conversation

mikeyoung85
Copy link
Contributor

@mikeyoung85 mikeyoung85 commented Apr 10, 2024

This PR updates the check_status logic for Go packages to treat a redirect from pkg.go.dev as a "Removed" package. The reasons for this change are to help capture when packages are renamed or recased and mark the now incorrect one appropriately. This is just an idea, so feel free to disagree with this approach.

One example is:
https://libraries.io/go/github.com%2FAzure%2Fazure-pipeline-go vs https://libraries.io/go/github.com%2Fazure%2FAzure-pipeline-go which both reference github.com/Azure/azure-pipeline-go. This PR would mark github.com/azure/Azure-pipeline-go as removed.

> url = "https://pkg.go.dev/github.com/azure/Azure-pipeline-go"
> response = Typhoeus.get(url)
> response.response_code
=> 302
> response.body
=> "<a href=\"/github.com/Azure/azure-pipeline-go\">Found</a>.\n\n"

@@ -645,7 +645,7 @@ def check_status
response = Typhoeus.get(url)
if platform.downcase == "packagist" && [302, 404].include?(response.response_code)
update_attribute(:status, "Removed")
elsif platform.downcase == "go" && [400, 404].include?(response.response_code)
elsif platform.downcase == "go" && [302, 400, 404].include?(response.response_code)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Based off the logic in the description, should all packages that redirect be marked as "Removed"?

Also, maybe not for this PR, but thoughts about moving the platform specific statusing logic into their own classes vs a giant if/elsif/else statement?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Based off the logic in the description, should all packages that redirect be marked as "Removed"?

pkg.go.dev is a little opaque, but based on my observations: if people move a Go Module, the package still shows up on "pkg.go.dev", so they file a PR with golang to remove that package. This works, but the golang team seems to recommend putting a "retract" statement for all the old versions in the new location's go.mod, which will then remove the old location from "pkg.go.dev", e.g. golang/go#62726 (comment)

based on that, it seems safe to mark these as "Removed", until we have some sort of redirection state in Libraries to note the new locations.

Also, maybe not for this PR, but thoughts about moving the platform specific statusing logic into their own classes vs a giant if/elsif/else statement?

we could replace check_status_url with a check_status in each package manager adapter class, and implement it there?

@@ -213,27 +213,45 @@
context "recently created" do
let!(:project) { create(:project, platform: "Go", name: "github.com/some-nonexistent-fake/pkg", status: nil, created_at: 1.hour.ago) }

it "should not mark it as Removed" do
it "should not mark it as Removed for a 404" do
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🎱 Thoughts about cleaning up the specs to standardize the mocking / separate out the setup from the test expectations?

Copy link
Contributor

@tiegz tiegz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this change seems fine, although I don't think it'll cover 100% of badly cased packages. For instance here's a valid Go Module (has a go.mod) where I've entered the wrong casing on pkg.go.dev, but pkg.go.dev still serves up the package without a redirect:

https://pkg.go.dev/github.com/go-sql-driver/mysQL

(but maybe we've fixed the Go ingestor far enough that it'll use the proper name github.com/go-sql-driver/mysql instead when ingesting?)

@mikeyoung85 mikeyoung85 merged commit d1800b0 into main Apr 11, 2024
3 checks passed
@mikeyoung85 mikeyoung85 deleted the my/set-removed-for-302-go-packages branch April 11, 2024 15:14
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants