Skip to content

Fix fetching bug for requirements.in files#5580

Merged
jeffwidman merged 2 commits intodependabot:mainfrom
stulle123:main
Sep 13, 2022
Merged

Fix fetching bug for requirements.in files#5580
jeffwidman merged 2 commits intodependabot:mainfrom
stulle123:main

Conversation

@stulle123
Copy link
Copy Markdown
Contributor

Hi there,

There seems to be a bug in the file_fetcher.rb class when trying to fetch requirements.in files:

If there's also a requirements.txt file in the repository, the download works.

If not, Dependabot doesn't continue to look for requirements.in files and raises a DependencyFileNotFound exception here.

When I add requirements_in_files.any? to the list of checks the download works just fine. Maybe, there's a better fix for this, didn't investigate the issue any further.

Thanks.

@stulle123 stulle123 requested a review from a team as a code owner August 24, 2022 12:42
@stulle123
Copy link
Copy Markdown
Contributor Author

Regarding the 2nd commit, would it be possible to increase the file size of the *.in and *.txt files to 300 kilobytes?

Especially in large projects, compiled requirement.txt files can get pretty big. Thanks!

@jeffwidman
Copy link
Copy Markdown
Member

jeffwidman commented Aug 24, 2022

👋 Hey @stulle123 thanks for the PR!

A few quick questions:

  1. Can you explain the scenario where you've committed a requirements.in file but not a requirements.txt file? I personally always considered it a best practice to commit both... that's kinda the point of generating a fully pinned requirements.txt file. But if there's a valid scenario to commit only the *.in file, then your change seems reasonable to me.
  2. The file size limit is because for python we pull in any .txt file, and and only then we try to figure out if it’s a requirements file, because they can have any random name. So we wanted to minimize pulling in large docs files etc...
    Are you actually hitting the 200KB limit in your jobs? I'd be curious to see the repo if it's public--do you have a Job ID?
    If you are hitting the limit, then yes, we should raise it, but it'd be best to do that in a separate PR as the two issues here are orthogonal.
  3. Both PR's need tests. If you need help figuring out how to write them give a shout and I'll try to help.

BTW, I noticed this is your first PR on GitHub, so congrats! 🥳 I still remember the first time I opened a PR for an open-source project, a little scary, but I was thrilled when it was merged.

@dws
Copy link
Copy Markdown

dws commented Aug 24, 2022

It would be even better if the file size heuristic could be configurable, so that it would be possible to override the default size cap when it's not appropriate.

@deivid-rodriguez
Copy link
Copy Markdown
Contributor

I was thinking that if we're going to make this configurable, it makes more sense to make the name of the manifest configurable, so that we don't have to make any guesses, fetch files that we don't need at all, or put any size limits on what we fetch.

We'd probably need to keep the current logic for backwards compatibility, though.

@stulle123
Copy link
Copy Markdown
Contributor Author

Hey @jeffwidman thanks for the quick reply and warm welcome!

Regarding your questions:

  1. I have both requirements.in and requirements.txt files in the repo. Just in the case of an error (e.g., file size of requirements.txt is too big ;-)) it makes sense to "fall-back" to the requirements.in file. IMO, that additional check doesn't hurt much.
  2. Yeah, I'm hitting the limit of the 200KB (that was the reason why I came across the issue above in the first place). Having this configurable as @dws mentioned would be ideal. For now, I can split up the PR, no problem.
  3. Regarding unit tests, can you point to some examples / documentation?

Cheers

@stulle123
Copy link
Copy Markdown
Contributor Author

Hey @jeffwidman!

I've just split up this PR into two.

That's the one changing the file size of requirements.txt files to 500 kilobytes --> #5596.

Can you please help me on how to add unit tests for the two changes?

Thanks!

@jeffwidman
Copy link
Copy Markdown
Member

Hey @stulle123 thanks for splitting them out here.

For this one, the test should be relatively straightforward. Look in this file: https://github.com/dependabot/dependabot-core/blob/main/python/spec/dependabot/python/file_fetcher_spec.rb

I'd grep for "with only" and you'll see how the tests were written for other file types. Just need to add one saying "with only a requirements.in".

You can run the tests natively using these instructions, but the easier way to avoid having to muck around with installing the proper ruby libraries locally is to use the docker image. Instructions for booting the image are here, and a few paragraphs in there's a section on running tests within the docker image.

@stulle123
Copy link
Copy Markdown
Contributor Author

Hey @jeffwidman just added the unit test! Cheers!

Copy link
Copy Markdown
Member

@jeffwidman jeffwidman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great, thank you!

@jeffwidman
Copy link
Copy Markdown
Member

@stulle123 sorry to bother you, but looks like this code is failing a lint error about line too long, probably because we tidied up our Rubocop settings in #5588.

Do you mind fixing? Should be straightforward.

Also, while you're in there, can you squash to a single commit? that way the test + the change are located in the same commit for anyone in the future doing debugging via git blame.

@stulle123 stulle123 closed this Sep 13, 2022
@stulle123 stulle123 reopened this Sep 13, 2022
@stulle123
Copy link
Copy Markdown
Contributor Author

Hey @jeffwidman Rubocop checks are green now. As you can see I had some troubles with the commit squashing, but I somehow managed.

Copy link
Copy Markdown
Member

@jeffwidman jeffwidman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

I tweaked it slightly so that it doesn't have two return values,
as that can start to imply there's two groups of things, rather
than that they're all on an equal footing.

stulle123 and others added 2 commits September 13, 2022 15:55
Use a single `return` to make the intent clear that
all these files should be treated the same, they don't
comprise two groups.
@jeffwidman jeffwidman merged commit f46e6af into dependabot:main Sep 13, 2022
jeffwidman added a commit that referenced this pull request Sep 24, 2022
We run CI via actions on `main` on a daily cron to catch any unexpected regressions, or if CodeQL adds a new check that suddenly flags something.

However, this can result in some unexpected notification noise:
1. I merged a PR from an external contributor: #5580
2. That PR was opened from the fork's `main` branch
3. Before merging, I clicked the "Update by rebase". So a commit from me landed on the fork's `main` branch.
4. I'm now receiving daily emails about failing CI runs from the fork. These are pure noise, since I don't care about the state of an external contributor's forked repo.

This hasn't happened for any other PR's that I've merged from external contributors, so I suspect it's related to the rebase so I'm now seen as a committer to the fork's `main` branch. Not really sure though.

I double-checked and I am _not_ subscribed to notifications on the repo. I am aware that after 30 days this should age out because actions are disabled for forks after 30 days assuming the external contributor doesn't update his fork/click re-enable actions, but it's still annoying.

I checked internally with the Actions team, and they said this is known behavior, and it's complicated.

So for now, let's default to disabling running cron for forks.
@pavera pavera mentioned this pull request Oct 31, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

Archived in project

Development

Successfully merging this pull request may close these issues.

4 participants