Dependency parser to find repos #1686

schneems · 2022-10-05T15:06:57Z

Problem

People don’t know which repos to subscribe to. My best suggestion is to look in their dependencies (such as Gemfile.lock, package.json, or pom.xml).

For now I’m asking them to manually look in their dependencies and then search for those repos on CodeTriage. I would love to automate this process.

I’m thinking it looks like this: the user uploads a file, then based on its name we run an extraction script that returns the file's dependencies in a machine-readable format (JSON to STDOUT, and any errors or warnings to STDERR).

$ cat Gemfile.lock | parse_gemfile.rb
{ "repos": [ { "name": "activerecord" } ], "language": "ruby" }

Then we can take that information and check whether a project with that language and name exists in the database.

Ruby tooling - Task zero

We can use the lockfile parser that ships with Ruby to do this for Ruby. See heroku/heroku-buildpack-ruby for some inspiration.

Ruby (Gemfile.lock)

In addition to writing this code, I would also like a test file set up where we can put future parser tests. That will make future PRs easier to test: contributors can copy an existing test instead of learning Ruby/RSpec if they don't know it.

Place the test in test/dependency_parsers/ruby_parser_test.rb

I still want the code to be implemented as a script. Put it at lib/dependency_parser/ruby/parse.rb. The input to the script will be the file contents via STDIN. I want JSON on STDOUT and errors on STDERR. Failures should result in a non-zero exit code.
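A minimal sketch of what such a script could look like, assuming the lockfile parser referred to above is `Bundler::LockfileParser`. The helper names `parse_gemfile_lock` and `run` are hypothetical, and treating a lockfile with no gems as an error is an assumption, not a requirement stated in this issue.

```ruby
require "bundler"
require "json"

# Hypothetical helper: turns Gemfile.lock contents into the JSON shape shown
# in the example above. Raises ArgumentError when the input is unusable.
def parse_gemfile_lock(contents)
  raise ArgumentError, "input is empty" if contents.nil? || contents.strip.empty?

  begin
    lockfile = Bundler::LockfileParser.new(contents)
  rescue Bundler::LockfileError => e
    raise ArgumentError, "could not parse lockfile: #{e.message}"
  end

  # Assumption: a lockfile without any gems is "missing critical information".
  names = lockfile.specs.map(&:name).uniq.sort
  raise ArgumentError, "no gems found in lockfile" if names.empty?

  { "repos" => names.map { |name| { "name" => name } }, "language" => "ruby" }
end

# Hypothetical entry point: writes JSON to `out`, errors to `err`,
# and returns the exit status the script should use.
def run(input, out: $stdout, err: $stderr)
  out.puts JSON.generate(parse_gemfile_lock(input))
  0
rescue ArgumentError => e
  err.puts "Error: #{e.message}"
  1
end
```

To behave as a script, the file could end with `exit run($stdin.read)`, so that `cat Gemfile.lock | ruby lib/dependency_parser/ruby/parse.rb` prints the JSON or fails with a non-zero status.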

Dependency/lockfile parsing for other languages - Task one

Ideally we won’t need to install other tooling. For example, installing Maven would be overkill: it takes a long time to install and uses a lot of disk space, which would slow down deploys.

I'm thinking we need to extract the information in Ruby, Bash, or JavaScript, as these are the three main languages already in the app.

We need a parser for these languages:

Each of these can be in a different PR. Put them in a folder under lib/ named dependency_parser/<language>/parse.<extension>. When sending a PR please give me an example input and expected output.

The script needs to defensively check for invalid input and output a helpful error message if the input:

Is formatted incorrectly
Is missing critical information
Is empty

If a script cannot continue, it needs to exit with a non-zero status code.

If test/dependency_parsers/ruby_parser_test.rb already exists, copy it to `test/dependency_parsers/ruby_parser_.rb` and fill it out to the best of your ability. If it does not, please provide me with an input to your script and the expected output.
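Because every parser shares the same contract (input on STDIN, JSON on STDOUT, errors on STDERR, non-zero exit on failure), a copied test can stay mostly language-agnostic. Here is a rough sketch of a helper such a test might use; `run_parser` is a hypothetical name and is not part of the existing test suite.

```ruby
require "json"
require "open3"

# Hypothetical test helper: runs a parser command with `input` on STDIN and
# returns the parsed STDOUT (nil on failure), the raw STDERR, and whether the
# process exited with status zero.
def run_parser(command, input)
  stdout, stderr, status = Open3.capture3(*command, stdin_data: input)
  json = status.success? && !stdout.strip.empty? ? JSON.parse(stdout) : nil
  { json: json, stderr: stderr, success: status.success? }
end
```

A test could then call something like `run_parser(["ruby", "lib/dependency_parser/ruby/parse.rb"], example_input)` and compare `:json` with the expected output supplied in the PR.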

Upload and Storage - Part 2

It might be that a project isn't yet added to CodeTriage but might be in the future. As a future-proofing measure, we can ask users if they want us to store their dependency file.

We need a webpage in Rails where people can upload a file.

For now we can store the whole raw contents in a new table. Each record should be linked to a specific user and include the original filename, its contents, and a label/name (e.g. people might want "my side project" or "my work project").

Give people a UI so they can manage their stored files (CRUD).

Integration with CodeTriage - Part 3

We need a way to consume the scripts in a way that users can utilize:

After a lockfile is uploaded, pass it to the appropriate script based on a mapping of filename to script.
Execute that script
Then, based on the output, query the database for matching repos
Show them to the user on the results page
Bonus: If there are multiple dependencies, map the name of the dependency label to the suggested repo.
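The lookup and matching steps above might be sketched roughly as follows. Everything here is an assumption for illustration: the `PARSER_SCRIPTS` constant, the helper names, and the in-memory `known_repos` array, which stands in for the real database query (the actual schema may differ).

```ruby
# Hypothetical mapping from uploaded filename to parser script. Only the Ruby
# path is named in this issue; other entries would be added per language.
PARSER_SCRIPTS = {
  "Gemfile.lock" => "lib/dependency_parser/ruby/parse.rb"
}.freeze

# Returns the script path for an uploaded file, or raises if none is registered.
def parser_for(filename)
  PARSER_SCRIPTS.fetch(File.basename(filename)) do
    raise ArgumentError, "no parser registered for #{filename}"
  end
end

# Matches parser output against known repos. `known_repos` is an array of
# hashes with :name and :language keys, standing in for a database query.
def suggest_repos(parser_output, known_repos)
  language = parser_output.fetch("language")
  wanted = parser_output.fetch("repos").map { |repo| repo.fetch("name") }
  known_repos.select do |repo|
    repo[:language] == language && wanted.include?(repo[:name])
  end
end
```

Keeping the mapping in one place means a new language only needs a new script and one new entry, which fits the one-PR-per-language plan.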


schneems added the hacktoberfest label Oct 5, 2022

khasinski mentioned this issue Oct 5, 2022

Add Gemfile.lock parsing for dependency parser #1689

Merged

6 tasks

schneems mentioned this issue Oct 6, 2022

Update after signup wizard to encourage subscribing to repos #1700

Open

3 tasks
