x/pkgsite: API for pkg.go.dev #36785

Open
rhcarvalho opened this issue Jan 26, 2020 · 58 comments
Labels
FeatureRequest, NeedsInvestigation (Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.), pkgsite

Comments

@rhcarvalho
Contributor

Prior to pkg.go.dev, godoc.org has had a JSON API that can be used to, among other things, discover importers of a given package.

Example: https://api.godoc.org/importers/golang.org/x/net/html

Given that pkg.go.dev does a much better job at tracking importers thanks to Go Modules and the Module Proxy, it would be nice if the community could get access to a public API similar to that of godoc.org.

@julieqiu julieqiu added NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. FeatureRequest labels Jan 28, 2020
@julieqiu
Member

@rhcarvalho - it would be helpful to get a sense of what your current use cases are for api.godoc.org, and what your feature requests are for an API for pkg.go.dev.

It sounds like getting the importers for a package is one of them. With pkg.go.dev being module aware, what specific information about importers would be useful to surface via an API?

For example:

  • importers for a specific version of a package
  • only importers of the latest version of a package
  • any importer for all versions of a package
  • something else?

Additionally, what other information would be useful to you to surface via an API?

/cc @tbpg who has also mentioned wanting an API for pkg.go.dev

@julieqiu julieqiu added WaitingForInfo Issue is not actionable because of missing required information, which needs to be provided. NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. and removed NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. labels Jan 29, 2020
@rhcarvalho
Contributor Author

Without much thought, having an API to answer the more specific question "what are the importers of a specific version of a package" would make it plausible to derive the answers to the other items in your list.

At the moment I consume the godoc.org API and scrape data from pkg.go.dev to answer the question "who uses my package". As far as I can tell the data in the "Imported By" go.dev tab is unrelated to the version of the package I'm currently browsing.

Here are the other endpoints in the GoDoc API: https://github.com/golang/gddo/blob/7365cb292b8bfd13dfe514a18b207b9cc70e6ceb/gddo-server/main.go#L901-L904

  • /search: takes a q parameter. Can be useful to explore known packages/modules.
  • /packages: attempts to return all known packages, no pagination. Doesn't seem too usable as is.
  • /importers/: useful, hard to compute without central knowledge.
  • /imports/: dispensable, easy to compute offline with go list.
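
To illustrate why /imports/ is easy to reproduce offline, here is a minimal sketch that lists the imports of a single Go source file using only the standard library. It is not what go list does internally (go list works on whole packages and applies build constraints); the helper name importsOf and the demo source are assumptions made for illustration.

```go
package main

import (
	"fmt"
	"go/parser"
	"go/token"
	"strconv"
)

// importsOf returns the import paths declared in one Go source file.
// (Hypothetical helper.) Only the import declarations are parsed, so the
// rest of the file does not need to be valid for this to succeed.
func importsOf(filename, src string) ([]string, error) {
	fset := token.NewFileSet()
	f, err := parser.ParseFile(fset, filename, src, parser.ImportsOnly)
	if err != nil {
		return nil, err
	}
	paths := make([]string, 0, len(f.Imports))
	for _, spec := range f.Imports {
		// Path.Value is a quoted string literal such as "\"fmt\"".
		p, err := strconv.Unquote(spec.Path.Value)
		if err != nil {
			return nil, err
		}
		paths = append(paths, p)
	}
	return paths, nil
}

func main() {
	src := "package demo\n\nimport (\n\t\"fmt\"\n\t\"golang.org/x/net/html\"\n)\n"
	paths, err := importsOf("demo.go", src)
	if err != nil {
		panic(err)
	}
	for _, p := range paths {
		fmt.Println(p)
	}
}
```

The reverse question, /importers/, cannot be answered this way because it needs an index over every known module, which is the point made in the list above.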

So if we need a more specific request, here it is:

api.go.dev MVP

  • /importers/{module-or-package}/@v/<version>: returns the list of importers of a given module/package at a given version. URL scheme could be tuned to match the Go Proxy specification (go help goproxy).
  • /search/: matches the functionality of https://pkg.go.dev/search?q=hello, returns JSON instead.

@gopherbot gopherbot added this to the Unreleased milestone Feb 6, 2020
@julieqiu julieqiu removed the WaitingForInfo Issue is not actionable because of missing required information, which needs to be provided. label Feb 29, 2020
@tbpg
Contributor

tbpg commented Mar 18, 2020

Another endpoint idea (standalone or as part of a set of info returned for a module): available versions of a given module.

@myitcv
Member

myitcv commented Mar 18, 2020

@tbpg

Another endpoint idea (standalone or as part of a set of info returned for a module): available versions of a given module.

Shouldn't people use cmd/go for that? Not least because of the fall-through semantics of the go env GOPROXY variable. Noting #37367

@julieqiu
Member

What information beyond https://proxy.golang.org/<module>/@v/list would be provided by that endpoint?

@tbpg
Contributor

tbpg commented Mar 18, 2020

Shouldn't people use cmd/go for that? Not least because of the fall-through semantics of the go env GOPROXY variable. Noting #37367

There are some cases where cmd/go might not be available and a normal web API would be helpful. I think it's reasonable to assume pkg.go.dev will only return the modules/versions it knows about.

What information beyond https://proxy.golang.org/<module>/@v/list would be provided by that endpoint?

None. That works. :)

@myitcv
Member

myitcv commented Mar 18, 2020

There are some cases where cmd/go might not be available and a normal web API would be helpful

I'd be wary of encouraging tool authors to query pkg.go.dev/a proxy directly. Because it could well introduce skew compared to the answers from cmd/go.

@adamdecaf
Contributor

adamdecaf commented Mar 18, 2020

In my use-case I don't want to execute cmd/go binaries as I just want to grab some metadata (i.e. detected license of all dependencies) from a dozen modules and organize it on a webpage. If I need local clones of those modules it really complicates this internal tool.

@myitcv
Member

myitcv commented Apr 9, 2020

In #37952 I raised the question of whether module/package license file information could be surfaced in the output of cmd/go list. Following a conversation on yesterday's golang-tools call, we concluded that doing so would be a bad idea; reasons summarised in #37952 (comment).

@ianthehat instead suggested exposing the information via a pkg.go.dev API, leveraging the fact that the content and presentation on pkg.go.dev has already jumped through the relevant legal hoops.

This comment is therefore to explicitly request that we include license file information in the API. Thank you!

@tiegz

tiegz commented Jun 5, 2020

👋 Thanks for all the new Go tools like pkg.go.dev, they're super useful. Some input on this topic from the perspective of Libraries.io:

  • +1 to a /search/ endpoint, or maybe just adding pagination to this url so it's possible to get a list of all packages: https://index.golang.org/index?since=TIME&limit=LIMIT. Many popular package repositories offer a full listing like this, e.g. https://packagist.org/packages/list.json for Packagist

  • What information beyond https://proxy.golang.org/<module>/@v/list would be provided by that endpoint?

Currently to get the publish times for all versions, you have to make N+1 requests: one for @v/list and then @v/xxx.info for each version and grab the "Time" value for each one. Assuming it's difficult to change the Go proxy spec, an API endpoint that returns all versions with metadata would be really nice.

  • As far as I can tell the data in the "Imported By" go.dev tab is unrelated to the version of the package I'm currently browsing.

Anyone know if there are plans to fix this?
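
For reference, the N+1 pattern for publish times described above can be sketched roughly as follows. This is a minimal sketch, not a recommended client: the function names (publishTimes, parseVersionList, parseInfo, get) are hypothetical, and it assumes an all-lowercase module path, since paths containing uppercase letters must be case-escaped in proxy URLs (see go help goproxy).

```go
package main

import (
	"encoding/json"
	"fmt"
	"io"
	"net/http"
	"strings"
	"time"
)

// versionInfo holds the two fields of the proxy's @v/<version>.info JSON.
type versionInfo struct {
	Version string
	Time    time.Time
}

// parseVersionList splits the newline-separated body of @v/list.
func parseVersionList(body string) []string {
	return strings.Fields(body)
}

// parseInfo decodes the JSON body of @v/<version>.info.
func parseInfo(body []byte) (versionInfo, error) {
	var info versionInfo
	err := json.Unmarshal(body, &info)
	return info, err
}

// get fetches a URL and returns its body, failing on non-200 responses.
func get(url string) ([]byte, error) {
	resp, err := http.Get(url)
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return nil, fmt.Errorf("GET %s: %s", url, resp.Status)
	}
	return io.ReadAll(resp.Body)
}

// publishTimes makes one request for the version list and then one
// request per version: the N+1 requests mentioned in the comment.
func publishTimes(base, module string) ([]versionInfo, error) {
	list, err := get(base + "/" + module + "/@v/list")
	if err != nil {
		return nil, err
	}
	var out []versionInfo
	for _, v := range parseVersionList(string(list)) {
		body, err := get(base + "/" + module + "/@v/" + v + ".info")
		if err != nil {
			return nil, err
		}
		info, err := parseInfo(body)
		if err != nil {
			return nil, err
		}
		out = append(out, info)
	}
	return out, nil
}

func main() {
	// Assumption: lowercase module path, so no case-escaping is applied.
	infos, err := publishTimes("https://proxy.golang.org", "golang.org/x/net")
	if err != nil {
		panic(err)
	}
	for _, info := range infos {
		fmt.Println(info.Version, info.Time.Format(time.RFC3339))
	}
}
```

An endpoint that returned all versions with their metadata in one response would replace the loop in publishTimes with a single request.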

@jba
Contributor

jba commented Jun 8, 2020

  • As far as I can tell the data in the "Imported By" go.dev tab is unrelated to the version of the package I'm currently browsing.

Anyone know if there are plans to fix this?

No immediate plans. We currently gather that information from import statements in the code, so there is no version information attached. The go.mod file doesn't have all the version information we need.

So we understand it's an approximation and we want to fix it, but it's going to take some time.

@julieqiu julieqiu changed the title from "go.dev: API for pkg.go.dev" to "x/pkgsite: API for pkg.go.dev" Jun 15, 2020
@pombredanne

@julieqiu my use for an API would be to:

  1. search and list packages
  2. for a package, access all the information as available for a given package in each of the tabs, such as here https://pkg.go.dev/github.com/gin-gonic/gin?tab=licenses ... basically everything that is available as HTML in the frontend https://github.com/golang/pkgsite/tree/master/internal/frontend should be available in a JSON api

@pombredanne

@julieqiu oh and please do not retire http://api.godoc.org/packages unless there is an alternative!

@eclipseo

eclipseo commented Aug 29, 2020

it would be helpful to get a sense of what your current use cases are for api.godoc.org, and what your feature requests are for an API for pkg.go.dev.

As a downstream package maintainer for Fedora, I'm also interested in an API. We have our own tool, Anitya, to track package releases, but it was not designed to track Git commits, and many Go packages still don't publish versions. So any information about the latest published commit, such as the date of the commit and new dependencies, would be very helpful. I would gather data from the API in Python and compare it to the latest version we have for a given package.

I'm also interested in getting the License info, so we could find recursively all the licenses used in a static binary.

@pombredanne

and BTW my general use case is for https://github.com/nexB/scancode-toolkit and related projects to provide my users with license, origin and dependencies details. And for https://github.com/nexb/vulnerablecode where we provide vulnerabilities details.

@ZiViZiViZ

This comment was marked as off-topic.

@srenatus
Contributor

@ZiViZiViZ

This comment was marked as off-topic.

@tills13

This comment was marked as off-topic.

@ZiViZiViZ

This comment was marked as off-topic.

@Alphare

This comment was marked as off-topic.

@seankhliao
Member

please keep this issue on topic and refer questions to
https://github.com/golang/go/wiki/Questions

@shellscape

shellscape commented May 7, 2022

Rather than hiding what I'd consider a lot of really useful comments by folks (because I landed here for many of the same reasons they did, and there is extremely little available with regard to go package APIs) it would be infinitely more useful to explain to folks how to use the wiki for questions - or specify that tangential and related questions here are not welcome, and folks should seek the answers on the wiki. As it is now, that curt reply seems to imply that we should post questions on the wiki itself, which I doubt is what you're actually after.

The go team has been kicking this can for years, so users are naturally going to have a lot of questions. This reply #36785 (comment) specifically asks users what features they're after in an API - thus the issue is fair game for feature-related questions. Go packages are the very last major ecosystem package registry to lack a comprehensive API and that which doesn't make its data readily available. Because of this, people are naturally going to have a myriad of questions about trying to access that data. I'd humbly ask @seankhliao to be a bit more gracious in your moderation of this issue.

@golightlyb
Contributor

  • Use case: generating static documentation, or running a local documentation server, but wanting to be able to query (with javascript, at the time of the page view by a web browser) the "[Imported by: NNN]" number that appears if instead viewed on pkg.go.dev.

  • currently api.godoc.org/importers is returning {"error":{"message":"Internal Server Error"}}

@rsc
Contributor

rsc commented Aug 8, 2022

For what it's worth, we have been running api.godoc.org as an unmonitored service (meaning we don't page people for it), approximately "best effort" although even that may be too generous. We discovered today that it has been down for 30+ days because the disk filled. Given that being down for a month was a non-issue, we've decided to leave it down.

The godoc.org redirects will continue to run (approximately forever), but api.godoc.org will no longer be accessible. Or rather it will continue to be inaccessible.

@golightlyb
Contributor

golightlyb commented Aug 8, 2022

@rsc fair enough if this is the decision but the previous comment is mine, right at the start of that 30+ day period, saying it's down, and three people agreeing.

If this is the decision, again, fine - but I submit that it shouldn't be based on people not complaining because to be fair I raised this pretty much as soon as it happened and it was never fixed and didn't have a response (fair enough- priorities) so anyone saying the same thing wouldn't be adding anything, and nobody knew they had a 30ish day deadline to object

(I don't intend for this to sound harsh, just as a disagreement! 🙂)

Edit: as an aside that might be helpful for people with an interest in this, Google's experimental Open Source Insights has an API. I wasn't aware of it until recently but it could very well resolve this entire conversation.

@shellscape

This comment was marked as off-topic.

@shellscape

This comment was marked as off-topic.

@gwatts

gwatts commented Aug 9, 2022

I use the Dash doc viewer on my Mac, which I find super useful for building a local index of all sorts of docs, Go included. I just found I couldn't install a new Go docset into it, presumably because it can no longer build an index from api.godoc.org.

It would be nice to point the author towards an alternate index if this one isn't returning, as the usefulness of Dash is greatly diminished for me without rapid access to the docs of the Go packages I use frequently.

@AlekSi
Contributor

AlekSi commented Aug 15, 2022

Dash 6.3.1 now relies on web scraping. It works, but an API would be better.

@hyangah
Contributor

hyangah commented Apr 12, 2023

FYI deps.dev launched a new API service.

https://docs.deps.dev/api/v3alpha/

Its GetPackage, GetVersion, GetDependencies may be useful for users who need the list of versions, license info, and dependency info. I think it also provides vulnerability info as well. (cc @adg)

This does not cover the /search, /packages, /importers endpoints of the old gddo though.
(OTOH, it's unclear to me what people want to see from /packages endpoint given the current volume of data)

@avamsi

avamsi commented Sep 27, 2023

I was looking for an API to search for Go "commands" -- https://deps.dev/_/search/suggest?system=go&kind=package&q=axl (and then https://api.deps.dev/v3alpha/systems/go/packages/github.com%2Favamsi%2Faxl) kinda works (I don't mind that deps.dev/_/search is hacky and potentially unstable), but pkg.go.dev also annotates the commands as such (see https://pkg.go.dev/search?q=axl, for example), so it's easy to tell them apart from vanilla libraries.
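
As a small illustration of the api.deps.dev URL shape quoted above, the package name goes into a single path segment, so its slashes have to be percent-encoded. The sketch below only builds the URL; the helper name depsDevPackageURL is hypothetical, and the response format is deliberately not modeled here.

```go
package main

import (
	"fmt"
	"net/url"
)

// depsDevPackageURL builds a v3alpha package URL in the shape quoted in
// the comment above. (Hypothetical helper.) url.PathEscape encodes "/"
// as %2F so the package name stays within one path segment.
func depsDevPackageURL(system, name string) string {
	return "https://api.deps.dev/v3alpha/systems/" + url.PathEscape(system) +
		"/packages/" + url.PathEscape(name)
}

func main() {
	fmt.Println(depsDevPackageURL("go", "github.com/avamsi/axl"))
	// prints https://api.deps.dev/v3alpha/systems/go/packages/github.com%2Favamsi%2Faxl
}
```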

@sfblackl-intel

sfblackl-intel commented Sep 27, 2023

https://docs.deps.dev/api/v3alpha/

We are using that...but realized that for Go packages in particular, there is no publish date...which led me to this thread. My main real ask is that publish date be included for Go packages...otherwise my solution appears to be to call something like this for each package/version (https://proxy.golang.org/<module>/@v/v1.9.0.info).

dvob added a commit to dvob/go-project-usage that referenced this issue May 4, 2024
Rollback to previous scrape strategy as api.godoc.org is no longer
available. See:
golang/go#36785 (comment)
@dvob

dvob commented May 4, 2024

I also used api.godoc.org for my project https://github.com/dvob/go-project-usage/
To obtain the importers I now scrape https://pkg.go.dev/%s?tab=importedby (see https://github.com/dvob/go-project-usage/blob/72c5625c361b7e81fa2d1a35591c30830dc1164f/main.go#L134).
Unfortunately this is prone to break (e.g. if a CSS class name changes), and for packages with many importers it only returns 20k packages.

@xnuinside

xnuinside commented Jul 24, 2024

I want to wake up this thread. https://docs.deps.dev/, which was referenced earlier, does not contain any information about standard library packages (https://pkg.go.dev/std). To answer why and who would use an API: all researchers and data miners who need to analyze open source packages (a huge set of tools for code analysis, code efficiency, and vulnerabilities), and that is only from my work domain; I believe there are more purposes. We need package information when we resolve vulnerabilities, when we investigate how languages grow and what their dynamics are, and to track new releases, fixes, etc. Almost all popular language ecosystems (maven, pypi, crates, conan, conda, etc.) and operating systems like Linux provide an API to get this information. It could be a DB dump, a REST API, a GitHub repo, or even huge CSV files; any format would be better than scraping a website.

UPD: also, there is no information at all about packages (subpackages?) inside a repo. For example, for https://pkg.go.dev/vuln/GO-2021-0073 the affected package is github.com/git-lfs/git-lfs/lfsapi, but deps.dev only has info about 'root' packages like github.com/git-lfs/git-lfs, and you cannot find any information about the packages inside.

@mark-pictor-csec

Another use for the API would be to check if any versions of a module are retracted. Right now the only way I see to get that info is pretty terrible: I must GET pkg.go.dev/path/to/pkg/?tab=versions and search the HTML for the word retracted. If it's there, then I get to parse the HTML and figure out which version(s) are so marked.

People at my company are very thankful that go vulnerabilities usually include affected symbols (example), as it allows us to report to customers whether they're actually using a vulnerable function with far better accuracy.

So while vuln reports are much better in go, for retractions we must parse a web page and hope the html doesn't change. This feels like a real step back.

I hope an API can be prioritized.
