SPEC 11: API telemetry #1
I'll drop a link here to popylar, which I think only gives stats on overall module usage (i.e., number of times imported), not granular detail about which classes or functions are called. But @arokem is going to be at the summit and might have some thoughts on more granular metrics, so it's probably worth cornering him for a chat about it!
I'll link to scientific-python/summit-2023#17; we might want to pull out notes from SciPy from about 10 years ago. See also https://github.com/Carreau/consent_broker, which I started working on some time ago, and a related discussion in pyOpenSci/software-peer-review#183.
Different but related: https://github.com/betatim/kamal, a tool you run locally over a code base to get statistics on which parts of a particular library's API are being used. The idea is that people can run this themselves and report the stats they get to somewhere (central?). The collected stats are easy to inspect, the goal being that you can fairly easily convince yourself that no unwanted information is being shared. The original use case I had in mind was organisations that have private code bases but want to help a project learn which parts of its API are being used. Another idea could be running this in the CI of your project and reporting back to some central place (e.g. scikit-learn and pandas run this in their CI and report back to matplotlib or numpy which parts of their API they use).
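To make the "run locally, inspect the stats yourself" idea concrete, here is a minimal sketch of how such a static scan could work using Python's `ast` module. This is not kamal's actual implementation; the function name and the alias-only heuristic are illustrative assumptions (it only tracks `import numpy as np`-style aliases, not `from numpy import x`):

```python
import ast
from collections import Counter

def count_api_usage(source: str, library: str) -> Counter:
    """Count attribute accesses on a library's import aliases in one file.

    A rough static approximation of API usage: finds aliases bound by
    ``import <library> as <name>`` and counts ``<name>.<attr>`` accesses.
    """
    tree = ast.parse(source)
    # Collect the names this file binds to the target library.
    aliases = set()
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            for alias in node.names:
                if alias.name == library:
                    aliases.add(alias.asname or alias.name)
    # Count attribute accesses through those aliases.
    counts = Counter()
    for node in ast.walk(tree):
        if (
            isinstance(node, ast.Attribute)
            and isinstance(node.value, ast.Name)
            and node.value.id in aliases
        ):
            counts[f"{library}.{node.attr}"] += 1
    return counts

code = """
import numpy as np
x = np.arange(10)
y = np.arange(5)
z = np.mean(x)
"""
print(count_api_usage(code, "numpy"))
# Counter({'numpy.arange': 2, 'numpy.mean': 1})
```

Because the output is just a name-to-count mapping, a user can read the report before deciding to share it, which is the transparency property described above.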
Summarizing today's discussions during the summit and dinner with @drammock @seberg @betatim @Carreau @stefanv et al:
Great summary @guenp. Also briefly discussed was how to incentivize opt-in; some ideas were:
The idea of combining the tool with linting was initially brought up by @rossbar. :D
Summary from chat with @crazy4pi314
Chat with @eriknw and @betatim:
Brainstorm and discuss a SPEC to establish how to add instrumentation and telemetry to scientific Python projects to gain insights into usage patterns. Currently, Python projects typically have no direct insight into how users interact with their library, what common errors they run into, or which APIs are used most (or least) frequently. The goal of this SPEC is to design a way to collect usage logs from users in a transparent, ethical, and efficient way, with minimal impact on user experience, in order to provide project maintainers with useful metrics and insights into the performance, usability, and common user errors of components (modules, functions, etc.) in their library.