-
Notifications
You must be signed in to change notification settings - Fork 38
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add nap for telemetry #226
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Czaki this is looking great, thank you for all your work on this 🎉 ❤️ ! I've done a pass of suggestions for wording etc. Once those updates are made and I've made sure the formatting is correct, I'll be happy to approve the draft.
In terms of comments on the proposal itself these are my general thoughts:
- I would like to see us include in the NAP a process for getting community feedback and concerns before implementation. While the NAP is usually the way to publicize big changes like this, I think they're mostly watched/discussed by core devs. Since telemetry can be a controversial topic, I would like us to include a process here that ensures the community is aware of and on board with the proposal
- I think before accepting the NAP, I would like to see a complete list (as complete as possible) of exactly what is collected with each level of telemetry, and potentially also a sample report, and what would show up on the public dashboards
- We mention collecting information on what plugin contributions are used, but I think it would be great to collect information on what internal commands are used as well. It would give us great insight into popular features, potentially underused/underpublicized features and "danger zones" for API deprecations. Not sure to what extent this is possible, but once the app-model conversions are done there should be a few things we can learn?
- I would like us to expand a little on the motivation for telemetry beyond "gauging user numbers" which imo is the weakest motivation. For me it's really more about learning what plugins and parts of the codebase are seeing heavy usage, predicting the impact of changing APIs and gaining insight into what could be deprecated, changed, better documented, etc.
Co-authored-by: Draga Doncila Pop <[email protected]>
Co-authored-by: Draga Doncila Pop <[email protected]>
This is mentioned in the motivation and scope paragraph. Should I rephase it?
great idea. I will add this in the following days. I also agree with polishing the list of collected data. Opening this PR is way to open it on the community. I think, that after some polishing we should make a public announcement. |
docs/naps/8-telemetry.md
Outdated
1. Disable in settings | ||
2. uninstall `napari-telemetry` package | ||
3. Environment variable `NAPARI_TELEMETRY=0` | ||
4. Full list of endpoints used for collecting telemetry, that could be filtered on the firewall level. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
4. Full list of endpoints used for collecting telemetry, that could be filtered on the firewall level. | |
5 . Disable at system-wide level in a way that is not user-controlable (Hpc/ other deployments). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You mean to add new point?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Possibly we should merge the two sentences:
- System-wide disablement e.g. via firewall filtering for hpc or other environments
docs/naps/8-telemetry.md
Outdated
|
||
Also collecting information about data types and their size will provide valuable information about the typical use cases of napari. | ||
|
||
Still, users need to be able to opt out of such monitoring, and adjust the level of detail of the information that is sent to the napari server. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Detail, I would store the date when someone agreed, and maybe reconfirm every few month or so.
You might need a way for the remote to reply with a specific error code/version number that void the user consent, and retrigger asking consent as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do you prefer longer storage?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the proposal is that we reconfirm opt-in every few months or so. I think this is a good idea and worth adding.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I still do not follow.
Did we want to reconfirm every three months (for example), or when we update the telemetry code, so change the amount of collected information?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it should be min(3_months, package_update)
for re-confirming, personally.
The dialog for whether you want to continue sending telemetry could show an example of a recent report that had been sent, so the user can gauge their participation based on real data. We could always add a Don't ask me again
option as well of course.
|
||
Another option is to scan public plugins and their dependencies. This is simpler but will require establishing additional communication channels to be able to warn users about the potential problem. | ||
|
||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You will need to describe the way people can contact napari for their data to be removed.
You have to be careful, login IP in the machine that collect logs may be considered collecting PII in some regions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My Idea is to at the first phase, not to log IP. I do not see the benefit of such information and only risk.
I ha e added GDPR section
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry I was unclear, if for example you run the collection service behind nginx, the nginx logs may store IP. So you have to be careful that not only the service you write but part of the stack does not store data and/or that you can delete it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some storage is required (Identifying spamming users), but it may be short with fast removal.
And such things will not be propagated to the aggregator.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Czaki I have done a final pass and once that's done I think this should go in in draft mode. Once we have some sample reports to look at, we should be able to add any other necessary detail. It would be nice to have a basic architecture diagram as well.
docs/naps/8-telemetry.md
Outdated
1. Disable in settings | ||
2. uninstall `napari-telemetry` package | ||
3. Environment variable `NAPARI_TELEMETRY=0` | ||
4. Full list of endpoints used for collecting telemetry, that could be filtered on the firewall level. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Possibly we should merge the two sentences:
- System-wide disablement e.g. via firewall filtering for hpc or other environments
docs/naps/8-telemetry.md
Outdated
|
||
Also collecting information about data types and their size will provide valuable information about the typical use cases of napari. | ||
|
||
Still, users need to be able to opt out of such monitoring, and adjust the level of detail of the information that is sent to the napari server. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the proposal is that we reconfirm opt-in every few months or so. I think this is a good idea and worth adding.
Co-authored-by: Draga Doncila Pop <[email protected]>
dependency check is failing because of bug in main repo gregsdennis/dependencies-action#24 |
docs/naps/8-telemetry.md
Outdated
|
||
```{eval-rst} | ||
:Author: Grzegorz Bokota | ||
:Created: <date created on, in yyyy-mm-dd format> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Czaki you should update this, I think it's fine to date it as the day you opened the PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Czaki I think this is fine to go in whenever in draft mode. Thanks for all your work! Any further updates can be made in future PRs as is standard for NAPs
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed, let's get this in and iterate in a PR.
Description
Add NAP with the proposition of a telemetry system.
Type of change