-
Notifications
You must be signed in to change notification settings - Fork 1.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Remove experimental data selector #2290
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, wow, okay.
The code change looks good (modulo inline), and the intent is fine with
me, but I feel like we should run this by @nfelt, too—?
I'm ok with this if you think it's best to avoid the maintenance overhead. I do think it's valuable to be able to have this concrete functionality on hand as a working reference/demo when revisiting this area in the near future (whether or not we revive the literal code itself), but it's true that we can check out and build older revisions even if it's removed at head. Basically it's up to you. |
@nfelt yup, TB 1.13.x and probably older version should contain this. Btw, this PR is "WANT_LGTM=all" :) |
de15b88
to
805e934
Compare
DataSelector was an experimental feature, when using Database backed backend, allowed improved user experience in choosing data with built-in notion of experiments. The team has no plans to make tangible improvements to it and want to re-consider the data selection story more hollistically.
Summary: This implements a rudimentary experiment selection mechanism for the runs selector and scalars dashboard only, for the purposes of testing data provider implementations. The experiment ID is stored as a query parameter, which isn’t perfect because it precludes caching the big TensorBoard HTML blob across experiments. In the long term, we can consider refactoring to serve the big HTML blob from a static path that’s fetched by the page. The TensorBoard codebase still has some fragments of a previous experimental data selector, which was partially removed in #2290, and so parts of this diff look like changes but are functionally additions. Test Plan: Instrument the multiplexer data provider to log experiment IDs: ```diff diff --git a/tensorboard/backend/event_processing/data_provider.py b/tensorboard/backend/event_processing/data_provider.py index ef602320..4c72d096 100644 --- a/tensorboard/backend/event_processing/data_provider.py +++ b/tensorboard/backend/event_processing/data_provider.py @@ -54,7 +54,7 @@ class MultiplexerDataProvider(provider.DataProvider): return None def list_runs(self, experiment_id): - del experiment_id # ignored for now + logger.warn("Listing runs for experiment %r", experiment_id) return [ provider.Run( run_id=run, # use names as IDs @@ -65,7 +65,7 @@ class MultiplexerDataProvider(provider.DataProvider): ] def list_scalars(self, experiment_id, plugin_name, run_tag_filter=None): - del experiment_id # ignored for now + logger.warn("Listing scalars for experiment %r", experiment_id) run_tag_content = self._multiplexer.PluginRunToTagToContent(plugin_name) result = {} if run_tag_filter is None: @@ -96,6 +96,7 @@ class MultiplexerDataProvider(provider.DataProvider): def read_scalars( self, experiment_id, plugin_name, downsample=None, run_tag_filter=None ): + logger.warn("Reading scalars for experiment %r", experiment_id) # TODO(@wchargin): Downsampling not implemented, as the multiplexer # is already downsampled. We could downsample on top of the existing # sampling, which would be nice for testing. ``` Then launch TensorBoard with `--generic_data=true` and navigate to <http://localhost:6006/?experiment=foo>; verify that all three varieties of server logs are exclusively for the correct experiment ID. Then, remove the query parameter, and ensure that TensorBoard still works normally (with experiment ID the empty string). wchargin-branch: data-experiment
Summary: This implements a rudimentary experiment selection mechanism for the runs selector and scalars dashboard only, for the purposes of testing data provider implementations. The experiment ID is stored as a query parameter, which isn’t perfect because it precludes caching the big TensorBoard HTML blob across experiments. In the long term, we can consider refactoring to serve the big HTML blob from a static path that’s fetched by the page. The TensorBoard codebase still has some fragments of a previous experimental data selector, which was partially removed in #2290, and so parts of this diff look like changes but are functionally additions. Test Plan: Instrument the multiplexer data provider to log experiment IDs: ```diff diff --git a/tensorboard/backend/event_processing/data_provider.py b/tensorboard/backend/event_processing/data_provider.py index ef602320..4c72d096 100644 --- a/tensorboard/backend/event_processing/data_provider.py +++ b/tensorboard/backend/event_processing/data_provider.py @@ -54,7 +54,7 @@ class MultiplexerDataProvider(provider.DataProvider): return None def list_runs(self, experiment_id): - del experiment_id # ignored for now + logger.warn("Listing runs for experiment %r", experiment_id) return [ provider.Run( run_id=run, # use names as IDs @@ -65,7 +65,7 @@ class MultiplexerDataProvider(provider.DataProvider): ] def list_scalars(self, experiment_id, plugin_name, run_tag_filter=None): - del experiment_id # ignored for now + logger.warn("Listing scalars for experiment %r", experiment_id) run_tag_content = self._multiplexer.PluginRunToTagToContent(plugin_name) result = {} if run_tag_filter is None: @@ -96,6 +96,7 @@ class MultiplexerDataProvider(provider.DataProvider): def read_scalars( self, experiment_id, plugin_name, downsample=None, run_tag_filter=None ): + logger.warn("Reading scalars for experiment %r", experiment_id) # TODO(@wchargin): Downsampling not implemented, as the multiplexer # is already downsampled. We could downsample on top of the existing # sampling, which would be nice for testing. ``` Then launch TensorBoard with `--generic_data=true` and navigate to <http://localhost:6006/?experiment=foo>; verify that all three varieties of server logs are exclusively for the correct experiment ID. Then, remove the query parameter, and ensure that TensorBoard still works normally (with experiment ID the empty string). wchargin-branch: data-experiment
DataSelector was an experimental feature, when using Database backed
backend, allowed improved user experience in choosing data with built-in
notion of experiments.
The team has no plans to make tangible improvements to it and want to
re-consider the data selection story more holistically.
I, personally, am ok with maintaining this code iff it does not incur
maintenance burden but, especially, since we lack tests, that is not the
case today. If we want to revive it, we can always revert the change.
WANT_LGTM=all