Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove experimental data selector #2290

Merged
merged 4 commits into from
Jun 11, 2019
Merged

Conversation

stephanwlee
Copy link
Contributor

@stephanwlee stephanwlee commented May 30, 2019

DataSelector was an experimental feature, when using Database backed
backend, allowed improved user experience in choosing data with built-in
notion of experiments.

The team has no plans to make tangible improvements to it and want to
re-consider the data selection story more holistically.

I, personally, am ok with maintaining this code iff it does not incur
maintenance burden but, especially, since we lack tests, that is not the
case today. If we want to revive it, we can always revert the change.

WANT_LGTM=all

Copy link
Contributor

@wchargin wchargin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, wow, okay.

The code change looks good (modulo inline), and the intent is fine with
me, but I feel like we should run this by @nfelt, too—?

@stephanwlee stephanwlee requested a review from nfelt May 30, 2019 03:34
@nfelt
Copy link
Contributor

nfelt commented May 31, 2019

I'm ok with this if you think it's best to avoid the maintenance overhead. I do think it's valuable to be able to have this concrete functionality on hand as a working reference/demo when revisiting this area in the near future (whether or not we revive the literal code itself), but it's true that we can check out and build older revisions even if it's removed at head.

Basically it's up to you.

@stephanwlee
Copy link
Contributor Author

stephanwlee commented May 31, 2019

it's true that we can check out and build older revisions even if it's removed at head.

@nfelt yup, TB 1.13.x and probably older version should contain this. Btw, this PR is "WANT_LGTM=all" :)

@stephanwlee stephanwlee force-pushed the gc branch 2 times, most recently from de15b88 to 805e934 Compare June 7, 2019 15:07
@stephanwlee stephanwlee requested review from manivaradarajan and removed request for nfelt June 7, 2019 15:12
DataSelector was an experimental feature, when using Database backed
backend, allowed improved user experience in choosing data with built-in
notion of experiments.

The team has no plans to make tangible improvements to it and want to
re-consider the data selection story more hollistically.
@stephanwlee stephanwlee merged commit b764306 into tensorflow:master Jun 11, 2019
@stephanwlee stephanwlee deleted the gc branch June 11, 2019 04:03
wchargin added a commit that referenced this pull request Aug 19, 2019
Summary:
This implements a rudimentary experiment selection mechanism for the
runs selector and scalars dashboard only, for the purposes of testing
data provider implementations. The experiment ID is stored as a query
parameter, which isn’t perfect because it precludes caching the big
TensorBoard HTML blob across experiments. In the long term, we can
consider refactoring to serve the big HTML blob from a static path
that’s fetched by the page.

The TensorBoard codebase still has some fragments of a previous
experimental data selector, which was partially removed in #2290, and so
parts of this diff look like changes but are functionally additions.

Test Plan:
Instrument the multiplexer data provider to log experiment IDs:

```diff
diff --git a/tensorboard/backend/event_processing/data_provider.py b/tensorboard/backend/event_processing/data_provider.py
index ef602320..4c72d096 100644
--- a/tensorboard/backend/event_processing/data_provider.py
+++ b/tensorboard/backend/event_processing/data_provider.py
@@ -54,7 +54,7 @@ class MultiplexerDataProvider(provider.DataProvider):
       return None

   def list_runs(self, experiment_id):
-    del experiment_id  # ignored for now
+    logger.warn("Listing runs for experiment %r", experiment_id)
     return [
         provider.Run(
             run_id=run,  # use names as IDs
@@ -65,7 +65,7 @@ class MultiplexerDataProvider(provider.DataProvider):
     ]

   def list_scalars(self, experiment_id, plugin_name, run_tag_filter=None):
-    del experiment_id  # ignored for now
+    logger.warn("Listing scalars for experiment %r", experiment_id)
     run_tag_content = self._multiplexer.PluginRunToTagToContent(plugin_name)
     result = {}
     if run_tag_filter is None:
@@ -96,6 +96,7 @@ class MultiplexerDataProvider(provider.DataProvider):
   def read_scalars(
       self, experiment_id, plugin_name, downsample=None, run_tag_filter=None
   ):
+    logger.warn("Reading scalars for experiment %r", experiment_id)
     # TODO(@wchargin): Downsampling not implemented, as the multiplexer
     # is already downsampled. We could downsample on top of the existing
     # sampling, which would be nice for testing.
```

Then launch TensorBoard with `--generic_data=true` and navigate to
<http://localhost:6006/?experiment=foo>; verify that all three varieties
of server logs are exclusively for the correct experiment ID.

Then, remove the query parameter, and ensure that TensorBoard still
works normally (with experiment ID the empty string).

wchargin-branch: data-experiment
wchargin added a commit that referenced this pull request Aug 22, 2019
Summary:
This implements a rudimentary experiment selection mechanism for the
runs selector and scalars dashboard only, for the purposes of testing
data provider implementations. The experiment ID is stored as a query
parameter, which isn’t perfect because it precludes caching the big
TensorBoard HTML blob across experiments. In the long term, we can
consider refactoring to serve the big HTML blob from a static path
that’s fetched by the page.

The TensorBoard codebase still has some fragments of a previous
experimental data selector, which was partially removed in #2290, and so
parts of this diff look like changes but are functionally additions.

Test Plan:
Instrument the multiplexer data provider to log experiment IDs:

```diff
diff --git a/tensorboard/backend/event_processing/data_provider.py b/tensorboard/backend/event_processing/data_provider.py
index ef602320..4c72d096 100644
--- a/tensorboard/backend/event_processing/data_provider.py
+++ b/tensorboard/backend/event_processing/data_provider.py
@@ -54,7 +54,7 @@ class MultiplexerDataProvider(provider.DataProvider):
       return None

   def list_runs(self, experiment_id):
-    del experiment_id  # ignored for now
+    logger.warn("Listing runs for experiment %r", experiment_id)
     return [
         provider.Run(
             run_id=run,  # use names as IDs
@@ -65,7 +65,7 @@ class MultiplexerDataProvider(provider.DataProvider):
     ]

   def list_scalars(self, experiment_id, plugin_name, run_tag_filter=None):
-    del experiment_id  # ignored for now
+    logger.warn("Listing scalars for experiment %r", experiment_id)
     run_tag_content = self._multiplexer.PluginRunToTagToContent(plugin_name)
     result = {}
     if run_tag_filter is None:
@@ -96,6 +96,7 @@ class MultiplexerDataProvider(provider.DataProvider):
   def read_scalars(
       self, experiment_id, plugin_name, downsample=None, run_tag_filter=None
   ):
+    logger.warn("Reading scalars for experiment %r", experiment_id)
     # TODO(@wchargin): Downsampling not implemented, as the multiplexer
     # is already downsampled. We could downsample on top of the existing
     # sampling, which would be nice for testing.
```

Then launch TensorBoard with `--generic_data=true` and navigate to
<http://localhost:6006/?experiment=foo>; verify that all three varieties
of server logs are exclusively for the correct experiment ID.

Then, remove the query parameter, and ensure that TensorBoard still
works normally (with experiment ID the empty string).

wchargin-branch: data-experiment
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants