Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

warn if both tensorboard and tb-nightly pip packages are installed #1907

Open
nfelt opened this issue Feb 26, 2019 · 7 comments
Open

warn if both tensorboard and tb-nightly pip packages are installed #1907

nfelt opened this issue Feb 26, 2019 · 7 comments
Labels
theme:usability Areas to reduce confusion and frustration. type:build/install

Comments

@nfelt
Copy link
Contributor

nfelt commented Feb 26, 2019

This is a common source of inscrutable errors, because pip will happily install both pip packages even though they both ship the tensorboard python package [1], and in doing so the second one will overwrite any files from the first one: pypa/pip#4625. This means you can easily end up with a garbled frankentensorboard package (and it gets even worse if you uninstall one of them but not both of them, since then you get a frankentensorboard with a lot of vital organs removed).

Until pip fixes this mess, it's a fact of life, but we can at least try to warn users about this pitfall ourselves.

[1] It's colloquial to call both of these packages, so I'm prefixing to avoid confusion. See also pypa/packaging.python.org#106

@nfelt
Copy link
Contributor Author

nfelt commented Feb 26, 2019

For an example of this issue happening, see #1862

@nfelt
Copy link
Contributor Author

nfelt commented Feb 26, 2019

We could warn about this at runtime if we used the pkg_resources module from setuptools to check active distributions (aka pip packages). I think this would be something like pkg_resources.find(pkg_resources.Requirement("tensorboard")).

Even better would be to do this check at installation time. I think this may be possible via the cmdclass parameter to setup() by passing in a custom install command, see:
https://stackoverflow.com/questions/20288711/post-install-script-with-python-setuptools
https://blog.niteo.co/setuptools-run-custom-code-in-setup-py/

Either way we'd want to check reasonably thoroughly to make sure we don't have false positives. It's also TBD whether to just log a warning or whether to fail entirely. I'm inclined towards the latter but it might be good to start out with the former just in case there are any false positive installation scenarios out there. (We could always offer some kind of escape hatch if you get a false positive, like setting an environment variable that disables the check.)

@nfelt
Copy link
Contributor Author

nfelt commented Apr 5, 2019

Another example of this happening: #2092

@wchargin
Copy link
Contributor

wchargin commented Apr 5, 2019

Note that you get in this state if you upgrade to the TensorFlow 2.0
alpha using the recommended incantation from the Dev Summit talk:

$ virtualenv -q ./ve
$ . ./ve/bin/activate
(ve) $ pip install tensorflow==1.13.1 >/dev/null 2>&1
(ve) $ pip install -U --pre tensorflow >/dev/null 2>&1
(ve) $ pip freeze | grep -e tensor -e tf- -e tb-
DEPRECATION: Python 2.7 will reach the end of its life on January 1st, 2020. Please upgrade your Python as Python 2.7 won't be maintained after that date. A future version of pip will drop support for Python 2.7.
tb-nightly==1.14.0a20190301
tensorboard==1.13.1
tensorflow==2.0.0a0
tensorflow-estimator==1.13.0
tf-estimator-nightly==1.14.0.dev2019030115

@nfelt
Copy link
Contributor Author

nfelt commented Apr 5, 2019

At some point in the lead-up to the 2.0 final release we'll want to release a TensorBoard 2.0 alpha under the tensorboard package name, which will resolve that, but it's a good point that at least for now it's easy for users to arrive in this state by following our own instructions.

It would really be ideal if we could auto-uninstall tensorboard when installing tb-nightly and vice versa, but I think even if we did that as an install command (which is already a little risky/surprising) it might not work if the files are already copied over. The only way I know of that might reliably work (at least for sufficiently recent versions of pip) is to publish 0.0 versions of both tensorboard and tb-nightly that contain 0 actual python files, and then have each real package depend on the other with a pin to the 0.0 version, but this is an ugly hack.

@janosh
Copy link

janosh commented Apr 11, 2019

...but it's a good point that at least for now it's easy for users to arrive in this state by following our own instructions.

Exactly. This happened to me. Didn't realize for quite some time that I had an old TB installation lying around.

wchargin added a commit that referenced this issue May 7, 2019
Summary:
The smoke test doesn’t actually catch those things that it’s supposed to
catch, because it installs `tf-nightly` (or `tf-nightly-2.0-preview`),
which in turn installs `tb-nightly`. We then install the just-built
version of `tensorboard`, so the resulting virtualenv has both
`tensorboard` and `tb-nightly`: #1907 strikes again.

When `--tf-version ""` is passed, this is not an issue, but some smoke
checks only run when that is not passed.

It just so happens that this is currently benign: each of the direct
dependencies of `:build_pip_package` either has no smoke test, contains
an `__init__.py` file that is required to overwrite the Bazel-generated
empty one, because the smoke test references symbols directly in that
file, or is not yet in `tb-nightly`. But (a) starting with tomorrow’s
nightly, the mesh smoke test will meet none of those conditions, so its
smoke test will cease to be effective, and (b) I’d like to add an
hparams smoke test, which already falls into the same boat.

Test Plan:
Add an hparams smoke test, but remove the actual hparams dep:

```diff
diff --git a/tensorboard/pip_package/BUILD b/tensorboard/pip_package/BUILD
index c11e22e2..e63b790a 100644
--- a/tensorboard/pip_package/BUILD
+++ b/tensorboard/pip_package/BUILD
@@ -21,7 +21,6 @@ sh_binary(
         "//tensorboard:lib",  # User-facing overall TensorBoard API
         "//tensorboard:version",  # Version module (read by setup.py)
         "//tensorboard/plugins/beholder",  # User-facing beholder API
-        "//tensorboard/plugins/hparams",  # User-facing hparams API
         "//tensorboard/plugins/mesh:summary",  #  User-facing mesh summary API
         "//tensorboard/plugins/projector",  # User-facing projector API
         "//tensorboard/summary:tf_summary",  #  tf.summary API for TF 2.0
diff --git a/tensorboard/pip_package/build_pip_package.sh b/tensorboard/pip_package/build_pip_package.sh
index de778196..113724ef 100755
--- a/tensorboard/pip_package/build_pip_package.sh
+++ b/tensorboard/pip_package/build_pip_package.sh
@@ -172,6 +172,7 @@ import tensorboard as tb
 tb.summary.scalar_pb('test', 42)
 from tensorboard.plugins.beholder import Beholder, BeholderHook
 from tensorboard.plugins.mesh import summary
+from tensorboard.plugins.hparams.api import hparams_pb, KerasCallback
 "
   fi

```

Then run

    bazel run //tensorboard/pip_package:build_pip_package

and note that the smoke test passes (incorrectly) before this commit and
fails (correctly) after it.

Note also that the `pip freeze` output from the smoke test includes both
`tb-nightly` and `tensorboard` before this change (e.g., on recent
Travis logs), but only `tensorboard` after it.

wchargin-branch: pip-smoke-2dep
wchargin added a commit that referenced this issue May 15, 2019
Summary:
The smoke test doesn’t actually catch those things that it’s supposed to
catch, because it installs `tf-nightly` (or `tf-nightly-2.0-preview`),
which in turn installs `tb-nightly`. We then install the just-built
version of `tensorboard`, so the resulting virtualenv has both
`tensorboard` and `tb-nightly`: #1907 strikes again.

When `--tf-version ""` is passed, this is not an issue, but some smoke
checks only run when that is not passed.

It just so happens that this is currently benign: each of the direct
dependencies of `:build_pip_package` either has no smoke test, contains
an `__init__.py` file that is required to overwrite the Bazel-generated
empty one, because the smoke test references symbols directly in that
file, or is not yet in `tb-nightly`. But (a) starting with tomorrow’s
nightly, the mesh smoke test will meet none of those conditions, so its
smoke test will cease to be effective, and (b) I’d like to add an
hparams smoke test, which already falls into the same boat.

Test Plan:
Add an hparams smoke test, but remove the actual hparams dep:

```diff
diff --git a/tensorboard/pip_package/BUILD b/tensorboard/pip_package/BUILD
index c11e22e2..e63b790a 100644
--- a/tensorboard/pip_package/BUILD
+++ b/tensorboard/pip_package/BUILD
@@ -21,7 +21,6 @@ sh_binary(
         "//tensorboard:lib",  # User-facing overall TensorBoard API
         "//tensorboard:version",  # Version module (read by setup.py)
         "//tensorboard/plugins/beholder",  # User-facing beholder API
-        "//tensorboard/plugins/hparams",  # User-facing hparams API
         "//tensorboard/plugins/mesh:summary",  #  User-facing mesh summary API
         "//tensorboard/plugins/projector",  # User-facing projector API
         "//tensorboard/summary:tf_summary",  #  tf.summary API for TF 2.0
diff --git a/tensorboard/pip_package/build_pip_package.sh b/tensorboard/pip_package/build_pip_package.sh
index de778196..113724ef 100755
--- a/tensorboard/pip_package/build_pip_package.sh
+++ b/tensorboard/pip_package/build_pip_package.sh
@@ -172,6 +172,7 @@ import tensorboard as tb
 tb.summary.scalar_pb('test', 42)
 from tensorboard.plugins.beholder import Beholder, BeholderHook
 from tensorboard.plugins.mesh import summary
+from tensorboard.plugins.hparams.api import hparams_pb, KerasCallback
 "
   fi

```

Then run

    bazel run //tensorboard/pip_package:build_pip_package

and note that the smoke test passes (incorrectly) before this commit and
fails (correctly) after it.

Note also that the `pip freeze` output from the smoke test includes both
`tb-nightly` and `tensorboard` before this change (e.g., on recent
Travis logs), but only `tensorboard` after it.

wchargin-branch: pip-smoke-2dep
wchargin added a commit that referenced this issue May 16, 2019
Summary:
Users often report problems that depend on environment-specific
configuration. Rather than asking them to find all this information and
enter it into an issue template manually, we can ask them to run a
script and paste its output verbatim into the issue. Furthermore, we can
detect and suggest fixes to common problems, such as #1907 and #2010.
The script can grow as we see fit to add new diagnoses and suggestions.

Test Plan:
Open to suggestions about how much automated testing there should be.
The structure of the script makes it robust to errors in each individual
diagnosis (in the worst case, it prints a stack trace and continues to
the next one), and I tested that the main framework works in Python 2
and 3 on Linux and in Python 3 on Windows.

To simulate a bad hostname, add

```python
socket.getfqdn = lambda: b"\xc9".decode("utf-8")
```

to the top of the script.

To simulate a bad `.tensorboard-info` directory and test the quoting
behavior, run

    export TMPDIR="$(mktemp -d)/uh oh/" &&
    mkdir -p "${TMPDIR}/.tensorboard-info" &&
    chmod 000 "${TMPDIR}/.tensorboard-info" &&
    python ./tensorboard/tools/diagnose_me.py

wchargin-branch: diagnose-me
wchargin added a commit that referenced this issue May 16, 2019
Summary:
Users often report problems that depend on environment-specific
configuration. Rather than asking them to find all this information and
enter it into an issue template manually, we can ask them to run a
script and paste its output verbatim into the issue. Furthermore, we can
detect and suggest fixes to common problems, such as #1907 and #2010.
The script can grow as we see fit to add new diagnoses and suggestions.

Test Plan:
The script is designed to be robust to errors in each individual
diagnosis: in the worst case, it prints a stack trace and continues to
the next one. I’ve manually tested that the main framework works in
Python 2 and 3 on Linux and in Python 3 on Windows.

Automated testing of this script is possible, but would take a fair
number of CPU cycles to run tests (setting up virtualenvs and Conda,
installing and importing TensorFlow many times). Given that this script
is never a production dependency and is explicitly designed to be run in
a discussion context, light testing seems reasonable.

To simulate a bad hostname, add

```python
socket.getfqdn = lambda: b"\xc9".decode("utf-8")
```

to the top of the script.

To simulate a bad `.tensorboard-info` directory and test the quoting
behavior, run

```shell
export TMPDIR="$(mktemp -d)/uh oh/" &&
mkdir -p "${TMPDIR}/.tensorboard-info" &&
chmod 000 "${TMPDIR}/.tensorboard-info" &&
python ./tensorboard/tools/diagnose_tensorboard.py
```

To cross-check the autoidentification logic:

```
$ python tensorboard/tools/diagnose_tensorboard.py |
> awk '/version / { print $NF }'
e093841ffaea564cb2410e0b430bd0c552ada208
$ git hash-object tensorboard/tools/diagnose_tensorboard.py
e093841ffaea564cb2410e0b430bd0c552ada208
$ git rev-parse HEAD:tensorboard/tools/diagnose_tensorboard.py
e093841ffaea564cb2410e0b430bd0c552ada208
$ git cat-file blob e093841ffaea564cb2410e0b430bd0c552ada208 |
> diff -u tensorboard/tools/diagnose_tensorboard.py - | wc -l
0
```

wchargin-branch: diagnose-me
@wchargin
Copy link
Contributor

Quick update: TensorBoard itself doesn’t warn you about this failure
mode, but our self-diagnosis script does:
https://raw.githubusercontent.com/tensorflow/tensorboard/master/tensorboard/tools/diagnose_tensorboard.py

If you run it, you’ll get a message like:

WARNING: conflicting installations: ['tb-nightly', 'tensorboard']

Suggestion: Fix conflicting installations

Conflicting package installations found. Depending on the order of
installations and uninstallations, behavior may be undefined. Please
uninstall ALL versions of TensorFlow and TensorBoard, then reinstall
ONLY the desired version of TensorFlow, which will transitively pull
in the proper version of TensorBoard. (If you use TensorBoard without
TensorFlow, just reinstall the appropriate version of TensorBoard
directly.)

Namely:

pip uninstall tb-nightly tensorboard
pip install tensorflow  # or `tensorflow-gpu`, or `tf-nightly`, ...

It would still be nice to do this at runtime, too, as described in this
issue.

@nfelt nfelt added the theme:usability Areas to reduce confusion and frustration. label Dec 17, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
theme:usability Areas to reduce confusion and frustration. type:build/install
Projects
None yet
Development

No branches or pull requests

3 participants