-
Notifications
You must be signed in to change notification settings - Fork 1.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
tools: create diagnose_tensorboard
#2237
Conversation
Summary: Users often report problems that depend on environment-specific configuration. Rather than asking them to find all this information and enter it into an issue template manually, we can ask them to run a script and paste its output verbatim into the issue. Furthermore, we can detect and suggest fixes to common problems, such as #1907 and #2010. The script can grow as we see fit to add new diagnoses and suggestions. Test Plan: Open to suggestions about how much automated testing there should be. The structure of the script makes it robust to errors in each individual diagnosis (in the worst case, it prints a stack trace and continues to the next one), and I tested that the main framework works in Python 2 and 3 on Linux and in Python 3 on Windows. To simulate a bad hostname, add ```python socket.getfqdn = lambda: b"\xc9".decode("utf-8") ``` to the top of the script. To simulate a bad `.tensorboard-info` directory and test the quoting behavior, run export TMPDIR="$(mktemp -d)/uh oh/" && mkdir -p "${TMPDIR}/.tensorboard-info" && chmod 000 "${TMPDIR}/.tensorboard-info" && python ./tensorboard/tools/diagnose_me.py wchargin-branch: diagnose-me
wchargin-branch: diagnose-me
Cross-check the autoidentification logic: ``` $ python tensorboard/tools/diagnose_me.py | > awk '/diagnose_me.py version/ { print $NF }' c6fef91948fea619a6b919ddc013b516e7ca0ef4 $ git hash-object tensorboard/tools/diagnose_me.py c6fef91948fea619a6b919ddc013b516e7ca0ef4 $ git rev-parse HEAD:tensorboard/tools/diagnose_me.py c6fef91948fea619a6b919ddc013b516e7ca0ef4 $ git cat-file blob c6fef91948fea619a6b919ddc013b516e7ca0ef4 | > diff -u tensorboard/tools/diagnose_me.py - | wc -l 0 ``` wchargin-branch: diagnose-me
Yay, thanks for starting this. |
tensorboard/tools/diagnose_me.py
Outdated
|
||
|
||
def check(fn): | ||
"""Register a function as a check. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you add a line that this is intended to be used as a decorator, i.e., as @check.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Consider renaming to "diagnose_tensorboard"?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Awesome! Thanks a lot for doing this!
--- check: autoidentify
INFO: diagnose_me.py version c6fef91948fea619a6b919ddc013b516e7ca0ef4
--- check: general
INFO: sys.version_info: sys.version_info(major=2, minor=7, micro=10, releaselevel='final', serial=0)
INFO: os.name: posix
INFO: os.uname(): ('Darwin', 'stephanlee-macbookpro2.roam.corp.google.com', '18.5.0', 'Darwin Kernel Version 18.5.0: Mon Mar 11 20:40:32 PDT 2019; root:xnu-4903.251.3~3/RELEASE_X86_64', 'x86_64')
INFO: sys.getwindowsversion(): N/A
--- check: package_management
INFO: has conda-meta: False
INFO: $VIRTUAL_ENV: '/Users/stephanlee/venv/temp2'
--- check: installed_packages
INFO: installed: tb-nightly==1.14.0a20190516
INFO: installed: tf-nightly==1.14.1.dev20190516
INFO: installed: tf-estimator-nightly==1.14.0.dev2019051601
--- check: tensorboard_python_version
INFO: tensorboard.version.VERSION: '1.14.0a0'
--- check: tensorflow_python_version
WARNING: Limited tf.compat.v2.summary API due to missing TensorBoard installation.
INFO: tensorflow.__version__: '1.14.1-dev20190516'
INFO: tensorflow.__git_version__: 'v1.12.1-2079-g8bf90c7b1e'
--- check: tensorboard_binary_path
INFO: which tensorboard: '/Users/stephanlee/venv/temp2/bin/tensorboard\n'
Would you know why which tensorboard has a trailing \n
?
tensorboard/tools/diagnose_me.py
Outdated
print(markdown_code_fence) | ||
suggestions = [] | ||
for check in CHECKS: | ||
print("--- check: %s" % check.__name__) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Consider putting one newline after to give some space between the checks (alternatively, we can use the color but that complicates things I guess).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay, if you prefer. (I like the concise form, but don’t feel strongly.)
I did consider color, but decided against it—color is easy on xterm
emulators, but this script needs to work on Windows, too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would you know why which tensorboard has a trailing \n?
Commands that print text typically end with a newline:
$ whoami | xxd
00000000: 7763 6861 7267 696e 0a wchargin.
$ date -I | xxd
00000000: 3230 3139 2d30 352d 3136 0a 2019-05-16.
$ echo hi | xxd
00000000: 6869 0a hi.
Commands that don’t output a trailing newline shunt the following prompt
to the right and don’t work well with standard text processing tools:
$ printf 'one line\n' | wc -l
1
$ printf 'no lines???' | wc -l
0
$ printf 'look ma, no newline'
look ma, no newline$ printf 'I am filled with regret'
tensorboard/tools/diagnose_me.py
Outdated
print(markdown_code_fence) | ||
suggestions = [] | ||
for check in CHECKS: | ||
print("--- check: %s" % check.__name__) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay, if you prefer. (I like the concise form, but don’t feel strongly.)
I did consider color, but decided against it—color is easy on xterm
emulators, but this script needs to work on Windows, too.
tensorboard/tools/diagnose_me.py
Outdated
|
||
|
||
def check(fn): | ||
"""Register a function as a check. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure.
wchargin-branch: diagnose-me
wchargin-branch: diagnose-me
wchargin-branch: diagnose-me
diagnose_me
diagnose_tensorboard
Summary:
Users often report problems that depend on environment-specific
configuration. Rather than asking them to find all this information and
enter it into an issue template manually, we can ask them to run a
script and paste its output verbatim into the issue. Furthermore, we can
detect and suggest fixes to common problems, such as #1907 and #2010.
The script can grow as we see fit to add new diagnoses and suggestions.
Test Plan:
The script is designed to be robust to errors in each individual
diagnosis: in the worst case, it prints a stack trace and continues to
the next one. I’ve manually tested that the main framework works in
Python 2 and 3 on Linux and in Python 3 on Windows.
Automated testing of this script is possible, but would take a fair
number of CPU cycles to run tests (setting up virtualenvs and Conda,
installing and importing TensorFlow many times). Given that this script
is never a production dependency and is explicitly designed to be run in
a discussion context, light testing seems reasonable.
To simulate a bad hostname, add
to the top of the script.
To simulate a bad
.tensorboard-info
directory and test the quotingbehavior, run
To cross-check the autoidentification logic:
wchargin-branch: diagnose-me