Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

tools: create diagnose_tensorboard #2237

Merged
merged 6 commits into from
May 16, 2019
Merged

tools: create diagnose_tensorboard #2237

merged 6 commits into from
May 16, 2019

Conversation

wchargin
Copy link
Contributor

@wchargin wchargin commented May 16, 2019

Summary:
Users often report problems that depend on environment-specific
configuration. Rather than asking them to find all this information and
enter it into an issue template manually, we can ask them to run a
script and paste its output verbatim into the issue. Furthermore, we can
detect and suggest fixes to common problems, such as #1907 and #2010.
The script can grow as we see fit to add new diagnoses and suggestions.

Test Plan:
The script is designed to be robust to errors in each individual
diagnosis: in the worst case, it prints a stack trace and continues to
the next one. I’ve manually tested that the main framework works in
Python 2 and 3 on Linux and in Python 3 on Windows.

Automated testing of this script is possible, but would take a fair
number of CPU cycles to run tests (setting up virtualenvs and Conda,
installing and importing TensorFlow many times). Given that this script
is never a production dependency and is explicitly designed to be run in
a discussion context, light testing seems reasonable.

To simulate a bad hostname, add

socket.getfqdn = lambda: b"\xc9".decode("utf-8")

to the top of the script.

To simulate a bad .tensorboard-info directory and test the quoting
behavior, run

export TMPDIR="$(mktemp -d)/uh oh/" &&
mkdir -p "${TMPDIR}/.tensorboard-info" &&
chmod 000 "${TMPDIR}/.tensorboard-info" &&
python ./tensorboard/tools/diagnose_tensorboard.py

To cross-check the autoidentification logic:

$ python tensorboard/tools/diagnose_tensorboard.py |
> awk '/version / { print $NF }'
e093841ffaea564cb2410e0b430bd0c552ada208
$ git hash-object tensorboard/tools/diagnose_tensorboard.py
e093841ffaea564cb2410e0b430bd0c552ada208
$ git rev-parse HEAD:tensorboard/tools/diagnose_tensorboard.py
e093841ffaea564cb2410e0b430bd0c552ada208
$ git cat-file blob e093841ffaea564cb2410e0b430bd0c552ada208 |
> diff -u tensorboard/tools/diagnose_tensorboard.py - | wc -l
0

wchargin-branch: diagnose-me

Summary:
Users often report problems that depend on environment-specific
configuration. Rather than asking them to find all this information and
enter it into an issue template manually, we can ask them to run a
script and paste its output verbatim into the issue. Furthermore, we can
detect and suggest fixes to common problems, such as #1907 and #2010.
The script can grow as we see fit to add new diagnoses and suggestions.

Test Plan:
Open to suggestions about how much automated testing there should be.
The structure of the script makes it robust to errors in each individual
diagnosis (in the worst case, it prints a stack trace and continues to
the next one), and I tested that the main framework works in Python 2
and 3 on Linux and in Python 3 on Windows.

To simulate a bad hostname, add

```python
socket.getfqdn = lambda: b"\xc9".decode("utf-8")
```

to the top of the script.

To simulate a bad `.tensorboard-info` directory and test the quoting
behavior, run

    export TMPDIR="$(mktemp -d)/uh oh/" &&
    mkdir -p "${TMPDIR}/.tensorboard-info" &&
    chmod 000 "${TMPDIR}/.tensorboard-info" &&
    python ./tensorboard/tools/diagnose_me.py

wchargin-branch: diagnose-me
@wchargin wchargin requested a review from stephanwlee May 16, 2019 00:58
wchargin-branch: diagnose-me
Cross-check the autoidentification logic:

```
$ python tensorboard/tools/diagnose_me.py |
> awk '/diagnose_me.py version/ { print $NF }'
c6fef91948fea619a6b919ddc013b516e7ca0ef4
$ git hash-object tensorboard/tools/diagnose_me.py
c6fef91948fea619a6b919ddc013b516e7ca0ef4
$ git rev-parse HEAD:tensorboard/tools/diagnose_me.py
c6fef91948fea619a6b919ddc013b516e7ca0ef4
$ git cat-file blob c6fef91948fea619a6b919ddc013b516e7ca0ef4 |
> diff -u tensorboard/tools/diagnose_me.py - | wc -l
0
```

wchargin-branch: diagnose-me
@manivaradarajan
Copy link
Member

Yay, thanks for starting this.



def check(fn):
"""Register a function as a check.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add a line that this is intended to be used as a decorator, i.e., as @check.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure.

Copy link
Member

@manivaradarajan manivaradarajan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider renaming to "diagnose_tensorboard"?

Copy link
Contributor

@stephanwlee stephanwlee left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome! Thanks a lot for doing this!

--- check: autoidentify
INFO: diagnose_me.py version c6fef91948fea619a6b919ddc013b516e7ca0ef4
--- check: general
INFO: sys.version_info: sys.version_info(major=2, minor=7, micro=10, releaselevel='final', serial=0)
INFO: os.name: posix
INFO: os.uname(): ('Darwin', 'stephanlee-macbookpro2.roam.corp.google.com', '18.5.0', 'Darwin Kernel Version 18.5.0: Mon Mar 11 20:40:32 PDT 2019; root:xnu-4903.251.3~3/RELEASE_X86_64', 'x86_64')
INFO: sys.getwindowsversion(): N/A
--- check: package_management
INFO: has conda-meta: False
INFO: $VIRTUAL_ENV: '/Users/stephanlee/venv/temp2'
--- check: installed_packages
INFO: installed: tb-nightly==1.14.0a20190516
INFO: installed: tf-nightly==1.14.1.dev20190516
INFO: installed: tf-estimator-nightly==1.14.0.dev2019051601
--- check: tensorboard_python_version
INFO: tensorboard.version.VERSION: '1.14.0a0'
--- check: tensorflow_python_version
WARNING: Limited tf.compat.v2.summary API due to missing TensorBoard installation.
INFO: tensorflow.__version__: '1.14.1-dev20190516'
INFO: tensorflow.__git_version__: 'v1.12.1-2079-g8bf90c7b1e'
--- check: tensorboard_binary_path
INFO: which tensorboard: '/Users/stephanlee/venv/temp2/bin/tensorboard\n'

Would you know why which tensorboard has a trailing \n?

print(markdown_code_fence)
suggestions = []
for check in CHECKS:
print("--- check: %s" % check.__name__)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider putting one newline after to give some space between the checks (alternatively, we can use the color but that complicates things I guess).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, if you prefer. (I like the concise form, but don’t feel strongly.)
I did consider color, but decided against it—color is easy on xterm
emulators, but this script needs to work on Windows, too.

Copy link
Contributor Author

@wchargin wchargin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would you know why which tensorboard has a trailing \n?

Commands that print text typically end with a newline:

$ whoami | xxd
00000000: 7763 6861 7267 696e 0a                   wchargin.
$ date -I | xxd
00000000: 3230 3139 2d30 352d 3136 0a              2019-05-16.
$ echo hi | xxd
00000000: 6869 0a                                  hi.

Commands that don’t output a trailing newline shunt the following prompt
to the right and don’t work well with standard text processing tools:

$ printf 'one line\n' | wc -l
1
$ printf 'no lines???' | wc -l
0
$ printf 'look ma, no newline'
look ma, no newline$ printf 'I am filled with regret'

print(markdown_code_fence)
suggestions = []
for check in CHECKS:
print("--- check: %s" % check.__name__)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, if you prefer. (I like the concise form, but don’t feel strongly.)
I did consider color, but decided against it—color is easy on xterm
emulators, but this script needs to work on Windows, too.



def check(fn):
"""Register a function as a check.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure.

wchargin-branch: diagnose-me
wchargin-branch: diagnose-me
wchargin-branch: diagnose-me
@wchargin wchargin changed the title tools: create diagnose_me tools: create diagnose_tensorboard May 16, 2019
@wchargin wchargin merged commit a1bbb2a into master May 16, 2019
@wchargin wchargin deleted the wchargin-diagnose-me branch May 16, 2019 19:13
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants