uploader: display upload statistics while uploading data #3678

Merged (13 commits) on Jun 4, 2020

Conversation

@caisq (Contributor) commented May 28, 2020

  • Motivation for features / changes
    • This is the first of a series of PRs aimed at improving the usability of the tensorboard.dev uploader.
    • This PR specifically adds real-time display of upload statistics.
  • Technical description of changes
    • During uploading, the user sees 2 or 3 bars rendered with tqdm:
      1. [TIMESTAMP] Uploaded x scalars, y tensors (Y B), z binary objects (Z B)
      2. (if any data was skipped): Skipped n tensors (N B), m binary objects (M B)
      3. Uploading XYZ (this is the real-time status)
    • If the user wishes to see a less verbose UI, they can pass --verbose 0 to
      override the default value of 1.
    • When uploading is idle, the display reads "Listening for new data in logdir...".
    • Adds a module, upload_tracker.py, to encapsulate the logic for stats tracking
      and display.
  • Screenshots of UI changes
    • While actively uploading: (screenshot)
    • While idle: (screenshot)
  • Detailed steps to verify changes work correctly (as executed by you)
    • Unit tests added for the new module and its usage inside the existing uploader.py module
    • Manual verification (see screenshots above)
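The stats-tracking idea described above can be sketched roughly as follows. This is a hypothetical illustration only; the class name, methods, and message format are assumptions, and the real upload_tracker.py module's API may differ.

```python
# Hypothetical sketch of accumulating upload statistics and rendering the
# summary line described in the PR description. Not the PR's actual code.


class UploadStats:
    """Accumulates upload counts and renders a one-line summary."""

    def __init__(self):
        self.num_scalars = 0
        self.num_tensors = 0
        self.tensor_bytes = 0
        self.num_blobs = 0
        self.blob_bytes = 0

    def add_scalars(self, n):
        self.num_scalars += n

    def add_tensors(self, n, num_bytes):
        self.num_tensors += n
        self.tensor_bytes += num_bytes

    def add_blob(self, num_bytes):
        self.num_blobs += 1
        self.blob_bytes += num_bytes

    def summary(self):
        # Mirrors the shape of bar 1 above: scalars, tensors, binary objects.
        return "Uploaded %d scalars, %d tensors (%d B), %d binary objects (%d B)" % (
            self.num_scalars,
            self.num_tensors,
            self.tensor_bytes,
            self.num_blobs,
            self.blob_bytes,
        )
```

The uploader would call the `add_*` methods as each batch of data is sent and re-render the summary line after each update.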

@caisq caisq requested a review from davidsoergel May 28, 2020 20:36
@caisq (Author) commented May 28, 2020

cc @GalOshri

@@ -503,7 +534,12 @@ def flush(self):

self._rpc_rate_limiter.tick()

with _request_logger(request, request.runs):
with contextlib.ExitStack() as stack:
Member:

Took me a bit to figure out why you did it this way :) I think it's because self._tracker might be None, so you can't just with self._tracker.scalars_tracker(...):.

I think this would be clearer, more idiomatic, and more flexible if you don't allow self._tracker to be None, but instead push the verbosity argument down into UploadTracker and handle it there (i.e., checking the condition only when it comes time to actually print something). Then all of these ExitStack things go away and you can just say

```python
with _request_logger(request, request.runs):
    with self._tracker.scalars_tracker(self._num_values):
        ...
```

Author:

Done. This increases the maximum indentation depth by 1, but I think it's worth the benefits you pointed out.
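The refactoring the reviewer suggests can be sketched roughly like this (names and signatures are assumptions based on the snippets quoted in this thread, not the PR's actual code): the tracker is never None, and it checks verbosity itself only at print time, so call sites can use a plain `with` instead of `contextlib.ExitStack`.

```python
# Sketch of pushing the verbosity check down into the tracker, so callers
# never need a None check or ExitStack. Hypothetical illustration only.
import contextlib
import io
import sys


class UploadTracker:
    """Hypothetical tracker that owns the verbosity decision itself."""

    def __init__(self, verbosity):
        self._verbosity = verbosity
        self._num_scalars = 0

    @contextlib.contextmanager
    def scalars_tracker(self, num_scalars):
        yield  # the upload RPC runs inside this context
        self._num_scalars += num_scalars
        if self._verbosity:  # condition checked only when printing
            sys.stdout.write("Uploaded %d scalars\n" % self._num_scalars)


tracker = UploadTracker(verbosity=0)
with tracker.scalars_tracker(10):  # plain `with`; no ExitStack at call sites
    pass  # tracker stays silent at verbosity 0, but stats still accumulate
```

At verbosity 0 the context manager is a cheap no-op for display purposes, which is what lets the `ExitStack` wrappers disappear from `flush()`.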

@@ -157,6 +163,9 @@ def create_experiment(self):
response = grpc_util.call_with_retries(
self._api.CreateExperiment, request
)
self._tracker = (
Member:
Per comment below, how about

```python
self._tracker = upload_tracker.UploadTracker(self._verbosity)
```

(and all the consequences of that refactoring)

Author:
Done.

@caisq (Author) left a comment:

Thanks for the review!


@caisq caisq requested a review from davidsoergel June 2, 2020 23:54
@davidsoergel (Member) left a review:

Nice! Very thorough tests, too :)

sent_blobs += self._send_blob(
blob_sequence_id, seq_index, blob
)
if self._tracker:
Member:

this condition is no longer needed

Author:

Done.

@@ -594,6 +622,7 @@ def __init__(
rpc_rate_limiter,
max_request_size,
max_tensor_point_size,
tracker=None,
Member:
Don't default to None, now that this is required

Author:
Done. Same for another line like this.

api,
rpc_rate_limiter,
max_request_size,
tracker=None,
Member:
Don't default to None, now that this is required

Author:
Done.

@caisq (Author) left a comment:
Thanks for the review!


@caisq (Author) commented Jun 4, 2020

I've removed the use of tqdm, for two reasons:

  1. The multi-progress-bar setup previously used in this PR doesn't work well in notebooks, as I discovered through manual testing. So I've reverted to an approach that only writes to and flushes sys.stdout.
  2. We're not currently using tqdm's main progress-bar feature, so it seems prudent to avoid adding the new dependency for now.
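A minimal sketch of the write-and-flush approach mentioned here (the PR's actual implementation may differ; the helper name is hypothetical): overwrite a single status line in place using a carriage return, which needs no third-party dependency.

```python
# Sketch of an in-place status line via sys.stdout write + flush.
# Hypothetical illustration, not the PR's actual code.
import sys


def show_status(message, width=60):
    # Pad to a fixed width so a shorter message fully erases a longer one.
    sys.stdout.write("\r" + message.ljust(width))
    sys.stdout.flush()


show_status("Uploading 5 scalars...")
show_status("Listening for new data in logdir...")
sys.stdout.write("\n")  # move off the status line when done
```

Flushing after every write matters because the line ends with no newline, so line-buffered stdout would otherwise hold it back.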

@caisq caisq merged commit 67801ef into tensorflow:master Jun 4, 2020